Designing Bagh: A Bengali Programming Language
Bagh is a Bangla-first programming language I designed that compiles to Python, letting kids and beginners code in their mother tongue. Here's how I built it.
Bagh is a programming language where you write code in Bengali script and it compiles to Python. The name means "tiger" in Bengali — fitting for a language trying to reclaim computing in a mother tongue spoken by 230 million people. Here's how the compiler works.
Why a Bengali programming language?
Programming education in Bangladesh is blocked by a language barrier before it's blocked by a logic barrier. Kids learning to code must simultaneously learn English vocabulary and programming concepts. Bagh removes the English wall.
যদি সংখ্যা > ১০:
    দেখাও("বড় সংখ্যা")
অন্যথা:
    দেখাও("ছোট সংখ্যা")
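This compiles to the following Python (variable names and string text are shown in English here for readability):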
if number > 10:
    print("big number")
else:
    print("small number")
Compiler architecture
Bagh is a source-to-source compiler (transpiler) targeting Python 3. Four stages:
Bagh source (.bagh)
→ Lexer (tokenize Bengali + ASCII)
→ Parser (AST construction)
→ Semantic analyzer (type hints, scope)
→ Code generator (Python source)
→ Python runtime
I chose Python as the target for three reasons: massive ecosystem, readable output (teachers can inspect the generated code), and zero runtime dependency beyond python3.
Lexer: tokenizing Bengali Unicode
Bengali script in Unicode spans U+0980–U+09FF. The lexer needs to handle:
- Bengali digits (০১২৩৪৫৬৭৮৯) alongside ASCII digits
- Bengali keywords (যদি = if, অন্যথা = else, এবং = and)
- Mixed identifiers (variable names can be Bengali or ASCII)
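class Lexer:
    # Bengali keyword → token kind. TokenType is the compiler's enum of token kinds (not shown here).
    KEYWORDS = {
        'যদি': TokenType.IF,        # if
        'অন্যথা': TokenType.ELSE,    # else
        'যখন': TokenType.WHILE,     # while
        'দেখাও': TokenType.PRINT,    # print
        'ফাংশন': TokenType.DEF,      # def
        'ফেরত': TokenType.RETURN,    # return
    }

    def is_bengali_digit(self, ch: str) -> bool:
        # '০' is U+09E6 and '৯' is U+09EF; the ten digits are contiguous.
        return '০' <= ch <= '৯'

    def normalize_digit(self, ch: str) -> str:
        # Map a Bengali digit to its ASCII counterpart; pass anything else through.
        if self.is_bengali_digit(ch):
            return str(ord(ch) - 0x09E6)
        return ch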
Bengali digits normalize to ASCII before passing to the parser — this keeps the AST simple.
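To make the digit handling concrete, here is a minimal sketch of how a whole numeric literal, an identifier, and a keyword might be scanned in one pass. This is not Bagh's actual lexer: the `tokenize` and `normalize_number` functions, the regular expression, and the string token kinds (stand-ins for `TokenType`) are all assumptions for illustration, and string literals are left out to keep it short.

```python
import re

BENGALI_ZERO = 0x09E6  # code point of '০'

# Illustrative subset of the keyword table; string kinds stand in for TokenType.
KEYWORDS = {"যদি": "IF", "অন্যথা": "ELSE"}

TOKEN_RE = re.compile(r"""
    (?P<NUMBER>[0-9\u09E6-\u09EF]+)                            # ASCII or Bengali digits
  | (?P<NAME>[A-Za-z_\u0980-\u09FF][A-Za-z0-9_\u0980-\u09FF]*) # ASCII or Bengali identifier
  | (?P<OP>[<>=!]=?|[:()+\-*/,])                               # a few operators
  | (?P<SKIP>\s+)                                              # whitespace
""", re.VERBOSE)


def normalize_number(text: str) -> str:
    """Replace each Bengali digit with its ASCII counterpart."""
    return "".join(
        str(ord(ch) - BENGALI_ZERO) if "\u09e6" <= ch <= "\u09ef" else ch
        for ch in text
    )


def tokenize(source: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs; numbers come out in ASCII."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "NUMBER":
            text = normalize_number(text)
        elif kind == "NAME" and text in KEYWORDS:
            kind = KEYWORDS[text]  # keywords are names with a reserved meaning
        tokens.append((kind, text))
    return tokens


print(tokenize("যদি সংখ্যা > ১০:"))
```

Because `NUMBER` is tried before `NAME`, a token that starts with a Bengali digit is always read as a number, even though Bengali digits fall inside the identifier range.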
Parser: recursive descent
Bagh uses a hand-written recursive descent parser. I avoided parser generators (PLY, ANTLR) because they don't handle Bengali Unicode well and add unnecessary weight for a teaching language.
class Parser:
    def parse_if(self) -> IfNode:
        # Grammar: যদি <condition> : <block> [ অন্যথা : <block> ]
        self.expect(TokenType.IF)
        condition = self.parse_expression()
        self.expect(TokenType.COLON)
        body = self.parse_block()
        else_body = None  # stays None when there is no else branch
        if self.match(TokenType.ELSE):
            self.expect(TokenType.COLON)
            else_body = self.parse_block()
        return IfNode(condition, body, else_body)
The grammar is intentionally a subset of Python's — no metaclasses, no decorators, no walrus operator. Teaching languages should be small.
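`parse_if` leans on two helpers, `expect` and `match`. The sketch below shows how such helpers commonly look in a recursive descent parser. The `Token`, `TokenStream`, and `ParseError` names and the string token kinds are assumptions for illustration; Bagh's real helpers may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    kind: str  # e.g. "IF", "COLON", "NAME" (stand-ins for TokenType members)
    text: str


class ParseError(Exception):
    """Raised when the next token is not what the grammar requires."""


class TokenStream:
    def __init__(self, tokens: list[Token]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[Token]:
        """Look at the next token without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, kind: str) -> bool:
        """Consume the next token only if it has the given kind."""
        tok = self.peek()
        if tok is not None and tok.kind == kind:
            self.pos += 1
            return True
        return False

    def expect(self, kind: str) -> Token:
        """Consume the next token, or fail if it is not the given kind."""
        tok = self.peek()
        if tok is None or tok.kind != kind:
            found = tok.kind if tok is not None else "end of input"
            raise ParseError(f"expected {kind}, found {found}")
        self.pos += 1
        return tok
```

The split matters for optional syntax: `match` lets the parser probe for an else branch without failing, while `expect` turns a missing colon into an error at the exact token where it was required.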
Code generation
The code generator walks the AST and emits Python. Variable names get transliterated:
def transliterate(name: str) -> str:
    # Bengali identifier → safe Python identifier
    result = []
    for ch in name:
        # Any character in the Bengali block (U+0980–U+09FF) becomes a hex escape
        if '\u0980' <= ch <= '\u09ff':
            result.append(f'_b{ord(ch):04x}')
        else:
            result.append(ch)
    return ''.join(result)
Generated Python is human-readable (hex escapes only for variable names, not strings). A teacher can show students both the Bagh source and the Python output side-by-side.
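As a rough illustration of the AST walk, the sketch below emits indented Python for an if/else node. The node classes here are simplified stand-ins, not Bagh's real AST: conditions and expressions are stored as already-generated Python strings, and `PrintNode`, `emit`, and `emit_block` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

INDENT = "    "


@dataclass
class PrintNode:
    expr: str  # already-generated Python expression


@dataclass
class IfNode:
    condition: str  # already-generated Python expression
    body: list
    else_body: Optional[list] = None


def emit_block(children: list, depth: int) -> list[str]:
    """Emit a block body; an empty block needs `pass` to stay valid Python."""
    if not children:
        return [INDENT * depth + "pass"]
    lines = []
    for child in children:
        lines += emit(child, depth)
    return lines


def emit(node, depth: int = 0) -> list[str]:
    """Return the Python source lines for one AST node."""
    pad = INDENT * depth
    if isinstance(node, PrintNode):
        return [f"{pad}print({node.expr})"]
    if isinstance(node, IfNode):
        lines = [f"{pad}if {node.condition}:"]
        lines += emit_block(node.body, depth + 1)
        if node.else_body is not None:
            lines.append(f"{pad}else:")
            lines += emit_block(node.else_body, depth + 1)
        return lines
    raise TypeError(f"no emitter for {type(node).__name__}")


tree = IfNode("x > 10", [PrintNode('"big"')], [PrintNode('"small"')])
print("\n".join(emit(tree)))
```

Tracking indentation as a `depth` argument keeps each emitter unaware of where it sits in the tree, which is what lets nested blocks come out correctly indented.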
Online interpreter
bagh-online runs the compiler in the browser via Pyodide (Python in WASM). No server is needed: the entire compile-and-run loop happens client-side. Loading the Pyodide bundle takes about 2 seconds, but after that there is zero ongoing compute cost.
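Inside Pyodide, the "run" half of that loop is ordinary Python. A minimal sketch, assuming the generated source arrives as a string and the page wants the printed output back as text; `run_generated` is a hypothetical name, not bagh-online's actual code:

```python
import contextlib
import io
import traceback


def run_generated(python_source: str) -> str:
    """Execute generated Python and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        try:
            code = compile(python_source, "<bagh>", "exec")
            # A fresh globals dict keeps one run from leaking names into the next.
            exec(code, {"__name__": "__main__"})
        except Exception:
            # Put the error in the output text instead of letting it propagate.
            buffer.write(traceback.format_exc())
    return buffer.getvalue()


print(run_generated('print("hello")'))
```

Catching the exception and returning it as text means a student's mistake shows up in the output pane rather than as an unhandled error in the page.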
Current limitations and roadmap
| Feature | Status |
|---------|--------|
| Variables, if/else, while | ✅ Implemented |
| Functions | ✅ Implemented |
| Lists and dicts | ✅ Implemented |
| Classes | 🚧 In progress |
| Import system | ⬜ Planned |
| Type annotations | ⬜ Planned |
| Error messages in Bengali | ⬜ Planned |
Error messages in Bengali are the next priority — "SyntaxError: unexpected token" is meaningless to a 12-year-old learning in Dhaka.
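Since this feature is only planned, the following is just one possible shape for it, not Bagh's design. The sketch converts a line number back into Bengali digits (the inverse of the lexer's digit normalization) and wraps it in a placeholder message. The function names and the message wording are assumptions.

```python
BENGALI_ZERO = 0x09E6  # code point of '০'


def to_bengali_digits(n: int) -> str:
    """Render an integer using Bengali digits, e.g. for line numbers."""
    return "".join(
        chr(BENGALI_ZERO + int(ch)) if ch.isdigit() else ch
        for ch in str(n)
    )


def localize_syntax_error(exc: SyntaxError) -> str:
    """Build a Bengali message for a syntax error (placeholder wording)."""
    line = to_bengali_digits(exc.lineno or 0)
    # Roughly: "Line N: unexpected token"
    return f"লাইন {line}: অপ্রত্যাশিত টোকেন"


try:
    compile("if x >", "<bagh>", "exec")
except SyntaxError as exc:
    print(localize_syntax_error(exc))
```

This example reports the line number from Python's own parser. A real implementation would need to report positions in the Bagh source, either by raising errors in Bagh's parser or by mapping generated lines back to the original file.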
FAQ
What is the Bagh programming language?
Bagh is a Bengali-syntax programming language that compiles to Python. It lets beginners and students code in Bengali script instead of English keywords.
Why does Bagh compile to Python?
Python is readable, widely taught, and has a massive ecosystem. Bagh source compiles to human-readable Python so students can learn both languages simultaneously.
Is Bagh open source?
Yes. The compiler and online IDE are fully open source on GitHub at shihabshahrier/bagh.
Can I run Bagh in a browser?
Yes. The online interpreter at bagh-online runs entirely client-side via Pyodide (Python compiled to WASM).
Who is Bagh designed for?
Students and beginners in Bangladesh and other Bengali-speaking regions who find the English vocabulary barrier a bigger obstacle than programming logic itself.
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. See also: Building World-Class Software From Dhaka.