Why Native Grammars
Integrated tools. Zero dependencies.
See a full worked example on the Comparison page — we juxtapose Python since it is commonly used for parsing tasks, but similar limitations apply to any language without native grammars: Rust, Go, TypeScript, and beyond.
Built In — Not Bolted On
Python needs an external library, and the grammar lives in a string written in a separate mini-language. Raku grammars are a first-class language feature, declared with the same syntax as the rest of your program.
# Python: external library + grammar-as-string
from lark import Lark
GRAMMAR = r"""
start: word (_WS word)*
word: LETTER+
LETTER: /[a-z]/i
_WS: /\s+/
"""
parser = Lark(GRAMMAR)
tree = parser.parse("hello world")
# Raku: grammar is part of the language
grammar WordParser {
token TOP { <word>+ % \s+ }
token word { <letter>+ }
token letter { <[a..zA..Z]> }
}
say WordParser.parse("hello world");
Named Captures — An Instant Parse Tree
Lark builds a tree, but the example here still navigates it by child position: reorder the parts of a rule and the indices silently point at the wrong field. Raku gives every matched token a name, so the parse tree is self-documenting.
from lark import Lark
GRAMMAR = r"""
start: year "-" month "-" day
year: /\d{4}/
month: /\d{2}/
day: /\d{2}/
"""
parser = Lark(GRAMMAR)
tree = parser.parse("2026-05-12")
# navigate the tree by child position
year = tree.children[0].children[0]
month = tree.children[1].children[0]
day = tree.children[2].children[0]
grammar DateParser {
token TOP { <year> '-' <month> '-' <day> }
token year { \d ** 4 }
token month { \d ** 2 }
token day { \d ** 2 }
}
my $m = DateParser.parse("2026-05-12");
say $m<year>; # 「2026」 named, not positional
say $m<month>; # 「05」
say $m<day>; # 「12」
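The failure mode described above is not specific to Lark. As a minimal sketch, the following uses Python's standard `re` module (swapped in here for brevity, not the Lark API) to show how positional access returns the wrong field after a format change, while named access keeps working. The pattern names and the day-month-year edit are hypothetical, chosen only for illustration.

```python
import re

# Positional groups: the consumer must know that the year comes first.
ISO_POSITIONAL = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
# Hypothetical later edit: the format becomes day-month-year.
DMY_POSITIONAL = re.compile(r"(\d{2})-(\d{2})-(\d{4})")
# Same edit, but every part carries a name.
DMY_NAMED = re.compile(r"(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})")


def year_by_position(match: re.Match) -> str:
    """Read the year the positional way: always group 1."""
    return match.group(1)


def year_by_name(match: re.Match) -> str:
    """Read the year by name, wherever it sits in the pattern."""
    return match.group("year")


before = year_by_position(ISO_POSITIONAL.fullmatch("2026-05-12"))
after = year_by_position(DMY_POSITIONAL.fullmatch("12-05-2026"))
named = year_by_name(DMY_NAMED.fullmatch("12-05-2026"))

print(before)  # 2026
print(after)   # 12, the day: wrong field, no error raised
print(named)   # 2026
```

The positional reader never raises; it simply hands back the day where the year was expected, which is the kind of silent breakage that named captures avoid.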
Actions Classes — Parsing Separate from Semantics
In Python you mix tree-walking into the transformer class. Raku keeps the grammar (structure) and actions class (meaning) cleanly apart, so each can evolve independently.
from lark import Lark, Transformer
GRAMMAR = r"""
start: left "+" right
left: /\d+/
right: /\d+/
"""
class CalcActions(Transformer):
    def left(self, t): return int(t[0])
    def right(self, t): return int(t[0])
    def start(self, t): return t[0] + t[1]
parser = Lark(GRAMMAR)
print(CalcActions().transform(parser.parse("3+4")))
# 7
grammar Calc { # structure only
token TOP { <left> '+' <right> }
token left { \d+ }
token right { \d+ }
}
class CalcActions { # meaning only
method TOP($/) { make +$<left> + +$<right> }
}
say Calc.parse("3+4", actions => CalcActions.new).made;
# OUTPUT: 7
Grammar Inheritance — Composable & Extensible
Raku grammars are classes. You can inherit from them and override individual tokens or rules — extend a grammar without touching the original.
from lark import Lark
# no grammar inheritance — copy-paste or
# string manipulation required
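BASE_GRAMMAR = r"""
start: word (_WS word)*
word: LETTER+
LETTER: /[a-z]/
_WS: /\s+/
"""
# defining "word" a second time is rejected as a duplicate rule,
# so patch the rule text and append the new terminal
EXTENDED = BASE_GRAMMAR.replace(
    "word: LETTER+", "word: LETTER+ | DIGIT+"
) + "DIGIT: /[0-9]/\n"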
parser = Lark(EXTENDED)
print(parser.parse("hello 42 world"))
grammar Base {
token TOP { <word>+ % \s+ } # words separated by whitespace
token word { <[a..z]>+ }
}
grammar Extended is Base {
token word { <[a..z]>+ | <[0..9]>+ } # override one token
}
say Extended.parse("hello 42 world");
# 「hello 42 world」
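For readers who think in Python classes, the override mechanism can be approximated with ordinary inheritance. This is a toy sketch, not a parser and not how Raku implements grammars: the class names, the `WORD` attribute, and the `parse` helper are all hypothetical, and the "grammar" is just a regular expression assembled from a class attribute.

```python
import re
from typing import Optional


class Base:
    # Hypothetical "token", stored as a class attribute so subclasses can override it.
    WORD = r"[a-z]+"

    @classmethod
    def parse(cls, text: str) -> Optional[re.Match]:
        # TOP: one or more words separated by whitespace.
        pattern = rf"(?:{cls.WORD})(?:\s+(?:{cls.WORD}))*"
        return re.fullmatch(pattern, text)


class Extended(Base):
    # Override one token; parse() is inherited unchanged.
    WORD = r"[a-z]+|[0-9]+"


base_result = Base.parse("hello 42 world")
extended_result = Extended.parse("hello 42 world")

print(base_result)                   # None: digits are not a Base word
print(extended_result is not None)   # True
```

The subclass changes one definition and inherits everything else, which mirrors what `grammar Extended is Base` does for a full grammar with named captures.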
Unicode Properties — Match Any Language Natively
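By default Lark compiles terminals with Python's standard re module, which has no \p{...} Unicode property classes, so an ASCII class like [a-zA-Z] misses accented letters and non-Latin scripts; matching by Unicode category needs the third-party regex package plus regex=True. Raku grammars understand Unicode properties natively, and all Raku strings are NFG (Normal Form Grapheme): every Str counts user-perceived characters, so "é".chars is 1 even when the é is written as e plus a combining accent, where Python's len reports 2. The same grammar parses English, Arabic, or Japanese without extra dependencies or encoding surprises (emoji are symbols rather than letters, so they would need a different property than <:Letter>).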
from lark import Lark
# [a-zA-Z] is an ASCII-only class: no match for accented chars
GRAMMAR = r"""
start: word (_WS word)*
word: LETTER+
LETTER: /[a-zA-Z]/
_WS: /\s+/
"""
parser = Lark(GRAMMAR)
parser.parse("café résumé") # raises UnexpectedCharacters
# Unicode properties: regex=True + pip install regex
GRAMMAR2 = r"""
start: word (_WS word)*
word: LETTER+
LETTER: /\p{L}/
_WS: /\s+/
"""
parser2 = Lark(GRAMMAR2, regex=True)
print(parser2.parse("café résumé"))
grammar NaturalText {
token TOP { <word>+ % \s+ }
token word { <:Letter>+ } # any Unicode letter, NFG-aware
}
# all Raku Str are NFG — "é".chars == 1, not 2
say NaturalText.parse("café résumé");
# 「café résumé」
say NaturalText.parse("日本語 한국어");
# 「日本語 한국어」
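To make the grapheme-counting claim concrete, here is a small sketch using only Python's standard `unicodedata` module. It shows that Python counts code points, and that NFC normalization helps only when a precomposed code point happens to exist. The sample strings are illustrative assumptions; Raku's `.chars` would report 1 for each of them.

```python
import unicodedata

# "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"
# NFC folds the pair into the single precomposed code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# 'x' plus the same accent has no precomposed code point,
# so NFC leaves it as two code points.
no_precomposed = unicodedata.normalize("NFC", "x\u0301")

print(len(decomposed))      # 2
print(len(composed))        # 1
print(len(no_precomposed))  # 2
```

Counting by grapheme, as NFG does, gives one character in all three cases without a normalization step.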