I wonder how the grammar of the Python language is generated and how the interpreter understands it.

In Python, the file graminit.c seems to implement the grammar, but I don't clearly understand it.

More broadly, what are the different ways to generate a grammar, and are there differences in how the grammar is implemented in languages such as Perl, Python, or Lua?

+5  A: 

Grammars are generally written in the same kind of notation; Backus-Naur Form (BNF) is typical.

Lexers and parsers, on the other hand, can take very different forms.

The lexer breaks up the input file into tokens. The parser uses the grammar to see if the stream of tokens is "valid" according to its rules.
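
As a rough illustration of that split, here is a minimal sketch of a lexer and a hand-written recursive-descent parser. The toy grammar in the comment and the names `tokenize`, `is_valid`, `parse_expr`, `parse_term`, and `parse_factor` are made up for this example; this is not how CPython's own parser is built.

```python
import re

# Hypothetical toy grammar in BNF (not Python's real grammar):
#   <expr>   ::= <term> | <term> "+" <expr>
#   <term>   ::= <factor> | <factor> "*" <term>
#   <factor> ::= NUMBER | "(" <expr> ")"

TOKEN_RE = re.compile(r"(?P<NUMBER>\d+)|(?P<OP>[+*()])|(?P<BAD>\S)")

def tokenize(text):
    """Lexer: break the input text into (kind, value) tokens."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        if m.lastgroup == "BAD":
            raise SyntaxError(f"unexpected character {m.group()!r}")
        tokens.append((m.lastgroup, m.group()))
    return tokens

# Parser: one function per grammar rule. Each takes a position in the
# token list and returns the position just past what it matched.
def parse_expr(tokens, pos):
    pos = parse_term(tokens, pos)
    if pos < len(tokens) and tokens[pos] == ("OP", "+"):
        pos = parse_expr(tokens, pos + 1)
    return pos

def parse_term(tokens, pos):
    pos = parse_factor(tokens, pos)
    if pos < len(tokens) and tokens[pos] == ("OP", "*"):
        pos = parse_term(tokens, pos + 1)
    return pos

def parse_factor(tokens, pos):
    if pos < len(tokens) and tokens[pos][0] == "NUMBER":
        return pos + 1
    if pos < len(tokens) and tokens[pos] == ("OP", "("):
        pos = parse_expr(tokens, pos + 1)
        if pos < len(tokens) and tokens[pos] == ("OP", ")"):
            return pos + 1
        raise SyntaxError("expected ')'")
    raise SyntaxError("expected a number or '('")

def is_valid(text):
    """Return True if the whole token stream matches <expr>."""
    try:
        tokens = tokenize(text)
        end = parse_expr(tokens, 0)
    except SyntaxError:
        return False
    return end == len(tokens)

print(tokenize("1 + 2"))
print(is_valid("1 + 2 * (3 + 4)"), is_valid("1 + * 2"))
```

Each parsing function mirrors one rule of the grammar, which is why a hand-written parser like this is easy to check against its BNF.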

Usually the outcome is an abstract syntax tree (AST) that can then be used to generate whatever you want, such as byte code or assembly.
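
For Python specifically, the standard library's `ast` module lets you look at this step directly. The snippet below is a small sketch: the source string and the `<example>` filename are arbitrary choices for illustration.

```python
import ast

source = "x = 1 + 2 * 3"

# Source text -> abstract syntax tree
tree = ast.parse(source)
assign = tree.body[0]
print(type(assign).__name__)        # the statement node
print(type(assign.value).__name__)  # the expression on the right-hand side
print(ast.dump(assign.value))       # full structure of the expression

# AST -> byte code, which the interpreter can then execute
code = compile(tree, filename="<example>", mode="exec")
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

The multiplication ends up nested inside the addition node, so operator precedence is already encoded in the shape of the tree before any byte code is produced.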

duffymo
A short response that raises new questions. I knew the descriptive forms of languages but did not know they were called BNF. I guess the AST and its source code are generated by a lexer. Do you have typical examples?
ohe
The ones I know are mostly Java-based: ANTLR and JavaCC. There is also Bison, which generates C. Don't know about Python.
duffymo
I mean, do you have typical, simple examples of a BNF/AST grammar implementation?
ohe
ANTLR has a number of grammars on their website, including Python 3: http://antlr.org/grammar/list
duffymo
+1  A: 

There are many ways to implement lexing and parsing; it really comes down to identifying the patterns and how they fit together. There are a few very nice Python packages for doing this, ranging from pure Python to wrapped C code. Pyparsing in particular has many excellent examples. One thing worth noting: a straight EBNF/BNF parser is hard to find. Writing a parser in Python code isn't awful, but it is one step further from the raw grammar, which might be important to you.

synthesizerpatel