I know the bounty has already been claimed, but here is an equivalent parser written in pyparsing (plus support for function calls with zero or more comma-delimted arguments):
from pyparsing import *
LPAR, RPAR = map(Suppress,"()")
EQ = Literal("=")
name = Word(alphas, alphanums+"_").setName("name")
number = Word(nums).setName("number")
expr = Forward()
operand = Optional('-') + (Group(name + LPAR +
Group(Optional(delimitedList(expr))) +
RPAR) |
name |
number |
Group(LPAR + expr + RPAR))
binop = oneOf("+ - * / **")
expr << (Group(operand + OneOrMore(binop + operand)) | operand)
assignment = name + EQ + expr
statement = assignment | expr
This test code runs the parser through its basic paces:
tests = """\
sin(pi/2)
y = mx+b
E = mc ** 2
F = m*a
x = x0 + v*t +a*t*t/2
1 - sqrt(sin(t)**2 + cos(t)**2)""".splitlines()
for t in tests:
print t.strip()
print statement.parseString(t).asList()
print
Gives this output:
sin(pi/2)
[['sin', [['pi', '/', '2']]]]
y = mx+b
['y', '=', ['mx', '+', 'b']]
E = mc ** 2
['E', '=', ['mc', '**', '2']]
F = m*a
['F', '=', ['m', '*', 'a']]
x = x0 + v*t +a*t*t/2
['x', '=', ['x0', '+', 'v', '*', 't', '+', 'a', '*', 't', '*', 't', '/', '2']]
1 - sqrt(sin(t)**2 + cos(t)**2)
[['1', '-', ['sqrt', [[['sin', ['t']], '**', '2', '+', ['cos', ['t']], '**', '2']]]]]
For debugging, we add this code:
# enable debugging for name and number expressions
name.setDebug()
number.setDebug()
And now we reparse the first test (displaying the input string and a simple column ruler):
t = tests[0]
print ("1234567890"*10)[:len(t)]
print t
statement.parseString(t)
print
Giving this output:
1234567890123
sin(pi/2)
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
Pyparsing also supports packrat parsing, a sort of parse-time memoization (read more about packratting here). Here is the same parsing sequence, but with packrat enabled:
same parse, but with packrat parsing enabled
1234567890123
sin(pi/2)
Match name at loc 4(1,5)
Matched name -> ['sin']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 8(1,9)
Matched name -> ['pi']
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match name at loc 11(1,12)
Exception raised:Expected name (at char 11), (line:1, col:12)
Match number at loc 11(1,12)
Matched number -> ['2']
This was an interesting exercise, and helpful to me to see debugging features from other parser libraries.