Is there an existing parser that I can use from C# to parse Mathematica expressions?

I know that I can use the Kernel itself to parse an expression and use .NET/Link to retrieve the tree structure, but I'm looking for something that doesn't rely on the Kernel.

A: 

I don't think such a thing exists already (I'd love to know about it). It may help, though, that within Mathematica you can apply the function FullForm to any expression and get something very easy to parse, much like an s-expression in Lisp. For example,

FullForm[a+b*c]

yields

Plus[a, Times[b, c]]

That's the underlying representation of all Mathematica expressions and should be straightforward to parse.
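
For what it's worth, here is a minimal sketch of such a parser in Python (not C#, but it should port easily). It assumes the input is already in FullForm and contains only symbols, plain numbers, brackets, and commas; strings and other atom types are left out. The names tokenize, parse_expr, and parse_fullform are made up for this example.

```python
import re

# Assumption: only symbols, plain numbers, brackets and commas appear.
TOKEN = re.compile(r"\s*([A-Za-z$][A-Za-z0-9$`]*|\d+\.?\d*|[\[\],])")
PUNCT = {"[", "]", ","}


def tokenize(text):
    """Split a FullForm string into a flat list of tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_expr(tokens, i=0):
    """Parse one expression at tokens[i]; return (tree, next_index).

    Atoms are strings; compound expressions are (head, [args]) tuples.
    """
    if i >= len(tokens) or tokens[i] in PUNCT:
        raise ValueError(f"expected an atom at token {i}")
    node = tokens[i]
    i += 1
    # A '[' after any expression starts an argument list, so f[x][y] works.
    while i < len(tokens) and tokens[i] == "[":
        i += 1
        args = []
        if i < len(tokens) and tokens[i] == "]":
            i += 1  # empty argument list, e.g. f[]
        else:
            while True:
                arg, i = parse_expr(tokens, i)
                args.append(arg)
                if i >= len(tokens):
                    raise ValueError("missing ']'")
                if tokens[i] == ",":
                    i += 1
                elif tokens[i] == "]":
                    i += 1
                    break
                else:
                    raise ValueError(f"expected ',' or ']' at token {i}")
        node = (node, args)
    return node, i


def parse_fullform(text):
    """Parse a complete FullForm string and reject trailing input."""
    tokens = tokenize(text)
    tree, i = parse_expr(tokens, 0)
    if i != len(tokens):
        raise ValueError(f"unexpected trailing token {tokens[i]!r}")
    return tree
```

With this, parse_fullform("Plus[a, Times[b,c]]") returns ("Plus", ["a", ("Times", ["b", "c"])]). The parser only ever looks at the next token to decide what to do, which is what makes this form so easy to handle.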

dreeves
Yes, but for that I need the Kernel. Anyway, I think you are right: such a parser doesn't seem to exist. Part of the problem is that there is no published grammar for the language. I've also heard that the language cannot be parsed with an LALR parser.
Nestor
A: 

My matheclipse-parser module implements a parser in Java which can parse a large subset of Mathematica expressions. See the MathExpressionParser wiki page for usage. Maybe you can port the parser to C#?

axelclk
A: 

The Mathematica grammar isn't well documented, true. But AFAIK it is LALR(1) and likely LL(1); the bracketed/tagged syntax gives the parser complete clues about what to expect next, just like Lisp and XML.
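
To illustrate the one-token-lookahead point on infix input, here is a rough sketch in Python of a recursive-descent parser for a tiny subset: +, *, ^, parentheses, and f[...] calls. It is a toy under stated assumptions, not the real Mathematica grammar: there is no unary minus, no implicit multiplication, and none of the many other operators, and every name in it (InfixParser, parse_infix, to_fullform) is made up for the example. It says nothing about whether the full language is LL(1).

```python
import re

# Assumption: the input uses only symbols, integers and these punctuation marks.
TOKEN = re.compile(r"\s*([A-Za-z$][A-Za-z0-9$]*|\d+|[+*^()\[\],])")
PUNCT = set("+*^()[],")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class InfixParser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # The single token of lookahead that drives every decision below.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        self.pos += 1
        return tok

    def chain(self, operand, op, head):
        # Flat n-ary node: a+b+c becomes Plus[a, b, c].
        args = [operand()]
        while self.peek() == op:
            self.take()
            args.append(operand())
        return args[0] if len(args) == 1 else (head, args)

    def expr(self):
        return self.chain(self.term, "+", "Plus")

    def term(self):
        return self.chain(self.power, "*", "Times")

    def power(self):
        base = self.primary()
        if self.peek() == "^":
            self.take()
            return ("Power", [base, self.power()])  # right-associative
        return base

    def primary(self):
        if self.peek() == "(":
            self.take()
            node = self.expr()
            self.take(")")
        else:
            node = self.take()
            if node in PUNCT:
                raise ValueError(f"expected an atom, got {node!r}")
        while self.peek() == "[":  # function application, e.g. f[x, y]
            self.take()
            args = []
            if self.peek() != "]":
                args.append(self.expr())
                while self.peek() == ",":
                    self.take()
                    args.append(self.expr())
            self.take("]")
            node = (node, args)
        return node


def parse_infix(text):
    parser = InfixParser(text)
    tree = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected trailing token {parser.peek()!r}")
    return tree


def to_fullform(node):
    """Render a tree back as a FullForm-style string."""
    if isinstance(node, str):
        return node
    head, args = node
    return f"{to_fullform(head)}[{', '.join(map(to_fullform, args))}]"
```

For example, to_fullform(parse_infix("a+b*c")) gives "Plus[a, Times[b, c]]", and "a^b^c" comes out as "Power[a, Power[b, c]]".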

The DMS Software Reengineering Toolkit does have a Mathematica grammar that has been used for real tasks. This includes MMa programs as well as pure expression forms.

That probably doesn't help you, since you want one in C#.

If you have access to the Kernel, I'd stick to that.

Ira Baxter
A: 

I wrote a Mathematica parser in 300 lines of OCaml code under contract for Wolfram Research. I found it quite easy, because the grammar is clearly documented in their literature and any ambiguities are easily resolved by experimenting with Mathematica itself.

Jon Harrop
Really? Interesting. Why would they want a parser for their language? They already have that, don't they?
Nestor
Their Mathematica parser written in C was about 100x longer than my Mathematica parser written in OCaml. At the time, they were building the core of Wolfram Workbench, which contains Java rewrites of much of Mathematica's original C source, including its parser, and they were particularly interested in any techniques that could be used to simplify it.
Jon Harrop
Why the downvote?
Jon Harrop
That's fascinating, Jon. Thanks for sharing. You should rewrite the parser in ANTLR and share it with us :-) (BTW, I didn't downvote you.)
Nestor