views:

74

answers:

3

Hm, this is language - agnostic, I would prefer doing it in C# or F#, but I'm more interested this time in the question "how would that work anyway".

What I want to accomplish ist:

a) I want to LEARN it - it's about my ego this time, it's for a fun project where I want to show myself that I'm a really good at this stuff

b) I know a tiny little bit about EBNF (although I don't know yet, how operator precedence works in EBNF - Irony.NET does it right, I checked the examples, but this is a bit ominous to me)

c) My parser should be able to take this: 5 * (3 + (2 - 9 * (5 / 7)) + 9) for example and give me the right results

d) To be quite frankly, this seems to be the biggest problem in writing a compiler or even an interpreter for me. I would have no problem generating even 64 bit assembler code (I CAN write assembler manually), but the formula parser...

e) Another thought: even simple computers (like my old Sharp 1246S with only about 2kB of RAM) can do that... it can't be THAT hard, right? And even very, very old programming languages have formula evaluation... BASIC is from 1964 and they already could calculate the kind of formula I presented as an example

f) A few ideas, a few inspirations would be really enough - I just have no clue how to do operator precedence and the parentheses - I DO, however, know that it involves an AST and that many people use a stack

So, what do you think?

+2  A: 

You should go learn about Recursive Descent parsers.

Check out a Code Golf exercise in doing just this, 10 different ways:

http://stackoverflow.com/questions/1384811/code-golf-mathematical-expression-evaluator-full-pemdas

Several of these "golf" solutions are recursive descent parsers just coded in different ways.

You'll find that doing just expression parsing is by far the easiest thing in a compiler. Parsing the rest of the language is harder, but understanding how the code elements interact and how to generate good code is far more difficult.

You may also be interested in how to express a parser using BNF, and how to do something with that BNF. Here's an example of how to parse and manipulate algebra symbolically with an explicit BNF and an implicit AST as a foundation. This isn't what compilers traditionally do, but the machinery that does is founded deeply in compiler technology.

Ira Baxter
Oh wow, that helps a lot! Not quite sure about "easiest" yet, but... ;-)
StormianRootSolver
Code golf is not the best way to learn how to do something like this. Or indeed to learn anything other than code golf.But you're right about expression parsing not being the hard part.
Don Roby
In general, code golf isn't. Several the answers in fact are coded pretty clearly, though, and because they aren't very long, you can learn a lot from them.
Ira Baxter
+1  A: 

Traditionally formula processors on computers use POSTFIX notation. They use a stack, pop 2 items as operators, pop third item as the operator, push the result.

What you want is an INFIX to POSTFIX notation converter which is really quite simple. Once you're in postfix processing is the simplest thing you'll ever do.

Matt S
s/pop 2 items as operators/pop 2 items as operands
tpdi
Postfix notation! Yes, I will try that first! Thank you! :-)
StormianRootSolver
What you said you wanted to do was *infix* expressions, not postfix.
Ira Baxter
+1  A: 

For a stack-based parser implemented in PHP that uses Djikstra's shunting yard algorithm to convert infix to postfix notation, and with support for functions with varying number of arguments, you can look at the source for the PHPExcel calculation engine

Mark Baker