ansaurus

Question

Answer 1

+8 A:

You need to write a lexical analyzer to tokenize the input (break it into its component parts--operators, punctuators, identifiers, etc.). Inevitably, you'll end up with some sequence of tokens.

After that, there are a number of ways to evaluate the input. One of the easiest ways to do this is to convert the expression to postfix using the shunting yard algorithm (evaluation of a postfix expression is Easy with a capital E).

James McNellis 2010-05-28 19:54:11

Answer 2

+1 A:

Also, you should review your posts to SO and other posts regarding Binary Trees. Implement this using a tree structure. Traverse as infix to evaluate. There have been some excellent answers to tree questions.

If you need to store this (for persistance as in a file), I suggest XML. Parsing XML should make you really appreciate how easy your assignment is.

Thomas Matthews 2010-05-28 20:14:05

Answer 3

+1 A:

You should look up "abstract syntax trees" and "expression trees" as well as "lexical analysis", "syntax", "parse", and "compiler theory". Reading text input and getting meaning from it is quite difficult for most things (though we often try to make sure we have simple input).

The first step in generating a parser is to write down the grammar for your input language. In this case your input language is some Mathematical expressions, so you would do something like:

expr => <function_identifier> ( stmt )
        ( stmt )
        <variable_identifier>
        <numerical_constant>

stmt => expr <operator> stmt

(I haven't written a grammar like this {look up BNF and EBNF} in a few years so I've probably made some glaring errors that someone else will kindly point out) This can get a lot more complicated depending on how you handle operator precedence (multiply and device before add and subtract type stuff), but the point of the grammar in this case is to help you to write a parser.

There are tools that will help you do this (yacc, bison, antlr, and others) but you can do it by hand as well. There are many many ways to go about doing this, but they all have one thing in common -- a stack. Processing a language such as this requires something called a push down automaton, which is just a fancy way of saying something that can make decisions based on new input, a current state, and the top item of the stack. The decisions that it can make include pushing, popping, changing state, and combining (turning 2+3 into 5 is a form of combining). Combining is usually referred to as a production because it produces a result.

Of the various common types of parsers you will almost certainly start out with a recursive decent parser. They are usually written directly in a general purpose programming language, such as C. This type of parser is made up of several (often many) functions that call each other, and they end up using the system stack as the push down automaton stack.

Another thing you will need to do is to write down the different types of words and operators that make up your language. These words and operators are called lexemes and represent the tokens of your language. I represented these tokens in the grammar <like_this>, except for the parenthesis which represented themselves.

You will most likely want to describe your lexemes with a set of regular expressions. You should be familiar with these if you use grep, sed, awk, or perl. They are a way of describing what is known as a regular language which can be processed by something known as a Finite State Automaton. That is just a fancy way of saying that it is a program that can make a decision about changing state by considering only its current state and the next input (the next character of input). For example part of your lexical description might be:

[A-Z]   variable-identifier
sqrt    function-identifier
log     function-identifier
[0-9]+  unsigned-literal
+       operator
-       operator

There are also tools which can generate code for this. lex which is one of these is highly integrated with the parser generating program yacc, but since you are trying to learn you can also write your own tokenizer/lexical analysis code in C.

After you have done all of this (it will probably take you quite a while) you will need to have your parser build a tree to represent the expressions and grammar of the input. In the simple case of expression evaluation (like writing a simple command line calculator program) you could have your parser evaluate the formula as it processed the input, but for your case, as I understand it, you will need to make a tree (or Reverse Polish representation, but trees are easier in my opinion).

Then after you have read the values for the variables you can traverse the tree and calculate an actual number.

nategoose 2010-05-29 00:39:49

Answer 4

+1 A:

Possibly the easiest thing to do is use an embedded language like Lua or Python, for both of which the interpreter is written in C. Unfortunately, if you go the Lua route you'll have to convert the binary operations to function calls, in which case it's likely easier to use Python. So I'll go down that path.

If you just want to output the result to the console this is really easy and you won't even have to delve too deep in Python embedding. Since, then you only have to write a single line program in Python to output the value.

Here is the Python code you could use:

exec "import math;A=<vala>;B=<valb>;C=<valc>;D=<vald>;print <formula>".replace("^", "**").replace("log","math.log").replace("ln", "math.log").replace("sin","math.sin").replace("sqrt", "math.sqrt").replace("cos","math.cos")

Note the replaces are done in Python, since I'm quite sure it's easier to do this in Python and not C. Also note, that if you want to use xor('^') you'll have to remove .replace("^","**") and use ** for powering.

I don't know enough C to be able to tell you how to generate this string in C, but after you have, you can use the following program to run it:

#include <Python.h>

int main(int argc, char* argv[])
{
  char* progstr = "...";
  Py_Initialize();
  PyRun_SimpleString(progstr);
  Py_Finalize();
  return 0;
}

You can look up more information about embedding Python in C here: Python Extension and Embedding Documentation

If you need to use the result of the calculation in your program there are ways to read this value from Python, but you'll have to read up on them yourself.

JPvdMerwe 2010-05-29 10:29:42

Heh. I think that's cheating ;-). It is most assuredly easier to do this in Python than in C, but implementing an expression evaluator in C is both interesting and fun (well, it can be; I've never implemented one in C, but I have in C++, and 600 lines of code later I knew a lot more about how expressions can be evaluated `:-)`).

James McNellis 2010-05-30 02:21:24

I agree if the goal is to learn about expression evaluation; otherwise, if it's just an intermediate step, this is both easier and less bug prone. You could compare it to using standard library functionality. It's cheating to use it when you are asked to write your own; otherwise, it makes good sense. :) But yes, I do agree it could be an interesting learning experience I've yet to undertake.

JPvdMerwe 2010-05-30 07:58:23

A valid point: since the OP didn't say what his purpose was, this is a good idea for a solution (+1).

James McNellis 2010-05-30 19:13:12

ansaurus

tags:

views:

answers:

parsing of mathematical expressions

related questions