+2  A: 

I think a parser generator is overkill. You could solve this by converting the expression to postfix and evaluating the postfix form (or by building an expression tree directly from the infix expression and using that to generate the truth table).
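
For example, here is a minimal sketch of the postfix-evaluation idea in C#. The operator characters '&', '|', '!' and the single-letter variable names are my own assumptions about the input, and the expression is assumed to already be in postfix form:

    using System;
    using System.Collections.Generic;

    static class PostfixDemo
    {
        // Evaluates a postfix boolean expression such as "AB&C|".
        // 'values' maps each single-letter variable to its truth value.
        // Assumes the only operators are '&', '|' and '!'.
        static bool Eval(string postfix, IDictionary<char, bool> values)
        {
            var stack = new Stack<bool>();
            foreach (char c in postfix)
            {
                if (char.IsWhiteSpace(c)) continue;
                if (char.IsLetter(c)) stack.Push(values[c]);
                else if (c == '!') stack.Push(!stack.Pop());
                else
                {
                    bool right = stack.Pop(), left = stack.Pop();
                    stack.Push(c == '&' ? left && right : left || right);
                }
            }
            return stack.Pop(); // one value remains: the result for this row
        }

        static void Main()
        {
            var values = new Dictionary<char, bool> { { 'A', true }, { 'B', false }, { 'C', true } };
            Console.WriteLine(Eval("AB&C|", values)); // True
        }
    }

Call Eval once per combination of input values and you have your truth table.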

Mehrdad Afshari
But that is building a parser, albeit a hand-rolled one. If you know how to use Lex (or its like), you also know how to hand-roll one.
Simeon Pilgrim
It *is* a parser, but one that CS students can write in their first semester for evaluating arithmetic expressions. I doubt the whole program would be more than 100 lines of code (including evaluation and truth table generation).
Mehrdad Afshari
I agree; the first thing I thought of was my first-year postfix assignment in college. This project would be very similar to that overall.
Neil N
+12  A: 

This sounds like a great personal project. You'll learn a lot about how the basic parts of a compiler work. I would skip trying to use a parser generator; if this is for your own edification, you'll learn more by doing it all from scratch.

The way such systems work is a formalization of how we understand natural languages. If I give you a sentence: "The dog, Rover, ate his food.", the first thing you do is break it up into words and punctuation. "The", "SPACE", "dog", "COMMA", "SPACE", "Rover", ... That's "tokenizing" or "lexing".

The next thing you do is analyze the token stream to see if the sentence is grammatical. The grammar of English is extremely complicated, but this sentence is pretty straightforward. SUBJECT-APPOSITIVE-VERB-OBJECT. This is "parsing".

Once you know that the sentence is grammatical, you can then analyze the sentence to actually get meaning out of it. For instance, you can see that there are three parts of this sentence -- the subject, the appositive, and the "his" in the object -- that all refer to the same entity, namely, the dog. You can figure out that the dog is the thing doing the eating, and the food is the thing being eaten. This is the semantic analysis phase.

Compilers then have a fourth phase that humans do not: they generate code that represents the actions described in the language.

So, do all that. Start by defining what the tokens of your language are. Define a base class Token and a bunch of derived classes, one for each kind of token (IdentifierToken, OrToken, AndToken, ImpliesToken, RightParenToken...). Then write a method that takes a string and returns an IEnumerable<Token>. That's your lexer.
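
Here is roughly what that first step might look like. The operator spellings ('&', '|', '!', '->' and parentheses) are an assumption about the input language, not anything sacred:

    using System;
    using System.Collections.Generic;

    abstract class Token { }
    sealed class IdentifierToken : Token
    {
        public readonly string Name;
        public IdentifierToken(string name) { Name = name; }
    }
    sealed class AndToken : Token { }
    sealed class OrToken : Token { }
    sealed class NotToken : Token { }
    sealed class ImpliesToken : Token { }
    sealed class LeftParenToken : Token { }
    sealed class RightParenToken : Token { }

    static class Lexer
    {
        // Turns "A & (B | !C)" into a stream of Token objects.
        public static IEnumerable<Token> Lex(string input)
        {
            for (int i = 0; i < input.Length; i++)
            {
                char c = input[i];
                if (char.IsWhiteSpace(c)) continue;
                if (c == '&') yield return new AndToken();
                else if (c == '|') yield return new OrToken();
                else if (c == '!') yield return new NotToken();
                else if (c == '(') yield return new LeftParenToken();
                else if (c == ')') yield return new RightParenToken();
                else if (c == '-' && i + 1 < input.Length && input[i + 1] == '>')
                {
                    i++;
                    yield return new ImpliesToken();
                }
                else if (char.IsLetter(c))
                {
                    int start = i;
                    while (i + 1 < input.Length && char.IsLetterOrDigit(input[i + 1])) i++;
                    yield return new IdentifierToken(input.Substring(start, i - start + 1));
                }
                else throw new ArgumentException("Unexpected character: " + c);
            }
        }
    }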

Second, figure out what the grammar of your language is, and write a recursive descent parser that turns an IEnumerable<Token> into an abstract syntax tree representing the grammatical entities of your language.

Then write an analyzer that looks at that tree and figures stuff out, like "how many distinct free variables do I have?"
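
A sketch of those two steps together, reusing the Token classes from the lexer sketch above. The precedence order ('!' binds tightest, then '&', then '|', then '->') is my own assumption, and the binary operators are merged into one tagged node class purely to keep the sketch short; one class per operator is cleaner:

    using System;
    using System.Collections.Generic;

    abstract class Expression { }
    sealed class IdentifierExpression : Expression
    {
        public readonly string Name;
        public IdentifierExpression(string name) { Name = name; }
    }
    sealed class NotExpression : Expression
    {
        public readonly Expression Operand;
        public NotExpression(Expression operand) { Operand = operand; }
    }
    sealed class BinaryExpression : Expression
    {
        public readonly string Op; // "&", "|" or "->"
        public readonly Expression Left, Right;
        public BinaryExpression(string op, Expression left, Expression right)
        { Op = op; Left = left; Right = right; }
    }

    class Parser
    {
        readonly List<Token> tokens;
        int pos;
        public Parser(IEnumerable<Token> source) { tokens = new List<Token>(source); }

        Token Peek { get { return pos < tokens.Count ? tokens[pos] : null; } }

        public Expression ParseFormula()
        {
            Expression e = ParseImplies();
            if (Peek != null) throw new ArgumentException("Unexpected trailing input");
            return e;
        }

        Expression ParseImplies() // lowest precedence, right-associative
        {
            Expression left = ParseOr();
            if (Peek is ImpliesToken) { pos++; return new BinaryExpression("->", left, ParseImplies()); }
            return left;
        }

        Expression ParseOr()
        {
            Expression e = ParseAnd();
            while (Peek is OrToken) { pos++; e = new BinaryExpression("|", e, ParseAnd()); }
            return e;
        }

        Expression ParseAnd()
        {
            Expression e = ParseUnary();
            while (Peek is AndToken) { pos++; e = new BinaryExpression("&", e, ParseUnary()); }
            return e;
        }

        Expression ParseUnary()
        {
            if (Peek is NotToken) { pos++; return new NotExpression(ParseUnary()); }
            if (Peek is LeftParenToken)
            {
                pos++;
                Expression e = ParseImplies();
                if (!(Peek is RightParenToken)) throw new ArgumentException("Expected ')'");
                pos++;
                return e;
            }
            IdentifierToken id = Peek as IdentifierToken;
            if (id != null) { pos++; return new IdentifierExpression(id.Name); }
            throw new ArgumentException("Unexpected token");
        }
    }

    static class Analyzer
    {
        // "How many distinct free variables do I have?" -- walk the tree.
        public static void CollectVariables(Expression e, HashSet<string> vars)
        {
            IdentifierExpression id = e as IdentifierExpression;
            if (id != null) { vars.Add(id.Name); return; }
            NotExpression not = e as NotExpression;
            if (not != null) { CollectVariables(not.Operand, vars); return; }
            BinaryExpression bin = (BinaryExpression)e;
            CollectVariables(bin.Left, vars);
            CollectVariables(bin.Right, vars);
        }
    }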

Then write a code generator that spits out the code necessary to evaluate the truth tables. Spitting IL seems like overkill, but if you wanted to be really buff, you could. It might be easier to let the expression tree library do that for you; you can transform your parse tree into an expression tree, and then turn the expression tree into a delegate, and evaluate the delegate.
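
The expression-tree version is surprisingly small. Here is a sketch, assuming the Expression node classes from the parser sketch above; System.Linq.Expressions does the actual code generation, and the alias avoids the name clash with our own Expression class:

    using System;
    using System.Collections.Generic;
    using LinqExpr = System.Linq.Expressions.Expression;
    using ParamExpr = System.Linq.Expressions.ParameterExpression;

    static class Evaluator
    {
        // Compiles our AST into a delegate taking an array of truth values,
        // where varNames[i] names the variable stored at position i.
        public static Func<bool[], bool> Compile(Expression ast, IList<string> varNames)
        {
            ParamExpr arg = LinqExpr.Parameter(typeof(bool[]), "values");
            return LinqExpr.Lambda<Func<bool[], bool>>(Build(ast, varNames, arg), arg).Compile();
        }

        static System.Linq.Expressions.Expression Build(Expression e, IList<string> names, ParamExpr arg)
        {
            IdentifierExpression id = e as IdentifierExpression;
            if (id != null)
                return LinqExpr.ArrayIndex(arg, LinqExpr.Constant(names.IndexOf(id.Name)));
            NotExpression not = e as NotExpression;
            if (not != null)
                return LinqExpr.Not(Build(not.Operand, names, arg));
            BinaryExpression bin = (BinaryExpression)e;
            System.Linq.Expressions.Expression left = Build(bin.Left, names, arg);
            System.Linq.Expressions.Expression right = Build(bin.Right, names, arg);
            switch (bin.Op)
            {
                case "&": return LinqExpr.AndAlso(left, right);
                case "|": return LinqExpr.OrElse(left, right);
                default:  return LinqExpr.OrElse(LinqExpr.Not(left), right); // "->" as (!a || b)
            }
        }
    }

The compiled delegate is then invoked once per row of inputs; no IL-spitting required.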

Good luck!

Eric Lippert
You mentioned using an IEnumerable<Token> to represent the token stream. What would you suggest using to represent the AST?
KingNestor
A program is a sequence of tokens, but only ONE abstract syntax tree. Usually what you do is define a "program" node that can contain any possible program, but in your case the grammar will be really simple; it'll probably just be binary expressions. I'd just have a base class Expression, and then a bunch of derived classes, OrExpression, ImpliesExpression, IdentifierExpression, and so on. An OrExpression has two children, which are themselves Expressions. And so on.
Eric Lippert
So that's a compiler in less than 1000 words .. brilliant stuff
flesh
Eric - it sounds like the hardest step above is the semantic analysis (if I have understood correctly, the translation of the IEnumerable to a tree with meaning). How do you suggest going about this? I understand the concept of an AST, but how do you know where to start within the IEnumerable?
flesh
+1  A: 

As Mehrdad mentions, you should be able to hand-roll the parsing in the same time it would take to learn the syntax of a lexer/parser generator. The end result you want is an abstract syntax tree (AST) of the expression you have been given.

You then need to build an input generator that creates all input combinations for the symbols defined in the expression.

Then iterate across the input set, generating the result for each input combination according to the rules (the AST) you parsed in the first step.
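
Generating the combinations is a simple counting trick: treat bit j of a counter as the value of the j-th symbol, so n symbols give 2^n rows. A C# sketch (the bit-mask approach is my own choice):

    using System;
    using System.Collections.Generic;

    static class InputGenerator
    {
        // Yields every combination of truth values for n variables:
        // bit j of the counter becomes the value of variable j.
        public static IEnumerable<bool[]> Combinations(int n)
        {
            for (long mask = 0; mask < (1L << n); mask++)
            {
                bool[] row = new bool[n];
                for (int j = 0; j < n; j++)
                    row[j] = (mask & (1L << j)) != 0;
                yield return row;
            }
        }
    }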

How I would do it:

I could imagine using lambda functions to express the AST/rules as you parse the expression, building a symbol table as you go. You could then build the input set and pass each combination through the lambda expression tree to calculate the results.
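
To make that concrete, here is a sketch of the lambda-composition idea; the names are mine, and a real parser would call these helpers wherever a tree-building parser would construct nodes:

    using System;
    using System.Collections.Generic;

    // Each parse step returns a Func<bool[], bool> instead of a tree node;
    // composing them while parsing leaves you with a single evaluator for
    // the whole expression, and the symbol table fills in along the way.
    class LambdaBuilder
    {
        public readonly Dictionary<string, int> SymbolTable = new Dictionary<string, int>();

        public Func<bool[], bool> Variable(string name)
        {
            if (!SymbolTable.ContainsKey(name))
                SymbolTable[name] = SymbolTable.Count; // next free slot
            int index = SymbolTable[name];
            return values => values[index];
        }

        public Func<bool[], bool> And(Func<bool[], bool> l, Func<bool[], bool> r)
        {
            return values => l(values) && r(values);
        }

        public Func<bool[], bool> Or(Func<bool[], bool> l, Func<bool[], bool> r)
        {
            return values => l(values) || r(values);
        }

        public Func<bool[], bool> Not(Func<bool[], bool> e)
        {
            return values => !e(values);
        }
    }

Feed each row from the input generator into the composed delegate and you have your results.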

Simeon Pilgrim
A: 

If your goal is processing boolean expressions, a parser generator and all the machinery that goes with it is a waste of time, unless you want to learn how such tools work (in which case any of them would be fine).

But it is easy to build a recursive-descent parser by hand for boolean expressions that computes and returns the result of "evaluating" the expression. Such a parser can be used in a first pass to determine the number of unique variables, where "evaluate" means "count 1 for each new variable name". Writing a generator to produce all possible truth values for N variables is trivial; for each set of values, simply call the parser again and use it to evaluate the expression, where "evaluate" now means "combine the values of the subexpressions according to the operator".

You need a grammar:

formula = disjunction ;
disjunction = conjunction 
              | disjunction "or" conjunction ;
conjunction = term 
              | conjunction "and" term ;
term = variable 
       | "not" term 
       |  "(" formula ")" ;

Yours can be more complicated, but for boolean expressions it can't be that much more complicated.

For each grammar rule, write one subroutine that uses a global "scan" index into the string being parsed:

  int disjunction()
  // returns -1 ==> "not a disjunction" (syntax error)
  // in evaluation mode ("mode 1"):
  // returns 0 if the disjunction is false
  // returns 1 if the disjunction is true
  { skipblanks(); // advance scan past blanks (duh)
    temp1 = conjunction();
    if (temp1 == -1) return -1; // syntax error
    while (true)
      { skipblanks();
        if (matchinput("or") == false) return temp1;
        temp2 = conjunction();
        if (temp2 == -1) return -1; // syntax error: "or" not followed by a conjunction
        temp1 = temp1 or temp2;
      }
  }

  int term()
  { skipblanks();
    if (inputmatchesvariablename())
       { variablename = getvariablenamefrominput();
         if (unique(variablename)) numberofvariables += 1;
         return lookupvariablename(variablename); // get truth table value for name
       }
     ...
  }

Each of your parse routines will be about this complicated. Seriously.

Ira Baxter
A: 

You can get the source code of the pyttgen program at http://code.google.com/p/pyttgen/source/browse/#hg/src - it generates truth tables for logical expressions. The code is based on the ply library, so it's very simple :)

RANUX