I am looking for the best LALR parser generator for C++, one that will let me produce really good error messages. I really dislike the syntax errors that MySQL generates, and I want to take its parser and replace it with a "lint" checker that will tell me more than just:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'a from users' at line 1

I have used YACC/LEX and BISON/FLEX. Whatever I use has to work on Mac or Linux.

+2  A: 

Why do you require LALR? One of the benefits of LL(k) parsers is that they can often make it easier to generate clear error messages. Most grammars that can be parsed by an LALR parser can be easily refactored to be parsable by an LL(k) parser.

ANTLR is a popular LL(k) parser generator that can generate C++ (as well as a number of other languages). From Chapter 10 of The Definitive ANTLR Reference:

The quality of a language application’s error messages and recovery strategy often makes the difference between a professional application and an amateurish application. Error recovery is the process of recovering from a syntax error by altering the input stream or consuming symbols until the parser can restart in a known state. Many hand-built and many non-LL-based recognizers emit less than optimal error messages, whereas ANTLR-generated recognizers automatically emit very good error messages and recover intelligently, as shown in this chapter.
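The recovery strategy the quote describes, consuming symbols until the parser can restart in a known state, is often called panic-mode recovery. Below is a minimal hand-written sketch of the idea, not ANTLR's actual implementation. The toy grammar, `ParseResult`, and `parseProgram` are all hypothetical names made up for illustration, and the tokens are pre-split strings so the example can focus on recovery alone.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy grammar, invented for this sketch:
//   program   := statement*
//   statement := NAME '=' NUMBER ';'

struct ParseResult {
    int statements_ok = 0;            // statements parsed without error
    std::vector<std::string> errors;  // one message per bad statement
};

static bool isName(const std::string& t) {
    return !t.empty() && std::isalpha(static_cast<unsigned char>(t[0]));
}

static bool isNumber(const std::string& t) {
    return !t.empty() && std::isdigit(static_cast<unsigned char>(t[0]));
}

ParseResult parseProgram(const std::vector<std::string>& toks) {
    ParseResult r;
    const std::size_t n = toks.size();
    std::size_t i = 0;
    while (i < n) {
        // Match one statement; on failure, i is left on the offending token.
        std::string expected;
        if (!isName(toks[i]))                    expected = "a name";
        else if (++i >= n || toks[i] != "=")     expected = "'='";
        else if (++i >= n || !isNumber(toks[i])) expected = "a number";
        else if (++i >= n || toks[i] != ";")     expected = "';'";

        if (expected.empty()) {  // whole statement matched
            ++r.statements_ok;
            ++i;                 // step past ';'
            continue;
        }

        // Report the error using the 0-based token index.
        const std::string found =
            i < n ? "'" + toks[i] + "'" : std::string("end of input");
        r.errors.push_back("token " + std::to_string(i) + ": found " + found +
                           ", expecting " + expected);

        // Panic-mode recovery: discard tokens up to and including the
        // next ';', which is the known state where a statement can start.
        while (i < n && toks[i] != ";") ++i;
        if (i < n) ++i;
    }
    return r;
}
```

Given the tokens for `a = 1 ; b = = ; c = 3 ;`, this reports a single error for the second statement and still parses the third, which is the point of recovering instead of stopping at the first error.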

Many grammars are also available for ANTLR, including a MySQL grammar.

Laurence Gonsalves
Good suggestion. I will go and take a look at it. Just because I have worked with LALR before does not mean that I should use it now.
Philip Schlump
Ok. I am impressed so far. I read all of Chapter 10 and it looks very promising.
Philip Schlump
This looks very promising!
Philip Schlump
I just read it, and I disagree. Classic programmer mistake: assuming the user is similar to the programmer. In this case the author seems to think that the "error consumer", i.e. the person who created the input, understands the parsing process. As an example, consider the error message he suggests for the input `(3;)` : "line 1:2 mismatched input ';' expecting ')'". Sorry, but this doesn't tell me _why_ that's expected. It's because of the '(' at line 1:1, but that important information is left out.
MSalters
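One way to get the kind of message asked for in the comment above is to have the parser remember where each '(' was opened and mention that position when the matching ')' is missing. The following is a minimal hand-written sketch, not something ANTLR generates. The toy grammar, `Parser`, and `check` are hypothetical names for illustration. It assumes single-line input and uses 0-based columns, so that ';' in `(3;)` is reported at 1:2 as in the quoted message, which puts the '(' at 1:0 under this convention.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical toy grammar, invented for this sketch:
//   expr := DIGITS | '(' expr ')'

struct Parser {
    std::string src;
    std::size_t pos = 0;
    std::string error;  // stays empty while no error has been found

    bool atDigit() const {
        return pos < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[pos]));
    }

    // Text for "what we actually saw" at the current position.
    std::string found() const {
        return pos < src.size() ? "'" + std::string(1, src[pos]) + "'"
                                : std::string("end of input");
    }

    std::string here() const { return "line 1:" + std::to_string(pos); }

    bool expr() {
        if (atDigit()) {
            while (atDigit()) ++pos;
            return true;
        }
        if (pos < src.size() && src[pos] == '(') {
            const std::size_t open = pos++;  // remember where '(' was
            if (!expr()) return false;       // inner error already recorded
            if (pos < src.size() && src[pos] == ')') {
                ++pos;
                return true;
            }
            // Say why ')' is expected by pointing back at the opener.
            error = here() + " mismatched input " + found() +
                    " expecting ')' to close the '(' opened at line 1:" +
                    std::to_string(open);
            return false;
        }
        error = here() + " mismatched input " + found() +
                " expecting a number or '('";
        return false;
    }
};

// Returns an empty string when the input parses, else the error message.
std::string check(const std::string& input) {
    Parser p;
    p.src = input;
    if (!p.expr()) return p.error;
    if (p.pos != input.size())
        return p.here() + " extraneous input " + p.found() +
               " expecting end of input";
    return "";
}
```

With this sketch, `check("(3;)")` returns "line 1:2 mismatched input ';' expecting ')' to close the '(' opened at line 1:0", while `check("((42))")` returns an empty string.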
Your answer was excellent. I have a working parser doing what I want - it is still a little rough - but I need some sleep. I have been up all night working on this.
Philip Schlump
A: 

If you find that ANTLR doesn't completely solve your problem then you might consider basil. It is an LR(1) parser generator that was designed and written to create a C++ parser.

Richard Corden
I will take a look at basil - but I have already gotten ANTLR to do a very good job.
Philip Schlump