I think this is a stupid question, but I'm just starting out with ANTLR. I put together the "SimpleCalc" grammar from their tutorials and generated it with C as the target language. I got SimpleCalcParser.c/.h and SimpleCalcLexer.c/.h as output, and I was able to compile and build them successfully. But now, how do I actually use the generated code? I'm having trouble finding anything helpful in the docs.

Below is my main() function. This is also from the tutorial.

 #include "SimpleCalcLexer.h"
 #include "SimpleCalcParser.h"   // declares pSimpleCalcParser and SimpleCalcParserNew

 int main(int argc, char * argv[])
 {

    pANTLR3_INPUT_STREAM           input;
    pSimpleCalcLexer               lex;
    pANTLR3_COMMON_TOKEN_STREAM    tokens;
    pSimpleCalcParser              parser;

    input  = antlr3AsciiFileStreamNew          ((pANTLR3_UINT8)argv[1]);
    lex    = SimpleCalcLexerNew                (input);
    tokens = antlr3CommonTokenStreamSourceNew  (ANTLR3_SIZE_HINT, TOKENSOURCE(lex));
    parser = SimpleCalcParserNew               (tokens);

    parser  ->expr(parser);

    // Must manually clean up
    //
    parser ->free(parser);
    tokens ->free(tokens);
    lex    ->free(lex);
    input  ->close(input);

    return 0;
 }

EDIT: Per the first response, I should say that I ran the program like this: "./testantlr test.txt", where test.txt contained "4+1". There was no output.

From here, how would I, for example, access the "4" in the generated syntax tree, or print out the entire syntax tree? Basically, how do I access stuff in the syntax tree that ANTLR generates?

A: 

When you run the program you posted, the first command-line argument is the name of the file to parse.

Step one, try that (run it and give it a file).

Step two, come back and edit your question but change the direction slightly. Instead of asking "how do I use the code" try asking "how do I do __________ with this" where the blank is replaced by some description of what you are trying to accomplish.

So parser->expr(parser) appears to parse the token stream coming from your file, which should produce an AST. Guessing my way through a lot of the details, I'd suggest looking at what it returns, especially the value member if it has one. There seem to be a slew of tutorials online that look similar to what you are doing, no two identical.
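
To make "look at what the rule returns" concrete, here is a hand-written stand-in for rule functions that hand a value back to their caller. This is not ANTLR-generated code: the names parse_factor, parse_term, parse_expr and eval are made up for illustration, and I'm assuming the grammar has the usual expr/term/factor shape with no whitespace between tokens.

```c
#include <ctype.h>

/* Hand-written stand-in for rule functions; every name here is hypothetical
   and nothing below comes from the ANTLR runtime. */

/* factor : NUMBER ;  returns the number's value and advances *p past it */
static int parse_factor(const char **p)
{
    int value = 0;
    while (isdigit((unsigned char)**p)) {
        value = value * 10 + (**p - '0');
        (*p)++;
    }
    return value;
}

/* term : factor ( ('*' | '/') factor )* ; */
static int parse_term(const char **p)
{
    int value = parse_factor(p);
    while (**p == '*' || **p == '/') {
        char op = *(*p)++;
        int rhs = parse_factor(p);
        if (op == '*')
            value *= rhs;
        else if (rhs != 0)      /* this sketch just ignores division by zero */
            value /= rhs;
    }
    return value;
}

/* expr : term ( ('+' | '-') term )* ; */
static int parse_expr(const char **p)
{
    int value = parse_term(p);
    while (**p == '+' || **p == '-') {
        char op = *(*p)++;
        int rhs = parse_term(p);
        value = (op == '+') ? value + rhs : value - rhs;
    }
    return value;
}

/* Convenience wrapper: evaluate a whole string such as "4+1". */
int eval(const char *text)
{
    return parse_expr(&text);
}
```

Each function plays the role of one grammar rule, and the caller gets at the "4" (or the sum) only because the rule returns it. A parser whose rules return nothing, like the one in the question, parses silently and prints nothing.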

If all else fails, continue on with the tutorial and either 1) it will answer your questions or 2) it won't, and you can try another.

MarkusQ
A: 

I don't mean to be rude, but it seems like you don't really know what the purpose of ANTLR is. I think you need to understand this before you can attempt to use the files it generates.

The very short answer is that ANTLR is a parser generator, which means that it generates code to parse text. A common usage is to parse the text of a programming language. I haven't read the tutorial you're referring to, but I'd guess the text this parser parses is a set of calculator instructions, something like

ADD 2 4
MULTIPLY 4 8

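In order to use the program you've shown above, you would execute it like any other C program. The first parameter of main (argc) is the number of command-line arguments, and the second (argv) is the array holding them. Since the program passes argv[1] to antlr3AsciiFileStreamNew, that argument should be the name of the file containing the text to be parsed.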
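
As a small sketch of how a C program sees its command line: the main() in the question hands argv[1] to antlr3AsciiFileStreamNew, so argv[1] is treated as a file name. The helper input_path below is hypothetical (it is not part of ANTLR) and only shows the argc check that main() skips.

```c
#include <stddef.h>

/* Hypothetical helper, not part of ANTLR: pick the input file name out of
   the arguments main() receives. argv[0] is the program name, so the first
   user-supplied argument is argv[1]. */
const char *input_path(int argc, char *argv[])
{
    if (argc < 2)
        return NULL;    /* no file name was given */
    return argv[1];
}
```

Running "./testantlr test.txt" gives argc == 2 and argv[1] == "test.txt". Running it with no argument gives argc == 1, in which case the helper returns NULL instead of handing a missing name to the input stream.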

In order to learn ANTLR from the ground up, I recommend you read the book by Terence Parr, the author of ANTLR.

Don
+1  A: 

I faced the same perplexity when I first took a crack at it. It's a pretty obvious question, which makes it all the stranger that tutorials don't seem to address it explicitly and straightforwardly.

The way out that I found is the 'returns' keyword:

token returns [TreeNode value]
    :    WORD { $value = new TreeNode( "word", $WORD.Text ); }
    |    INT { $value = new TreeNode( "int", $INT.Text ); }
    ;

WORD:    ('a'..'z'|'A'..'Z')+;
INT :    ('0'..'9')+;

TreeNode is a class that I made. Where it got tricky was how to do this with a sequence of, say, multiple tokens. The solution I came up with was recursion:

expr returns [Accumulator value]
    :   a=token  (WS+ b=expr)?
    {
        if( b != null )
        {
            $value = new Accumulator( "expr", a.value, b.value );
        } else
        {
            $value = new Accumulator( "expr", a.value );
        }
    }
    ;

Accumulator is a class that I made that has two different constructors. One constructor encapsulates a single token, and the other encapsulates a single token and another Accumulator instance. Notice the rule itself is defined recursively, and that b.value is an Accumulator instance. Why? Because b is an expr, and the definition of expr has returns [Accumulator value].
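
Since the question targets C, here is a rough C analogue of that recursive grouping. It is a hand-written sketch, not ANTLR output: the struct and the functions accumulate, accumulator_length and accumulator_free are hypothetical names, and the tokens are assumed to arrive as an array of strings.

```c
#include <stdlib.h>

/* Hypothetical C stand-in for the Accumulator class described above. */
typedef struct Accumulator {
    const char *token;          /* the single token this node wraps */
    struct Accumulator *rest;   /* accumulator for the remaining tokens, or NULL */
} Accumulator;

/* Mirrors "expr : token (WS+ expr)?": wrap the first token, then recurse
   on whatever follows. Returns NULL when count is 0; if malloc fails,
   the chain simply ends early. */
Accumulator *accumulate(const char *const *tokens, size_t count)
{
    if (count == 0)
        return NULL;
    Accumulator *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->token = tokens[0];
    node->rest = accumulate(tokens + 1, count - 1);
    return node;
}

/* Number of tokens gathered in the tree. */
size_t accumulator_length(const Accumulator *acc)
{
    return acc == NULL ? 0 : 1 + accumulator_length(acc->rest);
}

/* Release every node; the token strings are not owned by the tree. */
void accumulator_free(Accumulator *acc)
{
    while (acc != NULL) {
        Accumulator *next = acc->rest;
        free(acc);
        acc = next;
    }
}
```

The shape is the same as in the grammar: a node holds one token plus an optional nested accumulator, so walking the rest pointers visits the tokens in input order.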

The resulting tree is a single Accumulator instance that has grouped up all the tokens. To actually use that tree, you do some setup and then call the parser method named after the rule you want to parse your content with:

Antlr.Runtime.ANTLRStringStream stringstream =  new Antlr.Runtime.ANTLRStringStream( script );
TokenLexer lexer = new TokenLexer( stringstream );
Antlr.Runtime.CommonTokenStream tokenstream = new Antlr.Runtime.CommonTokenStream( lexer );
TokenParser parser = new TokenParser( tokenstream );

Accumulator grandtree = parser.expr().value;

Hope this helps people who run into the same confusion.

amr
A: 

Take a look at Scott Stanchfield's part 8 video: http://vimeo.com/groups/29150/videos/8377479. He does it in Java, but the same principle can be applied in C(++).

eisbaw