I have written a parser in Boost.Spirit and now I wish to write the same thing in ANTLR 3.1.1, both to see whether there are any performance gains and because it might be a better way to go: ANTLR can also generate parsers in many other languages besides C++ (the current 3.x branch actually has no C++ target).

ANTLR 3.1.1 is itself built using 2.7.x, yet it supports LL(*) rather than LL(k), so if we go with ANTLR it will be with the 3.x tree.

I have been successful generating and using the Java examples, and they are my main source for reverse-engineering how to do the C/C++ stuff.

Here are my problems:

  • no native C++ code generation -- you have to use hacks like extern and compile C as C++

  • even getting some things working with the C target produces a bunch of warnings, which doesn't make me feel good about what I'm doing

  • all the examples I've looked at and read for 3.x refer to old code, so I end up sifting through core dumps and the doxygen documentation to figure out which variable was renamed
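
For readers who haven't met the extern hack mentioned in the first point, here is a minimal sketch of the usual guard pattern that lets the same declarations be used from both C and C++. It is generic C, not taken from ANTLR, and the function name walrus_glue_token_count is hypothetical, made up purely for illustration.

```c
/* Generic extern "C" guard; nothing here comes from the ANTLR runtime. */
#ifdef __cplusplus
extern "C" {   /* seen only by a C++ compiler: keep C linkage, no name mangling */
#endif

/* Hypothetical helper: counts runs of lowercase letters in text. */
int walrus_glue_token_count(const char *text);

#ifdef __cplusplus
}
#endif

int walrus_glue_token_count(const char *text)
{
    int count = 0;
    int in_word = 0;

    for (; *text != '\0'; text++) {
        int is_letter = (*text >= 'a' && *text <= 'z');
        if (is_letter && !in_word)
            count++;            /* a new run of letters starts here */
        in_word = is_letter;
    }
    return count;
}
```

The block compiles unchanged with either gcc or g++, which is the whole point of the guard.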

The best resources I've found so far are:

http://www.antlr.org/wiki/display/ANTLR3/Five+minute+introduction+to+ANTLR+3

http://rails.wincent.com/wiki/C_language_target_(ANTLR_3.0_prerelease)

Both of these, however, have already made me resort to various hacks.

So my question is -- can someone point me to a full example from grammar --> code generation --> working executable?

+7  A: 

Answering my own question here, although it's hardly a good answer.

You'll need the following two files:

main.c

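#include <stdio.h>   /* fprintf, printf */
#include <stdlib.h>  /* exit, EXIT_FAILURE */
#include <string.h>  /* strlen */

#include "WalrusLexer.h"
#include "WalrusParser.h"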

int main(int argc, char *argv[])
{   
    pANTLR3_UINT8               input_string = (pANTLR3_UINT8)"hello world";
    pANTLR3_INPUT_STREAM        stream = antlr3NewAsciiStringInPlaceStream(input_string, strlen("hello world"), (pANTLR3_UINT8)"hello world");

    if (stream == (pANTLR3_INPUT_STREAM)ANTLR3_ERR_NOMEM)
    { 
      fprintf(stderr, "no memory\n");
      exit(EXIT_FAILURE);
    }
    else if (stream == (pANTLR3_INPUT_STREAM)ANTLR3_ERR_NOFILE)
    { 
      fprintf(stderr, "file not found\n");
      exit(EXIT_FAILURE);
    }

    pWalrusLexer                lexer  = WalrusLexerNew(stream);

    /* modified the next line from the wincent article */
    pANTLR3_COMMON_TOKEN_STREAM tokens = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lexer));

    pWalrusParser               parser = WalrusParserNew(tokens);

    parser->r(parser);

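    printf("yeh!\n");

    /* release everything in reverse order of creation */
    parser->free(parser);
    tokens->free(tokens);
    lexer->free(lexer);
    stream->close(stream);

    return 0;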
}

Walrus.g

grammar Walrus;
options { language = C; }
r: ID ' ' ID;
ID: 'a'..'z'+
    {printf("found a word! \%s\n", GETTEXT()->chars);}
;
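
To make it concrete what this grammar accepts, here is a hand-written sketch of the same rule in plain C, with no ANTLR involved. The names match_id and matches_r are made up for illustration. One assumption to note: this version requires the whole string to match, which is stricter than rule r above, since r has no EOF at the end.

```c
#include <assert.h>
#include <stddef.h>

/* Consume one ID ('a'..'z'+). Returns a pointer just past it, or NULL if
 * there is no lowercase letter at p. */
static const char *match_id(const char *p)
{
    const char *start = p;

    while (*p >= 'a' && *p <= 'z')
        p++;
    return (p > start) ? p : NULL;
}

/* Returns 1 if input is exactly ID ' ' ID, otherwise 0. */
int matches_r(const char *input)
{
    const char *p;

    assert(input != NULL);

    p = match_id(input);            /* first ID */
    if (p == NULL || *p != ' ')     /* then exactly one space */
        return 0;
    p = match_id(p + 1);            /* second ID */
    return p != NULL && *p == '\0'; /* nothing may follow (assumption) */
}
```

With this, matches_r("hello world") returns 1, while "hello", "hello  world" (two spaces) and "Hello world" all return 0.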

I'm using ANTLR 3.1.1, which is installed under /usr/local, so I add its jars to my classpath before entering screen, like so:

export CLASSPATH="$CLASSPATH:/usr/local/antlr-3.1.1/lib/antlr-3.1.1.jar:/usr/local/antlr-3.1.1/lib/stringtemplate-3.2.jar:/usr/local/antlr-3.1.1/lib/antlr-2.7.7.jar"

Also, for some strange reason I had to statically link my ANTLR libraries (no, I'm not on a Mac). I haven't found the reason for that, so I ended up just symlinking the static library via:

  ln -s libantlr3c.a libantlr.a

in /usr/local/lib.

After you are all set up, we generate our C code from our grammar file. C++ generation is, AFAIK, only supported on the 2.7 branch, and we want to use 3.x because of its LL(*) capabilities.

So we go ahead and type the following:

java org.antlr.Tool Walrus.g 

This should spit out four source files, a .c file and a .h header each for the lexer and the parser (WalrusLexer.c, WalrusLexer.h, WalrusParser.c, WalrusParser.h). Now we can go ahead and compile them via:

g++ main.c WalrusParser.c WalrusLexer.c -lantlr -o walrus_parser

And run! There is much more to learn, but this has got me going, and I feel that I'll be able to match my Boost.Spirit implementation within a couple of days.

feydr