ansaurus

Question

Is the ANTLR parser generator best for a C++ app with constrained memory?

Answer 1

A:

ANTLR parsers, and in fact any parser built with something LALR or the like, tend to be big. Do you have an actual grammar for this? It looks like it might be most readily parsed with a hand-written recursive-descent parser, but it's not much of a sample.

Oops, my mistake, as ANTLR apparently generates recursive-descent. Still, I've had problems with ANTLR generating big parsers.

Charlie Martin 2009-04-21 00:19:39

It's only a tiny sample -- we do have a hand written parser for the current format, but it's become so complicated that maintenance and adding new features is an abject nightmare. I'm wondering whether parsing it with ANTLR would be easier, or harder still.I've built an LR grammar for our format in ANTLRworks and it seems to syntax-check okay, but the next huge step would be to actually hook up C actions and link it into our program.

Crashworks 2009-04-21 00:30:17

How big is the generated parser? The ugly part is usually the parser structure itself.

Charlie Martin 2009-04-21 00:54:42

You know, you just convinced me to move teh ANTLR book up my to-read stack. ANTLR v3 might solve some of the problems I was seeing in 2005-2006.

Charlie Martin 2009-04-21 02:12:25

Answer 2

+1 A:

We use Boost Spirit successfully in our application. The Boost license is a very liberal one, so there is no problem using it in commercial applications.

Quote from the documentation:

Spirit is an object-oriented recursive-descent parser generator framework implemented using template meta-programming techniques. Expression templates allow us to approximate the syntax of Extended Backus-Normal Form (EBNF) completely in C++. The Spirit framework enables a target grammar to be written exclusively in C++. Inline EBNF grammar specifications can mix freely with other C++ code and, thanks to the generative power of C++ templates, are immediately executable. In retrospect, conventional compiler-compilers or parser-generators have to perform an additional translation step from the source EBNF code to C or C++ code.

lothar 2009-04-21 00:49:54

Answer 3

A:

Then why don't you use flex/yacc? It generates C code,can be run from MSVC, was developped with efficiency in mind, can have malloc overriden (google for yymalloc), they are themselves GPL, but the resulting code (the code you use in your project) AFAIK not.

Or use a hand-made parser.

jpalecek 2009-04-21 06:58:47

Answer 4

A:

ANTLR 3 doesn't support C++; it claims to generate straight C but the docs on getting it to actually work are sort of confusing.

It does generate C, and furthermore, it works with Visual Studio and C++. I know this because I've done it before and submitted a patch to get it to work with stdcall.

Memory is at a huge premium in our app and even tiny leaks are fatal. I need to be able to override the parser's memory allocator to use our custom malloc(), or at the very least I need to give it a contiguous pool from which it draws all its memory (and which I can deallocate en bloc afterwards). I can spare about 200kb for the parser executable itself, but whatever dynamic heap it allocates in parsing has to get freed afterwards.

The antlr3c runtime, last time I checked does not have a memory leak, and uses the Memory pool paradigm which you describe. However, it does have one shortcoming in the API which the author refuses to change, which is that if you request the string of a node, it will create a new copy each time until you free the entire parser.

I have no comment on the ease of using a custom malloc, but it does have a macro to define what malloc function to use in the entire project.

As for the executable size, my compilation was about 100 kb in size including a small interpreter.

My suggestion to you is to keep learning ANTLR, because it still fits your requirements and you probably need to sacrifice a little more time before it will start working for you.

Unknown 2009-04-21 07:09:11

Answer 5

A:

Realistically, if your grammar is relatively small, and doesn't contain many ambiguities or parsing complexities then it shouldn't matter if you use a recursive descent parser or a shift-reduce parser.

I would say look at ANTLR and Spirit, also look at Flex and Bison. There are other, less known parsers like Coco/R (contains generators for a lot of languages, including C++) also.

2009-06-17 15:31:50

Bison's no good; we must be GPL-free.

Crashworks 2009-06-17 19:44:46

Answer 6

A:

A hand-coded recursive descent parser is actually quite fast and can be very compact. The only downside is you have to be careful to code essentally LL(1) grammars.

You can hand code such parsers as plain recursive C code. If you are really tight on space, you can define a parsing virtual machine, and build a tiny C interpreter to run it. I used to build BASIC interpreters this way back the 70s.

The ideas came from a 1964 paper on metacompilers by Val Schorre, who shows how to build complete compilers in 10 pages Shorres tiny parser generator that does pretty good recursive descent parsers. A site describing this paper and showing precisely how to build such parsers can be found at http://www.bayfronttechnologies.com/metaii.html

Ira Baxter 2009-06-27 02:15:00

ansaurus

tags:

views:

answers:

Is the ANTLR parser generator best for a C++ app with constrained memory?

related questions