I'm looking for a good parser generator that I can use to read a custom text-file format in our large commercial app. Currently this particular file format is read with a handmade recursive parser, but the format has grown in size and complexity to the point where that approach has become unmanageable.
It seems like the ultimate solution would be to build a proper grammar for this format and then use a real parser generator like yacc to read it, but I'm having trouble deciding which generator to use, or even whether they're worth the trouble at all. I've looked at ANTLR and Spirit, but our project has specific constraints, beyond those covered in earlier answers, that make me wonder whether they're appropriate for us. In particular, I need:
- A parser generator that emits C or C++ code I can build with MSVC. ANTLR 3 doesn't support C++; it claims to generate straight C, but the docs on getting that to actually work are sort of confusing.
- Severely constrained memory usage. Memory is at a huge premium in our app and even tiny leaks are fatal. I need to be able to override the parser's memory allocator to use our custom malloc(), or at the very least I need to give it a contiguous pool from which it draws all its memory (and which I can deallocate en bloc afterwards). I can spare about 200 KB for the parser executable itself, but whatever dynamic heap it allocates during parsing has to be freed afterwards.
- Good performance. This is less critical, but we ought to be able to parse 100 KB of text in no more than a second on a 3 GHz processor.
- Must be GPL-free. We can't use GNU code.
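
To make the memory requirement concrete, here is a minimal sketch of the kind of contiguous pool described above: a bump allocator that hands out memory from a caller-supplied buffer and is released in one step. The names (`Arena`, `arena_init`, `arena_alloc`, `arena_reset`) are hypothetical and not part of ANTLR, Spirit, or any other library; it also assumes the supplied pool is itself suitably aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump allocator; all names are illustrative only. */
typedef struct Arena {
    unsigned char *base;  /* start of the caller-supplied pool */
    size_t capacity;      /* pool size in bytes */
    size_t used;          /* bytes handed out so far */
} Arena;

static void arena_init(Arena *a, void *pool, size_t capacity) {
    assert(a != NULL && pool != NULL);
    a->base = (unsigned char *)pool;
    a->capacity = capacity;
    a->used = 0;
}

/* Returns NULL when the pool is exhausted; never touches the system heap. */
static void *arena_alloc(Arena *a, size_t size) {
    /* Conservative alignment, relative to the start of the pool
       (assumption: the pool itself is aligned at least this strictly). */
    const size_t align = sizeof(void *) * 2;
    size_t start = (a->used + align - 1) & ~(align - 1);

    if (start > a->capacity || size > a->capacity - start) {
        return NULL;
    }
    a->used = start + size;
    return a->base + start;
}

/* Releases everything at once; individual frees are not supported. */
static void arena_reset(Arena *a) {
    a->used = 0;
}
```

A parser would be acceptable here only if every allocation it makes can be routed through something like `arena_alloc`, so that a single `arena_reset` after parsing is guaranteed to reclaim all of it.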
I like the ANTLRWorks IDE and its debugging tools, but it looks like getting ANTLR's C target to actually work with our app will be a huge undertaking. Before I embark on that palaver, is ANTLR the right tool for this job?
The text format in question looks something like:
attribute "FluxCapacitance" real constant
asset DeLorean
//comment foo bar baz
model "delorean.mdl"
animation "gullwing.anm"
references "Marty"
template TimeMachine
attribute FluxCapacitance 10
asset DeLorean
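
For reference, the lexical structure of the sample is simple. Below is a rough sketch of a tokenizer for it, based only on what the sample shows: bare words, double-quoted strings, numbers, and `//` line comments. The token kinds and function names are hypothetical, and the real format's rules (string escapes, for example) are assumed away. It returns pointers into the input buffer and allocates nothing.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds inferred from the sample above. */
typedef enum { TOK_EOF, TOK_WORD, TOK_STRING, TOK_NUMBER, TOK_ERROR } TokKind;

typedef struct {
    TokKind kind;
    const char *start;  /* points into the input; nothing is copied */
    size_t length;      /* for strings, excludes the surrounding quotes */
} Token;

/* Reads one token at *cursor and advances *cursor past it. */
static Token next_token(const char **cursor) {
    const char *p = *cursor;
    Token t;

    /* Skip whitespace and // comments (comments run to end of line). */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] == '/' && p[1] == '/') {
            while (*p != '\0' && *p != '\n') p++;
        } else {
            break;
        }
    }

    t.start = p;
    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (*p == '"') {
        /* Assumption: no escape sequences inside quoted strings. */
        t.start = ++p;
        while (*p != '\0' && *p != '"') p++;
        t.kind = (*p == '"') ? TOK_STRING : TOK_ERROR;
        t.length = (size_t)(p - t.start);
        if (*p == '"') p++;
        *cursor = p;
        return t;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p) || *p == '.') p++;
        t.kind = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        t.kind = TOK_WORD;
    } else {
        p++;  /* unknown character: consume it and report an error */
        t.kind = TOK_ERROR;
    }
    t.length = (size_t)(p - t.start);
    *cursor = p;
    return t;
}

/* Convenience check for keywords such as "asset" or "template". */
static int token_is(const Token *t, const char *text) {
    return t->length == strlen(text) && memcmp(t->start, text, t->length) == 0;
}
```

Calling `next_token` in a loop until it returns `TOK_EOF` yields the token stream that a generated or hand-written parser would consume.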