views:

117

answers:

3

Hello all,

I'm doing some small projects which involve having different syntaxes for something, however sometimes these syntaxes are so easy that using a parser generator might be overkill.

Now, when should I use a hand-made parser, and when should I use a parser generator?

Thanks,

William van Doorn

+5  A: 

There is no hard-and-fast answer, other than "use whatever is easiest for the particular situation".

My experience is that parsers tend to get more complicated over their lifetimes, so using a parser generator up front usually pays off. Even if the language doesn't get more complicated, using a generator forces you to create a formal specification of the syntax, which is itself valuable.

The downsides are that other programmers may not know how to use the generator, so it makes it difficult for others to help out, and it makes your project dependent on that generator.

Kristopher Johnson
+1  A: 

Your question title suggests that using a grammar is optional. It really isn't - even if I was going to implement a tiny language, I'd sketch out a grammar on a single sheet of paper.

As for when to use parser generators, this is really personal preference. Many people believe in hand-writing recursive descent parsers, rather than using the table-driven approach, for example. The important thing is to be comfortable in understanding the capabilities of the generator.

And don't be thinking that using parser generators is somehow the more professional, or even the easier approach. Bjarne Stroustrup when writing the first C++ compiler intended to use recursive descent, but got talked out of it by some keen colleagues at Bell Labs, much to his eventual chagrin. See section 3.3.2 of The Design and Evolution of C++ for more details.

anon
+4  A: 

It's worth coding the parser by hand if, and only if, you're super-keen to have it be extremely fast even on a machine of very modest speed. For example, in this article on the history of Turbo Pascal from before it got its name, you can see how and why the prototype impressed the small (then Danish) firm "Borland" to hire the prototype's author (Anders Hejlsberg), fully develop the compiler, and launch it as its main product, and I quote...:

with no great expectations I hit the compile key - AND THEN I WAS COMPLETELY FLOORED! My test program, that took minutes to compile and link using Digital Research’s Pascal MT+, was compiled and running before I could blink an eye! That was a great WOW moment!

Turbo Pascal's amazing compile speed -- coming first and foremost from a carefully hand-coded and highly tuned recursive descent parser coded in assembly language -- allowed it to use a very different strategy from most compilers: no separate compilation pass generating object files and libraries, and then a linker to put them together, rather, Turbo Pascal 1.0 was a single-pass compiler that directly turned source code into a single executable binary.

I remember just the same amazing experience on the tiny personal computers of that era (when a Z80, 64K or RAM, and two floppies was a lot;-) -- Turbo Pascal, with its amazing parser and the IDE and everything else, fit comfortably in memory together with a substantial program in both source and compiled form -- no floppies were needed, which meant many orders of magnitude of difference in program turnaround time.

If Hejlsberg had stuck to what was already the traditional wisdom at the time -- always use parser generators -- Turbo Pascal would probably never have emerged as a commercial product, and definitely not achieved the dominance in the Pascal world it enjoyed for years.

Of course, on a typical PC of today, such extreme parsing speed would not be needed for most compilers. Possible exceptions include compilers that must run seamlessly as part of an "interpreter-like" environment (the simple compilers for languages such as Perl and Python are typically hand-coded, to substantial extents, for that reason -- that was an implementation choice that made them viable in the '90s, although today it's not clear it's still needed), or compilers that run on very limited hardware resources, such as smartphones or low-end netbooks.

In the vast majority of cases in which you'll be writing a compiler, none of these performance considerations probably apply, and you'll be happier with a parser generator.

Alex Martelli
Turbo Pascal (on a DOS box at school) was the first compiler I ever used; before that I'd been hacking basic and 6502 assembly on Dad's Apple ][+. So, when I inherited a copy of Apple Pascal I was just *baffled* at how long it took to run. Memories.
dmckee