views:

40

answers:

1

A lot of websites states that packrat parsers can parse input in linear time.
So at the first look they me be faster than LALR parser contructed by the tools yacc or bison.

I wanted to know if the performance of packrat parsers is better/worse than the performance of LALR parser when tested with common input (like programming language source files) and not with any theoretical inputs.

Does anyone can explain the main differences between the two approaches.
Thanks!

+2  A: 

I'm not an expert at packrat parsing, but you can learn more at Wikipedia Parsing expression grammar.

I haven't dug into it so I'll assume the linear-time characterizatoin of packrat parsing is correct.

L(AL)R parsers are linear time parsers too. So in theory, neither packrat nor L(AL)R parsers are "faster".

What matters, in practice, of course, is implementation. L(AL)R state transitions can be executed in very few machine instructions ("look token code up in vector, get next state and action") so they can be extremely fast in practice. By "compiling" L(AL)R parsing to machine code, you can end up with lightning fast parsers, as shown by this 1986 Tom Pennello paper on Very Fast LR parsing. (Machines are now 20 years faster than when he wrote the paper!).

If packrat parsers are storing/caching results as they go, they may be linear time, but I'd guess the constant overhead would be pretty high, and then L(AL)R parsers in practice would be much faster. The YACC and Bison implementations from what I hear are pretty good.

If you care about the answer, read the basic technical papers closely; if you really care, then implement one of each and check out the overhead constants. My money is on L(AL)R.

An observation: most language front-ends don't spend most of their time "parsing"; rather, they spend a lot of time in lexical analysis. Optimize that (your bio says you are), and the parser speed won't matter much.

(I used to build LALR parser generators and corresponding parsers. I don't do that anymore; instead I use GLR parsers which are linear time in practice but handle arbitrary context-free grammmars. I give up some performance, but I can [and do, see bio] build dozens of parsers for many languages without a lot of trouble.).

Ira Baxter
It is possible to read the document (1986 Tom Pennello paper on Very Fast LR parsing) for free?
youllknow
Sure. Visit your local university's computer science library :-} I think the ACM will charge you only modest cost; IMHO its well worth it to make sure that ACM continues to make stuff like this available.
Ira Baxter