ansaurus

Question

What are some exotic parsing techniques?

Answer 1

+2 A:

Parser combinators are a very popular method of building parsers in functional languages such as Haskell.
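To give a rough feel for the idea, here is a minimal sketch in Python rather than Haskell. It is not taken from any particular library: the helper names (`char`, `many1`, `seq`, `alt`, `fmap`) and the tiny "number,number" grammar are made up for illustration. A parser is just a function from `(text, pos)` to `(value, new_pos)` or `None`, and combinators build bigger parsers out of smaller ones.

```python
from typing import Any, Callable, Optional, Tuple

# A parser takes (text, pos) and returns (value, new_pos), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def char(pred: Callable[[str], bool]) -> Parser:
    """Parse a single character that satisfies pred."""
    def parse(text, pos):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many1(p: Parser) -> Parser:
    """Apply p one or more times and collect the results in a list."""
    def parse(text, pos):
        results = []
        while True:
            r = p(text, pos)
            if r is None:
                break
            value, pos = r
            results.append(value)
        return (results, pos) if results else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            r = p(text, pos)
            if r is None:
                return None
            value, pos = r
            values.append(value)
        return values, pos
    return parse


def alt(*parsers: Parser) -> Parser:
    """Try each parser in order and return the first success."""
    def parse(text, pos):
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse


def fmap(f: Callable[[Any], Any], p: Parser) -> Parser:
    """Transform the value produced by p with f."""
    def parse(text, pos):
        r = p(text, pos)
        if r is None:
            return None
        value, new_pos = r
        return f(value), new_pos
    return parse


# Hypothetical grammar: number = digit+ ; pair = number ',' number
number = fmap(lambda ds: int("".join(ds)), many1(char(str.isdigit)))
comma = char(lambda c: c == ",")
pair = fmap(lambda vs: (vs[0], vs[2]), seq(number, comma, number))
```

Calling `pair("12,34", 0)` runs the composed parser from position 0. Real combinator libraries add error reporting and backtracking control on top of this basic shape.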

andri 2009-06-02 17:57:59

Answer 2

+1 A:

You might try playing with Hadoop, which is an Apache project inspired by Google's MapReduce and GFS.

http://hadoop.apache.org/core/

It won't necessarily make a single machine hum but it allows you to easily and cheaply throw more hardware at the problem.

Alternately, Erlang might be a good option if you've got a multi-core machine. It's good at concurrency and pattern matching.

http://erlang.org/

steamer25 2009-06-02 18:02:38

Answer 3

A:

No particular recommendation but I feel I should point you at this list

ShuggyCoUk 2009-06-02 18:03:28

Answer 4

A:

Read the Dragon Book: http://www.amazon.com/Compilers-Principles-Techniques-Alfred-Aho/dp/0201100886

It covers lexical and syntactical analysis in depth (among other topics). You can use this to help you understand the "language" you are trying to parse to determine the best way to go about it.

mmorrisson 2009-06-02 18:12:08

Answer 5

+1 A:

Wikipedia has a nice overview about parser types, here: http://en.wikipedia.org/wiki/Parser

And a comparison about parser generator tools, here: http://en.wikipedia.org/wiki/Comparison_of_parser_generators

I think GLR is a lesser-known method that is interesting because it can handle ambiguous grammars.

Vizu 2009-06-02 18:13:41

Answer 6

+3 A:

If you're looking to maximize speed, then you might do better to use OcamlYacc/FsYacc over ANTLR. ~~OcamlYacc creates LL(1) parsers, which typically have better performance than ANTLR-style LL(*) parsers (someone can correct me if I'm wrong).~~ [Edit to add:] Looks like someone corrected me: OCamlYacc produces LALR(1) parsers. I can't say with any confidence whether OcamlYacc parsers are faster than ANTLR parsers.

OCaml/F# are very good languages for building a DSL, and in my opinion much more appropriate for the job than Java, mostly because it's ridiculously easy to create and traverse an AST represented as a union data structure. I recommend this tutorial, which demonstrates how to parse SQL in F#.
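As a rough illustration of the "AST as a union" idea, here is a sketch in Python. In OCaml or F# this would be a single variant (discriminated union) declaration traversed with pattern matching; the Python version below only approximates that with dataclasses and `isinstance` checks. The node names `Num`, `Add`, and `Mul` are hypothetical and not taken from the tutorial mentioned above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "union": an expression is exactly one of these node kinds.
Expr = Union[Num, Add, Mul]


def evaluate(e: Expr) -> int:
    """Traverse the tree, handling one case per node kind."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mul):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError(f"not an expression node: {e!r}")


def show(e: Expr) -> str:
    """Render the tree back to a fully parenthesized string."""
    if isinstance(e, Num):
        return str(e.value)
    if isinstance(e, Add):
        return f"({show(e.left)} + {show(e.right)})"
    if isinstance(e, Mul):
        return f"({show(e.left)} * {show(e.right)})"
    raise TypeError(f"not an expression node: {e!r}")
```

Each traversal is one function with one branch per node kind, which is the property that makes union types pleasant for this job. In OCaml/F# the compiler can also warn when a case is missing, which this Python sketch cannot do.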

Juliet 2009-06-02 18:28:04

I don't have much experience with the different parsers out there, but everything I've read says the time/space complexity favors LL(1), and that it also incidentally solves a few problems found in LL(k)/LL(*).

feydr 2009-06-02 19:12:22

This is incorrect. ocamlyacc, like yacc and bison, produces LALR(1), not LL(1).

Bob Aman 2009-06-12 20:18:01

Looking for a SQL parser, and I like F#, so +1

James Hugard 2009-07-14 02:35:22

Answer 7

A:

Recursive Descent Parsing might work for you. It is very customizable. It may be a bit slower than yacc/antlr, but may be fast enough. The basic idea: You encode every grammar rule as a function.
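To make "every grammar rule is a function" concrete, here is a small sketch in Python. The arithmetic grammar and the names (`tokenize`, `Parser`, `evaluate`) are assumptions chosen for illustration; a poker hand-history grammar would have different rules but the same structure.

```python
import re

# Hypothetical grammar:
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    """Parse and evaluate an arithmetic expression in one pass."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value
```

Each method mirrors one grammar rule, so customizing the parser (better error messages, special cases) means editing ordinary code rather than a generated table.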

Yuval F 2009-06-02 18:37:59

ANTLR generates recursive-descent parsers (as do most LL-based parser generators)

Scott Stanchfield 2009-06-02 19:08:57

Answer 8

+3 A:

Since you're looking for exotic, read this article about Vaughan Pratt's Top Down Operator Precedence...

http://javascript.crockford.com/tdop/tdop.html
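For a quick taste of the technique, here is a small sketch in Python. It is not the code from the article: the operator table, the binding-power numbers, and the names (`PrattParser`, `parse`) are assumptions picked for illustration. The core idea it shows is that each operator has a binding power, and a single loop decides whether the next operator gets to "steal" the expression parsed so far.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/^()])")

# Binding power of each infix operator; higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20, "^": 30}
RIGHT_ASSOC = {"^"}
PREFIX_POWER = 25  # unary minus: tighter than '*', looser than '^'


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class PrattParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expression(self, min_bp=0):
        # "nud" step: handle a token that starts an expression.
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            left = self.expression(0)
            if self.take() != ")":
                raise SyntaxError("expected ')'")
        elif tok == "-":
            left = ("neg", self.expression(PREFIX_POWER))
        elif tok.isdigit():
            left = int(tok)
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

        # "led" step: keep absorbing infix operators that bind tighter
        # than the context we were called from.
        while True:
            op = self.peek()
            bp = BINDING_POWER.get(op)
            if bp is None or bp <= min_bp:
                break
            self.take()
            # For right-associative operators, recurse with a slightly lower
            # power so an equal-power operator on the right is absorbed too.
            right = self.expression(bp - 1 if op in RIGHT_ASSOC else bp)
            left = (op, left, right)
        return left


def parse(text):
    """Parse an expression into a nested-tuple tree."""
    parser = PrattParser(tokenize(text))
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

Adding a new operator is a one-line change to the table, which is a large part of the technique's appeal compared with writing one grammar rule per precedence level.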

Nosredna 2009-06-02 18:45:18

Answer 9

+1 A:

Since you are talking about using OCaml for parsing, this page gives an overview of different parsing options for that language:

Parser generators for the OCaml language

If you decide to settle for ocamlyacc (or menhir), these tutorials may be a little easier than the reference manual:

Bruno De Fraine 2009-06-05 21:19:02

Answer 10

+2 A:

You have to ask yourself if what you really want to do is play around with parsers (admittedly fun, and what I prefer myself) or if you want to actually get work done on your poker bot. Most likely, exotic parsing techniques are overkill for what you need. Just choose a fast language with some straightforward, easy-to-use parsers. You should probably be able to process 10k hands / sec with straight C + flex. Or, ocamllex + ocamlyacc should be more than enough. If you have to hadoopify your code, I think you're doing something wrong. Network latency should end up being your real bottleneck, not parsing speed. What kind of machine are you running this on?

Another alternative is using a parser generator to autogenerate a parse table, and then hand optimizing that, or hand optimizing from the NFA (you probably won't save much though, and the tradeoff in programmer time probably isn't worth it). Combinator parsing is likely going to be slower.

On average, for grammars of equivalent power, LL will be slower than LALR. In particular, if the poker hands are actually parseable by an LALR parser, then bison/byacc + flex will beat ANTLR hands down, every time. I'm personally pretty happy with menhir, though it's a raging bitch and a half to get working with godi + ocamlbuild.

--Nico

2009-06-12 17:11:13
