Choosing a Haskell parser

+8 Q:

There are many open source parser implementations available to us in Haskell. Parsec seems to be the standard for text parsing, and attoparsec seems to be a popular choice for binary parsing, but I don't know much beyond that. Is there a particular decision tree that you follow when choosing a parser implementation? Have you learned anything interesting about the strengths or weaknesses of the libraries?

+14 A:

You have several good options.

For lightweight parsing of String types:

For packed bytestring parsing, e.g. of HTTP headers:

attoparsec
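To give a feel for the packed bytestring case, here is a minimal sketch of parsing HTTP-style header lines with attoparsec. It assumes the current `Data.Attoparsec.ByteString.Char8` API; the names `header` and `parseHeaders` are made up for illustration, and the grammar is deliberately simplified (no folded headers, no trimming of trailing whitespace).

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString.Char8
  (Parser, char, endOfInput, endOfLine, many', parseOnly, skipWhile, takeTill, takeWhile1)
import Data.ByteString (ByteString)

-- | One "Name: value" line terminated by LF or CRLF.
--   Hypothetical helper for this sketch, not a full HTTP header grammar.
header :: Parser (ByteString, ByteString)
header = do
  name  <- takeWhile1 (\c -> c /= ':' && c /= '\r' && c /= '\n')
  _     <- char ':'
  skipWhile (== ' ')                                  -- optional spaces after the colon
  value <- takeTill (\c -> c == '\r' || c == '\n')    -- rest of the line
  endOfLine
  pure (name, value)

-- | Parse zero or more header lines and require that all input is consumed.
parseHeaders :: ByteString -> Either String [(ByteString, ByteString)]
parseHeaders = parseOnly (many' header <* endOfInput)
```

Because the parser works directly on a strict `ByteString`, no conversion to `String` is needed at any point.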

For actual binary data most people use either:

binary -- for lazy binary parsing
cereal -- for strict binary parsing
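As a rough illustration of the lazy binary case, the sketch below uses the `Get` monad from the binary package. The record layout (a 1-byte tag, a 2-byte big-endian length, then the payload) is invented for this example, as are the names `Record`, `getRecord` and `decodeRecord`.

```haskell
import Data.Binary.Get (Get, getLazyByteString, getWord16be, getWord8, runGetOrFail)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- | Hypothetical record layout, made up for this example:
--   1-byte tag, 2-byte big-endian length, then that many payload bytes.
data Record = Record
  { recTag     :: Word8
  , recPayload :: BL.ByteString
  } deriving (Eq, Show)

-- | Read one record from the input.
getRecord :: Get Record
getRecord = do
  tag     <- getWord8
  len     <- getWord16be
  payload <- getLazyByteString (fromIntegral len)
  pure (Record tag payload)

-- | Decode a record, reporting failure as a message instead of throwing.
decodeRecord :: BL.ByteString -> Either String Record
decodeRecord bytes =
  case runGetOrFail getRecord bytes of
    Left  (_, _, err)    -> Left err
    Right (_, _, record) -> Right record
```

cereal offers a very similar `Get` interface over strict bytestrings, so a sketch like this should port with few changes.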

The main question to ask yourself is what is the underlying string type?

String?
ByteString (strict)?
ByteString (lazy)?
Unicode text?

That decision largely determines which parser toolset you'll use.

The second question to ask is: do I already have a grammar for the data type? If so, I can just use Happy, the Happy parser generator.

And obviously for particular data formats there are a variety of good existing parsers:

XML
- haxml
- xml-light
- hxt
- hexpat
CSV
- bytestring-csv
- csv
JSON
- json
RSS/Atom
- feed

Don Stewart 2010-06-19 21:13:59

Thanks for the detailed answer.

Keith 2010-06-19 22:18:07

+5 A:

Just to add to Don's post: personally, I quite like Text.ParserCombinators.ReadP (part of base) for no-nonsense quick and easy stuff, particularly when Parsec seems like overkill.
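For a sense of what the quick and easy case looks like, here is a small sketch that uses only base. It parses a comma-separated list of integers with ReadP; the names `number`, `numberList` and `parseNumbers` are just illustrative.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP
  (ReadP, char, eof, munch1, readP_to_S, sepBy1, skipSpaces)

-- | One or more decimal digits, read as an Int.
number :: ReadP Int
number = read <$> munch1 isDigit

-- | Comma-separated integers with optional whitespace around each one.
--   The trailing eof rejects parses that stop before the end of the input.
numberList :: ReadP [Int]
numberList = sepBy1 (skipSpaces *> number <* skipSpaces) (char ',') <* eof

-- | Run the parser, accepting only a single parse that consumes everything.
parseNumbers :: String -> Maybe [Int]
parseNumbers input =
  case readP_to_S numberList input of
    [(xs, "")] -> Just xs
    _          -> Nothing
```

With this, `parseNumbers "1, 2,3"` gives `Just [1,2,3]` and `parseNumbers "1,,2"` gives `Nothing`. Note that ReadP only tells you that a parse failed, not where or why.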

There is a bytestringreadp library for the bytestring version, but it doesn't cover Char8 bytestrings, and I suspect attoparsec would be a better choice at this point.

Sam Martin 2010-06-20 00:57:35

+1 A:

Bryan O’Sullivan’s blog post "What’s in a parser? Attoparsec rewired (2/2)" includes a nice performance benchmark comparing several implementations, along with some comments comparing memory usage.

Keith 2010-06-20 15:44:17

+2 A:

I recently converted some code from Parsec to Attoparsec. Both are quite capable.

Attoparsec wins on performance and memory footprint, but Parsec provides better error reporting and has more complete documentation.
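To illustrate the error reporting side, here is a minimal Parsec sketch for a `key=value` setting. The names `key`, `intValue` and `parseSetting` are hypothetical; the `<?>` labels are what Parsec uses to build its "expecting ..." messages, together with the source position of the failure.

```haskell
import Text.Parsec (ParseError, char, digit, eof, letter, many1, parse, (<?>), (<|>))
import Text.Parsec.String (Parser)

-- | A key made of letters and underscores; the label appears in error messages.
key :: Parser String
key = many1 (letter <|> char '_') <?> "key"

-- | A non-negative integer value.
intValue :: Parser Int
intValue = (read <$> many1 digit) <?> "integer value"

-- | Parse a whole "key=value" setting; illustrative only.
parseSetting :: String -> Either ParseError (String, Int)
parseSetting = parse ((,) <$> key <* char '=' <*> intValue <* eof) "<setting>"
```

Showing the `Left` result of something like `parseSetting "retries="` prints the line and column of the failure along with the labels of what was expected, which is the kind of detail you give up for attoparsec's speed.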

Dan Dyer 2010-06-20 20:53:57
