For some search-based code (in Python), I need to write a parser for a simple, Google-like query syntax. For example:

all of these words "with this phrase" OR that OR this site:within.site filetype:ps from:lastweek

As search becomes more and more popular, I expected to easily find a Python library for this and thus avoid reinventing the wheel. Sadly, searching Google doesn't yield much.

What would you recommend as a Python parsing library for this simple task?

+1  A: 

PLY is great. It is based on the Lex/Yacc idiom and thus may already be familiar. It allows you to create arbitrarily complex lexers and parsers for any task, including the one you need.

Using a powerful tool like PLY instead of a simple toy is a good idea, because your needs can become more complex with time and you'd like to stay with the same tool.

Eli Bendersky
+4  A: 

While PLY is the more classical approach (a Pythonic variant of lex + yacc) and thus may be easier to get started with if you're already familiar with such traditional tools, pyparsing is highly Pythonic and would be my top recommendation, especially for such simple tasks. These are really more like lexing than "full-blown" parsing, at least until you want to allow possibly nested parentheses, but pyparsing won't really be troubled by those either ;-).
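
To illustrate the "more like lexing" point, here is a minimal sketch that tokenizes the example query using only the standard library `re` module. It is not pyparsing or PLY code, and the token kinds (`WORD`, `PHRASE`, `OR`, `FILTER`) and the `tokenize` helper are names assumed for this illustration only.

```python
import re

# One alternative per token kind; order matters because the first
# alternative that matches at a given position wins.
TOKEN_RE = re.compile(r'''
      "(?P<phrase>[^"]*)"              # "quoted phrase"
    | (?P<key>\w+):(?P<value>\S+)      # operator such as site:within.site
    | (?P<or>\bOR\b)                   # the OR keyword (upper case only)
    | (?P<word>[^\s"]+)                # any other bare word
''', re.VERBOSE)


def tokenize(query):
    """Split a Google-like query string into (kind, value) tuples."""
    tokens = []
    # finditer skips characters no alternative matches (e.g. whitespace).
    for m in TOKEN_RE.finditer(query):
        if m.group('phrase') is not None:
            tokens.append(('PHRASE', m.group('phrase')))
        elif m.group('key') is not None:
            tokens.append(('FILTER', (m.group('key'), m.group('value'))))
        elif m.group('or') is not None:
            tokens.append(('OR', 'OR'))
        else:
            tokens.append(('WORD', m.group('word')))
    return tokens


if __name__ == '__main__':
    query = ('all of these words "with this phrase" OR that OR this '
             'site:within.site filetype:ps from:lastweek')
    for token in tokenize(query):
        print(token)
```

This only produces a flat token list. Deciding which operands each OR binds to, or supporting nested parentheses, is where a real parser such as pyparsing or PLY starts to pay off.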

Alex Martelli
Thanks for the plug, Alex! The pyparsing examples page includes a simple search query parser (http://pyparsing.wikispaces.com/file/view/searchparser.py), and the Whoosh search library (http://whoosh.ca/) uses pyparsing for its query parsing.
Paul McGuire
+1  A: 
andrew cooke