I want to create a SQL interface on top of a non-relational data store. The store itself is not relational, but it makes sense to access the data in a relational manner.

I am looking into using ANTLR to produce an AST that represents the SQL as a relational algebra expression, and then returning data by evaluating (walking) the tree.

I have never implemented a parser before, so I would like some advice on how best to implement a SQL parser and evaluator.

  • Does the approach described above sound about right?
  • Are there other tools/libraries I should look into, such as PLY or Pyparsing?
  • Pointers to articles, books or source code that would help are appreciated.
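
To make the "evaluate by walking the tree" idea concrete, here is a minimal sketch of what such an evaluator might look like. It assumes the store hands back rows as plain dicts. The node classes (Scan, Select, Project) and the in-memory STORE are hypothetical names made up for illustration, not part of ANTLR or any library, and the tree is built by hand where a parser would normally produce it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

# Hypothetical stand-in for the non-relational store: named collections of dicts.
STORE: Dict[str, List[dict]] = {
    "users": [
        {"id": 1, "name": "ann", "age": 31},
        {"id": 2, "name": "bob", "age": 17},
        {"id": 3, "name": "cy", "age": 45},
    ]
}


@dataclass
class Scan:
    """Leaf node: read every row of one collection."""
    table: str

    def rows(self) -> Iterator[dict]:
        return iter(STORE[self.table])


@dataclass
class Select:
    """Relational selection (the WHERE clause): keep rows matching a predicate."""
    child: object
    predicate: Callable[[dict], bool]

    def rows(self) -> Iterator[dict]:
        return (row for row in self.child.rows() if self.predicate(row))


@dataclass
class Project:
    """Relational projection (the column list): keep only the named columns."""
    child: object
    columns: List[str]

    def rows(self) -> Iterator[dict]:
        return ({c: row[c] for c in self.columns} for row in self.child.rows())


# Hand-built tree for: SELECT name FROM users WHERE age >= 18
plan = Project(Select(Scan("users"), lambda r: r["age"] >= 18), ["name"])

# Walking the tree is just asking the root node for its rows.
result = list(plan.rows())
print(result)  # [{'name': 'ann'}, {'name': 'cy'}]
```

Each node pulls rows lazily from its child, so a parser would only have to emit nodes like these for the rest to work unchanged.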
+2  A: 

This reddit post suggests Python-sqlparse as an existing implementation, along with a couple of other links.

Mark Rushakoff
Thank you for the suggestion. Python-sqlparse looks interesting, I will give it a try.
codeape
+7  A: 

I have looked into this issue quite extensively. Python-sqlparse is a non-validating parser, which is not really what you need. The examples in ANTLR need a lot of work to convert to a nice AST in Python. The SQL standard grammars are here, but converting them yourself would be a full-time job, and it is likely that you would only need a subset of them (e.g. no joins). You could also try looking at Gadfly (a Python SQL database), but I avoided it as they used their own parsing tool.

For my case I essentially only needed a where clause. I tried Booleano (a boolean expression parser) written with pyparsing, but ended up using pyparsing from scratch. The first link in the reddit post from Mark Rushakoff's answer gives a SQL example using it. Whoosh, a full-text search engine, also uses it, but I have not looked at the source to see how.

Pyparsing is very easy to use, and you can easily customize it so that it is not exactly the same as SQL (you will not need most of the syntax). I did not like PLY, as it relies on some magic based on naming conventions.

In short, give pyparsing a try. It will most likely be powerful enough to do what you need, and the simple integration with Python (with easy callbacks and error handling) will make the experience pretty painless.
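
To give a feel for the where-clause case, here is a rough sketch and not the code I actually used. It assumes pyparsing is installed and uses its older camelCase names (infixNotation, oneOf, parseString). The grammar only covers column/literal comparisons joined by AND and OR, and the helper names (compile_where, make_comparison, make_and, make_or) and sample rows are made up for illustration.

```python
import operator

from pyparsing import (CaselessKeyword, QuotedString, Word, alphanums, alphas,
                       infixNotation, nums, oneOf, opAssoc)

# Map comparison symbols to Python functions.
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

AND = CaselessKeyword("and")
OR = CaselessKeyword("or")

integer = Word(nums).setParseAction(lambda t: int(t[0]))
string = QuotedString("'")
column = Word(alphas, alphanums + "_")
comparison = column + oneOf("<= >= != = < >") + (integer | string)


def make_comparison(tokens):
    """Callback: turn [column, op, literal] into a predicate over a row dict."""
    name, op, value = tokens
    return lambda row: OPS[op](row[name], value)


def make_and(tokens):
    """Callback: tokens[0] is [pred, 'and', pred, ...]; keep the predicates."""
    preds = tokens[0][0::2]
    return lambda row: all(p(row) for p in preds)


def make_or(tokens):
    """Callback: same shape as make_and, but any predicate may match."""
    preds = tokens[0][0::2]
    return lambda row: any(p(row) for p in preds)


comparison.setParseAction(make_comparison)

# AND is listed first, so it binds tighter than OR; parentheses are handled
# by infixNotation itself.
where_expr = infixNotation(comparison, [
    (AND, 2, opAssoc.LEFT, make_and),
    (OR, 2, opAssoc.LEFT, make_or),
])


def compile_where(text):
    """Parse a where clause and return a function row -> bool."""
    return where_expr.parseString(text, parseAll=True)[0]


rows = [
    {"id": 1, "name": "ann", "age": 31},
    {"id": 2, "name": "bob", "age": 17},
    {"id": 3, "name": "cy", "age": 45},
]
predicate = compile_where("age >= 18 and name != 'cy' or id = 2")
matches = [r["id"] for r in rows if predicate(r)]
print(matches)  # [1, 2]
```

The callbacks build the evaluator directly while parsing, so there is no separate tree-walking step. Input that does not fit the grammar raises pyparsing's ParseException, which is where the error handling hooks in.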

David Raznick
Thanks for sharing your experiences. From initial, very limited testing of python-sqlparse, it seems that I might be able to use it. I will try to work with the returned value from the ``parse`` function in python-sqlparse. But I will look into pyparsing in any case.
codeape
Pyparsing is a good tool for this, with lots of examples of parsing sql around.
Gregg Lind
This poster on the pyparsing wiki (http://pyparsing.wikispaces.com/message/view/home/14105203) just reported completing a SQL SELECT parser - perhaps you could contact him/her for help, suggestions, or even the code.
Paul McGuire
TFTT. I have contacted the poster.
codeape
I implemented it using pyparsing. Pyparsing worked great for this.
codeape