I'm working on a small app which will help browse the data generated by vim-logging, and I'd like to allow people to run arbitrary SQL queries against the datasets.

How can I do that safely?

For example, I'd like to let someone enter, say, SELECT file_type, count(*) FROM commands GROUP BY file_type, then send the result back to their web browser.

+3  A: 

Allowing expressive power while preventing destruction is a difficult job. If you let them enter "SELECT .." themselves, you need to prevent them from entering "DELETE .." instead. You can require the statement to begin with "SELECT", but then you also have to be sure it doesn't contain "; DELETE" somewhere in the middle.

The safest thing to do might be to connect to the database with read-only user credentials.

Ned Batchelder
That's true, although something like [sqlparse](http://code.google.com/p/python-sqlparse/) could help. The difficulty with sqlparse is that I'm not confident enough in my SQL-fu to be sure there are no corner cases I'm missing.
David Wolever
SQL parsing is hard! sqlparse is incomplete and could probably be fooled quite easily by vendor-specific constructs such as MySQL's non-standard string-literal escapes and comment syntax.
bobince
Ah, yes - I guess that I'd been assuming some sort of "strict" mode… But as you say, that's probably a pipe dream.
David Wolever
+2  A: 

In MySQL, you can create a limited user (create a new user and grant it restricted privileges) that can only access certain tables.

S.Mark
Whilst this is about as safe as you can manage, an untrusted user could still (accidentally or deliberately) create a query such as a massive cross-join that could effectively be a denial of service attack!
bobince
Yeah bobince, very true!
S.Mark
That's true, but much easier to mitigate - simply setting a timeout and only allowing a couple of concurrent queries would be all that's needed.
David Wolever
A: 

Consider using SQLAlchemy. While SQLAlchemy is arguably the greatest Object Relational Mapper ever, you certainly don't need to use any of the ORM stuff to take advantage of all of the great Python/SQL work that's been done.

As the introductory documentation suggests:

Most importantly, SQLAlchemy is not just an ORM. Its data abstraction layer allows construction and manipulation of SQL expressions in a platform agnostic way, and offers easy to use and superfast result objects, as well as table creation and schema reflection utilities. No object relational mapping whatsoever is involved until you import the orm package. Or use SQLAlchemy to write your own!

Using SQLAlchemy will give you input sanitization "for free" and let you use standard Python logic to analyze statements for safety without having to do any messy text parsing or pattern matching.

Travis Bradshaw
I'm not entirely sure I understand what you're suggesting... Could you give an example of how SA could be used to solve my problem?
David Wolever
Sure. The idea is that if you use the SQL Expression tools in SQLAlchemy, you are both platform agnostic and completely immune to things like SQL injection. I interpreted your question "how can I do this safely?" as asking not how to limit the SQL to read-only access, but how to avoid unexpected inputs such as malformed or malicious SQL statements (beyond simple "delete" statements).
Travis Bradshaw
It's also a question of interface. If you want the user to be able to "just type SQL", then the only real safety will be the permission model provided by the RDBMS. But if you're free to implement your query system however you want, you can use an abstraction layer to remove some of the tedium of SQL generation and replace it with more Pythonic programming.
Travis Bradshaw