Sanitizing user-provided SQL with Python?

tags:

python
sql

+2 Q:

I'm working on a small app which will help browse the data generated by vim-logging, and I'd like to allow people to run arbitrary SQL queries against the datasets.

How can I do that safely?

For example, I'd like to let someone enter, say, `SELECT file_type, count(*) FROM commands GROUP BY file_type`, and then send the result back to their web browser.

+3 A:

Allowing expressive power while preventing destruction is a difficult job. If you let them enter "SELECT .." themselves, you need to prevent them from entering "DELETE .." instead. You can require the statement to begin with "SELECT", but then you also have to be sure it doesn't contain "; DELETE" somewhere in the middle.

The safest thing to do might be to connect to the database with read-only user credentials.
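A minimal sketch of the read-only idea, assuming the data sits in a SQLite file (the question doesn't say which database is used). The helper names `open_read_only` and `run_user_query` are made up for illustration. With a server database the equivalent would be a login that has been granted only `SELECT`.

```python
import sqlite3
from pathlib import Path


def open_read_only(path):
    """Open a SQLite file so that any write attempt fails.

    Assumption: the dataset is a SQLite database on disk.
    `mode=ro` is SQLite's URI parameter for a read-only connection.
    """
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def run_user_query(path, sql):
    """Run one user-supplied statement and return all rows.

    The SQL text is not inspected at all; safety comes from the
    read-only connection. As a bonus, sqlite3's execute() refuses
    strings that contain more than one statement.
    """
    conn = open_read_only(path)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()
```

A `SELECT` goes through as usual, while something like `DELETE FROM commands` raises `sqlite3.OperationalError` instead of touching the data.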

Ned Batchelder 2010-01-04 04:16:48

That's true… Although using something like [sqlparse](http://code.google.com/p/python-sqlparse/) could help. The difficulty with sqlparse is that I'm not confident enough in my SQL-fu to be sure there are no corner cases I'm missing.

David Wolever 2010-01-04 06:02:34

SQL parsing is hard! sqlparse is incomplete and could probably be fooled quite easily with non-standard constructs like MySQL's wrong string literal escapes and comments.

bobince 2010-01-04 14:59:58

Ah, yes - I guess that I'd been assuming some sort of "strict" mode… But as you say, that's probably a pipe dream.

David Wolever 2010-01-04 15:51:02

+2 A:

In MySQL, you can create a limited user (create a new user and grant it limited access) that can access only certain tables.

S.Mark 2010-01-04 04:18:10

Whilst this is about as safe as you can manage, an untrusted user could still (accidentally or deliberately) create a query such as a massive cross-join that could effectively be a denial of service attack!

bobince 2010-01-04 15:03:38

Yeah bobince, very true!

S.Mark 2010-01-04 15:22:26

That's true, but much easier to mitigate - simply setting a timeout and only allowing a couple of concurrent queries would be all that's needed.

David Wolever 2010-01-04 15:50:14

Consider using SQLAlchemy. While SQLAlchemy is arguably the greatest Object Relational Mapper ever, you certainly don't need to use any of the ORM stuff to take advantage of all of the great Python/SQL work that's been done.

As the introductory documentation suggests:

Most importantly, SQLAlchemy is not just an ORM. Its data abstraction layer allows construction and manipulation of SQL expressions in a platform agnostic way, and offers easy to use and superfast result objects, as well as table creation and schema reflection utilities. No object relational mapping whatsoever is involved until you import the orm package. Or use SQLAlchemy to write your own!

Using SQLAlchemy will give you input sanitization "for free" and let you use standard Python logic to analyze statements for safety without having to do any messy text-parsing/pattern-matching.

Travis Bradshaw 2010-01-04 04:38:25

I'm not entirely sure I understand what you're suggesting... Could you give an example of how SA could be used to solve my problem?

David Wolever 2010-01-04 05:59:46

Sure, the idea is that if you use the SQL Expression tools in SQLAlchemy, then you are both platform agnostic and completely immune to things like SQL injection. That's how I interpreted your question "how can I do this safely?" Not how to limit the scope of SQL to read-only access, but how to avoid unexpected inputs like malformed SQL statements that are exploits or malicious (outside of simple "delete" statements).

Travis Bradshaw 2010-01-05 18:37:31

It's also a question of interface. If you want the user to be able to "just type SQL", then the only real safety will be the permission model provided by the RDBMS. But if you're free to implement your query system however you want, you can use an abstraction layer to remove some of the tedium of SQL generation and replace it with more Pythonic programming.
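To make the "implement your own query system" option concrete, here is a small stand-in written with the standard `sqlite3` module rather than SQLAlchemy's expression API. It is only a sketch: the `commands` table and `file_type` column come from the question, while the `command` column, the whitelist, and the function `count_by` are assumptions. The user chooses from known column names and supplies values, and never writes SQL.

```python
import sqlite3

# Assumed whitelist; only file_type appears in the question.
ALLOWED_COLUMNS = {"file_type", "command"}


def count_by(conn, column, file_type=None):
    """Count rows of `commands` grouped by a whitelisted column.

    Identifiers cannot be bound as parameters, so `column` is checked
    against the whitelist before being placed in the SQL text.
    Values are always passed as bound parameters.
    """
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column!r}")
    sql = f"SELECT {column}, count(*) FROM commands"
    params = []
    if file_type is not None:
        sql += " WHERE file_type = ?"
        params.append(file_type)
    sql += f" GROUP BY {column} ORDER BY {column}"
    return conn.execute(sql, params).fetchall()
```

This trades expressive power for safety: only the queries the code knows how to build can be run, which is the interface question raised above.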

Travis Bradshaw 2010-01-05 18:39:34
