views:

52

answers:

2

I am using Twisted to asynchronously access our database in Python. My code looks like this:

from twisted.enterprise import adbapi
from MySQLdb import _mysql as mysql

...

txn.execute("""
    INSERT INTO users_accounts_data_snapshots (accountid, programid, fieldid, value, timestamp, jobid)
    VALUES ('%s', '%s', '%s', '%s', '%s', '%s')
""" % (accountid, programid, record, mysql.escape_string(newrecordslist[record]), ended, jobid))

This worked until I came across this character: ®, which caused the thread to throw an exception: `exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 7: ordinal not in range(128)

However, if I don't use MySQLdb_mysql.escape_string(), I get database errors when input contains quotes etc (of course). The exception is occurring before the database is accessed so the collation of the database doesn't seem to matter at all.

What's the best way to escape this content without throwing exceptions on unicode characters? The ideal solution is one where I can pass unicode characters that won't interfere with the query along to MySQL unmolested; however, stripping the string of unicode characters, replacing them with question marks, mangling them or anything else that will stop the crashes would be acceptable.

+4  A: 

Do not format strings like this. It is a massive security hole. It is not possible to do the quoting correctly by yourself. Do not try.

Use the second parameter to 'execute'. Simply put, instead of txn.execute("... %s, %s ..." % ("xxx", "yyy")), do txn.execute("... %s, %s ...", ("xxx", "yyy")). Notice the comma instead of the percent sign. In other databases or with a different database binding, you might use a different character instead of "%s", like ? or :1, :2, :3 or :foo:, :bar:, :baz: but the idea is the same. (You can see the documentation for paramstyle in the DB-API 2.0 documentation if you are curious about alternatives.)

I've written about this in the past. The discussion on that post may be of particular interest to you.

Please also let me emphasize that this is the only correct way to do it. You may have seen MySQL documentation talking about quoting strings in various ways. You may have written applications in PHP which lacks a proper facility for passing database parameters. I guarantee that all of these sources of information are incorrect and lead to serious and ongoing security problems: do not interpolate parameters into your SQL strings.

Glyph
Not only does this work, but now I don't have to squash datetime objects to strings manually. Thank you very much.
Andrew G.
A: 

You can try:

newrecordslist[record].decode("utf-8")

Glyph is right about http://www.python.org/dev/peps/pep-0249/.

iddqd