I am using Twisted to asynchronously access our database in Python. My code looks like this:
from twisted.enterprise import adbapi
from MySQLdb import _mysql as mysql
...
txn.execute("""
INSERT INTO users_accounts_data_snapshots (accountid, programid, fieldid, value, timestamp, jobid)
VALUES ('%s', '%s', '%s', '%s', '%s', '%s')
""" % (accountid, programid, record, mysql.escape_string(newrecordslist[record]), ended, jobid))
This worked until I came across this character: ®, which caused the thread to throw an exception: `exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 7: ordinal not in range(128)`
However, if I don't use `mysql.escape_string()` (that is, `MySQLdb._mysql.escape_string()`), I get database errors when the input contains quotes and similar characters, as expected. The exception occurs before the database is accessed, so the collation of the database doesn't seem to matter at all.
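For reference, here is a minimal sketch that reproduces the same kind of failure with no database involved. It assumes the failure comes from an implicit ASCII encode of the unicode value; the sample string is made up and simply places ® at position 7 to match the traceback.

```python
# Hypothetical value standing in for newrecordslist[record];
# the (R) sign sits at index 7, as in the traceback above.
value = u"Example\u00ae"

try:
    # Assumption: something equivalent to this encode happens implicitly
    # when the unicode object is turned into a byte string.
    value.encode("ascii")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    position = exc.start   # index of the offending character
    reason = exc.reason    # text after the colon in the error message

# Encoding explicitly as UTF-8 does not raise, since UTF-8 can
# represent every Unicode code point.
utf8_bytes = value.encode("utf-8")
```

If that assumption holds, it would explain why the database collation never comes into play: the error is raised in Python while the string is still being prepared.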
What's the best way to escape this content without throwing exceptions on Unicode characters? The ideal solution is one where I can pass Unicode characters that won't interfere with the query along to MySQL unchanged; however, stripping non-ASCII characters from the string, replacing them with question marks, mangling them, or anything else that will stop the crashes would be acceptable.
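To make the "acceptable" fallback concrete, this is roughly what I mean by stripping or replacing, sketched with Python's standard codec error handlers. `to_ascii` is a hypothetical helper name, not something from Twisted or MySQLdb, and it only deals with the non-ASCII characters; quotes would still need escaping separately.

```python
def to_ascii(text, errors="replace"):
    """Hypothetical helper: reduce a unicode string to ASCII-only text.

    errors="replace" swaps each non-ASCII character for "?";
    errors="ignore" drops it entirely.
    """
    return text.encode("ascii", errors).decode("ascii")


# Hypothetical sample value containing the registered sign.
sample = u"Example\u00ae"

replaced = to_ascii(sample)            # non-ASCII characters become "?"
stripped = to_ascii(sample, "ignore")  # non-ASCII characters are removed
```

Both variants lose data, which is why I would much rather pass the original characters through if there is a clean way to do it.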