Hey,

I have an SQLite database into which I would like to insert values in Hebrew.

I keep getting the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal
not in range(128)

My code is as follows:

runsql(u'INSERT into personal values(%(ID)d,%(name)s)' % {'ID':1,'name':fabricate_hebrew_name()})

    def fabricate_hebrew_name():
        hebrew_names = [u'ירדן',u'יפה',u'תמי',u'ענת',u'רבקה',u'טלי',u'גינה',u'דנה',u'ימית',u'אלונה',u'אילן',u'אדם',u'חווה']
        return random.sample(hebrew_names,1)[0].encode('utf-8')

Note: runsql executes the query on the SQLite database, and fabricate_hebrew_name() should return a string that can be used in my SQL query. Any help is much appreciated.

+2  A: 

You should not encode manually, and you should not use string interpolation for queries.

Ignacio Vazquez-Abrams
+3  A: 

You are passing the fabricated names into the string formatting parameter for a Unicode string. Ideally, the strings passed this way should also be Unicode.

But fabricate_hebrew_name isn't returning Unicode; it is returning a UTF-8 encoded string, which isn't the same thing.

So, get rid of the call to encode('utf-8') and see whether that helps.
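To illustrate why the error mentions the 'ascii' codec: when Python 2 mixes a Unicode string with a byte string, it implicitly decodes the bytes as ASCII. The sketch below makes that decoding step explicit so the failure can be reproduced on purpose; the sample name is taken from the question, and the variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
name = u'דנה'                   # a Unicode string
encoded = name.encode('utf-8')  # a byte string; every Hebrew letter starts with byte 0xd7

# Python 2 performs the equivalent of this decode implicitly when the
# byte string is substituted into a Unicode format string.
try:
    encoded.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Decoding with the codec that was used for encoding round-trips cleanly.
roundtrip = encoded.decode('utf-8')
```

Here `message` holds the same kind of error text as in the question, while `roundtrip` equals the original `name`.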

The next question is what type runsql is expecting. If it is expecting Unicode, no problem. If it is expecting an ASCII-encoded string, then you will have problems because Hebrew is not ASCII. In the unlikely case that it is expecting a UTF-8-encoded string, then that is the time to convert it: after the substitution is done.
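A minimal sketch of that last case, assuming runsql really does want UTF-8 bytes: build the whole query as Unicode first and encode it once at the end. The quotes around the name placeholder are an addition for illustration, and interpolation is used here only to show the ordering of the two steps.

```python
# -*- coding: utf-8 -*-
name = u'דנה'  # stays Unicode; no .encode() here

# Substitute while everything is still Unicode...
query = u"INSERT INTO personal VALUES (%(ID)d, '%(name)s')" % {'ID': 1, 'name': name}

# ...and only then convert the finished query to UTF-8 bytes.
encoded_query = query.encode('utf-8')
```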

In another answer, Ignacio Vazquez-Abrams warns against string interpolation in queries. The idea is that, instead of doing the string substitution with the % operator, you should generally use a parameterised query and pass the Hebrew strings as parameters to it. This may have some advantages in query optimisation and in security against SQL injection.

Example

# -*- coding: utf-8 -*-
import sqlite3

# create db in memory
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name VARCHAR(42) NOT NULL)")

# insert random name
import random
fabricate_hebrew_name = lambda: random.choice([
    u'ירדן',u'יפה',u'תמי',u'ענת', u'רבקה',u'טלי',u'גינה',u'דנה',u'ימית',
    u'אלונה',u'אילן',u'אדם',u'חווה'])

cur.execute("INSERT INTO personal VALUES("
            "NULL, :name)", dict(name=fabricate_hebrew_name()))
conn.commit()

id, name = cur.execute("SELECT * FROM personal").fetchone()
print id, name
# -> 1 אלונה
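The security point can be sketched in the same way. In the snippet below, a hypothetical hostile value (made up for illustration) is passed as a parameter using sqlite3's `?` placeholder style; it is stored as plain data and is never interpreted as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name TEXT NOT NULL)")

# Hypothetical input that would break a query built with the % operator.
hostile = u"x'); DROP TABLE personal; --"

# Passed as a parameter, the value is bound as data, not parsed as SQL.
cur.execute("INSERT INTO personal VALUES (NULL, ?)", (hostile,))
conn.commit()

stored = cur.execute("SELECT name FROM personal").fetchone()[0]
tables = [row[0] for row in
          cur.execute("SELECT name FROM sqlite_master WHERE type='table'")]
```

After this runs, `stored` is the hostile string exactly as given, and `tables` still lists the personal table.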
Oddthinking
Hey, thanks for the answer. It helped me a lot and my problem is now solved :) I also got to understand a bit more about how these Hebrew string types work.
I've added a code example.
J.F. Sebastian
Thanks, J.F. I feel you and Ignacio deserve the lion's share of the rep here.
Oddthinking