Hello, I'm sure this question is not specific to Django, but I couldn't find a solution in other questions about Python and encodings, so I'm asking here. I need to add new features to an existing website that is written in PHP and uses MySQL as its backend. I inspected the database and created models for the tables I am going to use. However, there is a problem with the existing data: half of it is in Russian, and (at least it seems to me) it is stored as UTF-8. When I show that data in Django's admin, it doesn't appear right.
In [52]: p.name
Out[52]: u'\xd0\u02dc\xd0\xb3\xd0\xbe\xd1\u20ac\xd1\u0152 '
In [53]: repr(p.name)
Out[53]: "u'\\xd0\\u02dc\\xd0\\xb3\\xd0\\xbe\\xd1\\u20ac\\xd1\\u0152 '"
In the Django admin it displays like this:
Игорь
Encodings are still a bit of a mystery to me, but if I understand this output correctly, those are basically UTF-8 bytes that ended up inside a unicode object, each byte decoded as a separate character.
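To check that reading, here is a minimal sketch. The helper name `fix_mojibake` is hypothetical, and it assumes the bytes were decoded as cp1252 (which, as far as I know, is what MySQL calls latin1). The fallback for the five bytes that cp1252 leaves undefined is likewise an assumption about how MySQL maps them.

```python
# -*- coding: utf-8 -*-

def fix_mojibake(text):
    """Hypothetical helper: undo UTF-8 bytes that were decoded as cp1252."""
    raw = bytearray()
    for ch in text:
        try:
            raw.extend(ch.encode('cp1252'))
        except UnicodeEncodeError:
            # Assumption: bytes undefined in cp1252 (0x81, 0x8d, 0x8f,
            # 0x90, 0x9d) came through as U+0081 etc., so latin-1 maps
            # them back to the original byte.
            raw.extend(ch.encode('latin-1'))
    return bytes(raw).decode('utf-8')


# The value of p.name shown in the session above.
broken = u'\xd0\u02dc\xd0\xb3\xd0\xbe\xd1\u20ac\xd1\u0152 '
fixed = fix_mojibake(broken)
assert fixed == u'\u0418\u0433\u043e\u0440\u044c '  # "Игорь" plus trailing space
```

If the round trip succeeds, the stored bytes really are UTF-8 and only the decoding step on the connection is wrong.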
The question: is it possible to fix this in Django's database layer? I'm going to update existing content in these tables, and the existing PHP front-end needs to stay compatible with both the new data and the old data.
When I add the following database options, the data is displayed correctly in the admin; however, I get a UnicodeEncodeError when saving something.
DATABASE_OPTIONS = {
    'charset': 'latin1',
    'use_unicode': False,
}
The name returned in this case is:
In [2]: p2.name
Out[2]: '\xd0\x9b\xd0\xae\xd0\xa1\xd0\xaf'
I checked against a UTF-8 character table, and those are the correct characters for the data stored in that row.
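The same check can be done in the interpreter instead of by table lookup. A small sketch, assuming the byte string is UTF-8:

```python
# -*- coding: utf-8 -*-

# The raw byte string returned for p2.name with use_unicode=False.
raw = b'\xd0\x9b\xd0\xae\xd0\xa1\xd0\xaf'

# Decoding as UTF-8 yields four Cyrillic capital letters.
text = raw.decode('utf-8')
assert text == u'\u041b\u042e\u0421\u042f'  # "ЛЮСЯ"
assert len(text) == 4
```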