I've got some weird characters in my database, which seem to break Django when it renders a page. I get this error:

TemplateSyntaxError at /search/legacy/

Caught an exception while rendering: Could not decode to UTF-8 column 'maker' with text 'i� G�r'

(The actual text is slightly different, but since it is a company name I've changed it.)

How can I get Django to output this text? I'm currently running the site from SQLite (for fast development); is that the issue?

Also, on a completely unrelated note, is it possible to use a database view?

thanks

+1  A: 

Probably not.

Django uses UTF-8 strings internally, and it seems that your database is returning an invalid string. You should fix the data in the database and use UTF-8 exclusively throughout your application (data import, database, templates, source files, ...).
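As a rough sketch of what "fixing the data" could look like on SQLite: read the column as raw bytes, try to decode each value as UTF-8, and rewrite only the values that fail. This assumes Python 3's `sqlite3` module; the helper name `repair_column` and the guess that the bad bytes were originally written as Windows-1252 are assumptions, not something known about the asker's data, so check a few rows by hand before running anything like it.

```python
import sqlite3


def repair_column(conn, table, column, assumed_encoding="cp1252"):
    """Rewrite values in `column` that are not valid UTF-8.

    `assumed_encoding` is only a guess about how the bad bytes were written.
    Table and column names are interpolated into the SQL, so pass trusted
    names only.
    """
    old_factory = conn.text_factory
    conn.text_factory = bytes  # hand back TEXT values as raw bytes
    try:
        rows = conn.execute(f"SELECT rowid, {column} FROM {table}").fetchall()
    finally:
        conn.text_factory = old_factory

    fixed = 0
    for rowid, raw in rows:
        if not isinstance(raw, bytes):
            continue  # NULLs and numbers need no repair
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: reinterpret using the assumed legacy encoding.
            text = raw.decode(assumed_encoding, errors="replace")
            conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE rowid = ?",
                (text, rowid),
            )
            fixed += 1
    conn.commit()
    return fixed
```

The function returns the number of rows it rewrote; running it against a copy of the database first is the safer order of operations.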

Guillaume
I'm not an expert on encoding issues, but I'm pretty sure this is wrong. UTF-8 is an encoding. Django uses native _unicode_ strings internally.
Carl Meyer
@Carl: "Django will always assume UTF-8 encoding for internal bytestrings." -- Django manual: http://docs.djangoproject.com/en/dev/ref/unicode/#ref-unicode
R. Bemrose
Note that my previous statement isn't saying that Django uses UTF-8 internally, just that it does understand UTF-8 strings, and that they are in fact recommended for external sources (such as databases and templates).
R. Bemrose
Sorting out my weird data seems like the correct (and time-consuming) answer. Thanks for the help, guys.
A: 

I have a related problem with a site owner who uses Apple's Pages for article creation, then copies and pastes into a Django admin textbox. This process creates 'funny characters' that screw up Django and/or MySQL (you wouldn't believe the number of different double-left/right quote characters there are). I can't 'fix' the customer, so I have a function that looks for known strangeness and translates it to something useful before the text is saved. A complete PITA.
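A function of that kind might look roughly like the sketch below. It is not the actual code described above: the name `normalize_punctuation` and the particular characters in the table are illustrative assumptions, and it is written for Python 3, where `str` is already Unicode.

```python
# Map "smart" punctuation to plain ASCII equivalents.
# The entries are examples only; a real list would grow as new oddities turn up.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201a": "'",    # single low-9 quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u201e": '"',    # double low-9 quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TRANSLATION_TABLE = str.maketrans(SMART_PUNCTUATION)


def normalize_punctuation(text):
    """Return `text` with known smart punctuation replaced by ASCII."""
    return text.translate(_TRANSLATION_TABLE)
```

Calling it from wherever pasted text enters the system (for example a form's cleaning step) keeps the substitution in one place.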

Peter Rowell
A: 

That's a bit of a confusing error message, and without knowing more details I'm not clear what the source of the problem is (the error message phrasing "decode to UTF-8" seems wrong, as normally you would encode to UTF-8). Perhaps Django is expecting to find data in some other encoding and is trying to decode it and re-encode as UTF-8, but is choking on some characters that aren't valid for the encoding it's expecting?
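For what it's worth, the wording matches the error that Python's `sqlite3` module raises when a TEXT value it reads is not valid UTF-8, which would point at the stored bytes rather than at Django's template layer. The sketch below assumes Python 3 and a made-up table `product`; it reproduces the error and then reads the same row with a lenient `text_factory` as a stopgap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (maker TEXT)")
# CAST stores the bytes as TEXT without checking that they are valid UTF-8.
conn.execute(
    "INSERT INTO product VALUES (CAST(? AS TEXT))",
    (b"i\xe9 G\xfcr",),
)

# With the default text_factory (str), reading the row fails.
try:
    conn.execute("SELECT maker FROM product").fetchall()
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Stopgap: decode leniently so invalid bytes become U+FFFD instead of an error.
conn.text_factory = lambda raw: raw.decode("utf-8", errors="replace")
makers = [row[0] for row in conn.execute("SELECT maker FROM product")]
```

The lenient factory only hides the symptom, since the replacement characters still reach the page; the underlying bytes remain wrong until the data is re-encoded.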

In general, you want to make sure that you're storing UTF-8 in your database, and that internally you're using unicode objects (not str objects) everywhere in your code.

Some other reading that may be helpful:

Carl Meyer