I'm using the following model to store info about pages:

class Page(models.Model):
    title = models.TextField(blank = False, null = False)

New data saves correctly, and I'm storing Unicode data there (lots of non-ASCII titles). But when I run this query:

page = Page.objects.get(id=1)

page.title looks odd:

u'\u042e\u0449\u0435\u043d\u043a\u043e'

What could I have done wrong? Thanks.

Update: Actually, when I print page.title it looks OK. But I need to dump it to JSON, and after code like this:

dumps({'title': page.title})

the output looks bad.

Update 2: Thanks to everyone who pointed out that this behavior is correct. But unicode-escaped strings are so long. Can I translate them to UTF-8 somehow?

+1  A: 

Might it just be that your shell can't display Unicode characters?

What happens if you do print page.title?

Paul D. Waite
The printed string is OK, but I need to dump it to JSON.
cleg
+2  A: 

That's perfectly fine. It's "Ющенко" unicode-escaped.

Max Shawabkeh
Is there any way to translate a unicode-escaped string to a normal one?
cleg
You can use `s.encode('utf8')` to get a UTF-8 encoded byte string. Whether that is what you want depends on your client code.
Max Shawabkeh
I've tried every possible encode/decode combination. I simply need to convert this representation to an ordinary Unicode string.
cleg
That IS an ordinary Unicode string. The backslash escapes are just how Python unicode literals are written.
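To make that concrete, here is a minimal sketch, assuming Python 3 (where every `str` is Unicode, unlike the Python 2 sessions in this thread). The variable names are only for illustration. It suggests that the escaped spelling and the readable spelling are one and the same string, and that `encode` produces bytes rather than a "more normal" string:

```python
# Assumption: Python 3. Both literals below spell the same six-character string.
escaped = '\u042e\u0449\u0435\u043d\u043a\u043e'
readable = 'Ющенко'

same = (escaped == readable)       # the escapes are only a way of writing the literal
as_repr = ascii(escaped)           # developer-facing form, shows the \uXXXX escapes
as_utf8 = escaped.encode('utf8')   # UTF-8 bytes, e.g. for a file or a socket

print(same)
print(as_repr)
print(len(escaped), len(as_utf8))  # characters vs. bytes
```

So there is nothing to "convert": the escapes only show up when something asks for the repr of the string.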
Max Shawabkeh
+3  A: 

You're doing nothing wrong. Have you tried printing it (or outputting it in a web page)?

In [1]: l = u'\u042e\u0449\u0435\u043d\u043a\u043e'

In [2]: print l
Ющенко
piquadrat
I don't need this string to be printed. That's my problem.
cleg
+1  A: 

There is no problem with what you have posted so far.

>>> import json
>>> print json.dumps(u'\u042e\u0449\u0435\u043d\u043a\u043e')
"\u042e\u0449\u0435\u043d\u043a\u043e"

Which is a correct JavaScript string literal. Assign that to a variable and you'll get Ющенко in a JavaScript string.

What is the actual problem? What “looks bad”?

bobince
Thanks! I thought that the strings had to be in UTF-8.
cleg
+1  A: 

That's correct behaviour: dumps encodes the JSON for you. It looks ugly now, but that's just for transmission. To see your unicode string again, you have to decode it (usually on the other end):

>>> from django.utils.simplejson import dumps, loads
>>> original = u'\u042e\u0449\u0435\u043d\u043a\u043e'
>>> print original
Ющенко
>>> encoded = dumps(original)
>>> print encoded
"\u042e\u0449\u0435\u043d\u043a\u043e"
>>> decoded = loads(encoded)
>>> print decoded
Ющенко

Generally you won't need to decode it in Python; it'll get loaded as a Unicode string in JavaScript.
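If the length of the escaped output is the real concern, one option is the `ensure_ascii=False` argument to `dumps`. The sketch below is an assumption-laden illustration, not the setup from the question: it uses Python 3 and the standard-library `json` module instead of `django.utils.simplejson`, and the `title` value stands in for `page.title`.

```python
import json

# Assumption: Python 3 and the stdlib json module; title stands in for page.title.
title = 'Ющенко'

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps({'title': title})

# ensure_ascii=False keeps the characters as-is; the result is still a str,
# so encode it explicitly before writing it to a response or a file.
compact = json.dumps({'title': title}, ensure_ascii=False)
payload = compact.encode('utf-8')

# Both forms decode to exactly the same data on the receiving side.
assert json.loads(escaped) == json.loads(payload.decode('utf-8'))

print(len(escaped), len(payload))
```

Either form is valid JSON. The UTF-8 variant is shorter, but it only works if the response declares its charset correctly, so the escaped default is the safer choice when in doubt.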

Will Hardy
Thanks. You're right.
cleg