Hi all,

Here is a fairly simple example which has been driving me nuts for a couple of days. Consider the following script:

# -*- coding: utf-8 -*-
from json import dumps as json_dumps

machaine = u"une personne émérite"
print(machaine)

output = {}
output[1] = machaine
jsonoutput = json_dumps(output)
print(jsonoutput)

The result when run from the command line:

une personne émérite
{"1": "une personne \u00e9m\u00e9rite"}

I don't understand why there is such a difference between the two strings. I have been trying all sorts of encode, decode, etc., but I can't seem to find the right way to do it. Does anybody have an idea?

Thanks in advance. Matthieu

+2  A: 

The encoding is correct. Load it back in and print it, and you'll see the correct output:

>>> import json
>>> jsoninput = json.loads(jsonoutput)
>>> print jsoninput
{u'1': u'une personne \xe9m\xe9rite'}
>>> print jsoninput['1']
une personne émérite
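
To see why nothing is lost, here is a minimal sketch written for Python 3 (the session above is Python 2; the variable names are only illustrative). In JSON, `\u00e9` is simply another spelling of é, so the escaped and unescaped documents should decode to the same value:

```python
import json

# What json.dumps() produced (escaped) and the same document written
# with the literal characters. Names are illustrative only.
escaped = '{"1": "une personne \\u00e9m\\u00e9rite"}'
literal = '{"1": "une personne émérite"}'

# Both spellings are valid JSON and decode to the same dict.
decoded_escaped = json.loads(escaped)
decoded_literal = json.loads(literal)

same = decoded_escaped == decoded_literal
print(same)
```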
Marcelo Cantos
thank you a billion times... you made my day :-)
+1  A: 

To clarify Marcelo Cantos's answer: json.dumps() returns a JSON encoding of the data, which by default is an ASCII-only string starting with the character '{' and containing backslashes, quotes, etc. Non-ASCII characters such as é are written as \uXXXX escapes. You have to decode it (e.g. with json.loads()) to get back the actual dict with the data.

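# -*- coding: utf-8 -*-
import json

output = {1: u"une personne émérite"}
print output[1]

# Serialize to JSON text; non-ASCII characters are escaped by default.
json_encoded = json.dumps(output)
print "Encoded: %s" % repr(json_encoded)

# Parse the JSON text back into a dict (keys come back as strings).
decoded = json.loads(json_encoded)
print decoded['1']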

outputs:

une personne émérite
Encoded: '{"1": "une personne \\u00e9m\\u00e9rite"}'
une personne émérite
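
If the goal is JSON text that keeps the accented characters readable, json.dumps() accepts an `ensure_ascii` parameter. A small sketch, assuming Python 3 (in Python 2 the result of `ensure_ascii=False` may be a unicode object that still needs encoding before being written out); the variable names are illustrative:

```python
import json

output = {1: "une personne émérite"}

# Default behaviour: ensure_ascii=True, non-ASCII characters become \uXXXX.
default_json = json.dumps(output)

# With ensure_ascii=False the characters are kept as they are.
readable_json = json.dumps(output, ensure_ascii=False)

# Both forms decode to the same data; only the textual form differs.
round_trip_equal = json.loads(default_json) == json.loads(readable_json)
print(round_trip_equal)
```

Either form is valid JSON, so which one to use is mostly a question of who reads the output: the escaped form is safe on any ASCII-only channel, while the unescaped form has to be written with a suitable encoding such as UTF-8.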
DS
Out of curiosity, how would JavaScript deal with the encoded characters?
Carson Myers
JS would work correctly. E.g. save this as a file and load in your browser, you should see the correct string (at least I do): <html><body><script> x = eval('({"1": "une personne \\u00e9m\\u00e9rite"})'); document.write(x[1]); </script></body></html>
DS