I'm working in Django, and using urllib2 and simplejson to parse some information from an API.

The problem is that the API returns information in the Latin-1 encoding, and just once in a while there's a character in there that causes Django to crash horribly with an encoding error. This is my code:

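import urllib2
import simplejson as json

# Build the API URL; `number` is the value being looked up.
get_person_id_url = "http://www.domain.com/api/get?" + \
    "key=KEY&num=" + urllib2.quote(number) + "&always_return=true&output=js"
request = urllib2.Request(get_person_id_url, None, {'Referer': ''})
response = urllib2.urlopen(request)
# json.load reads directly from the file-like response object.
results = json.load(response)
person_id = results["person_id"]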

I know that I can use something like this to turn Latin1 strings into UTF8:

responseString = responseString.decode('latin1').encode('utf-8')

but it seems this only works on strings, so I'm not completely sure how or where to apply it in the above code. What should I decode, and where should I trap any errors before they occur?

Unfortunately I don't remember what API call to make to return a character that will crash Django - so I can't do much testing before it goes live. I'm hoping StackOverflow can help... Thanks!

A: 

Call response.read() to get the data of the response. Then in a try/except do your latin1 decoding and json loading. In general, Django should never crash if you wrap potentially error-causing operations in exception handlers and take care of them appropriately (at least log them somewhere so you can deal with them at some point in the future).
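
A minimal sketch of that approach, assuming the body really is Latin-1-encoded JSON. The helper name `parse_latin1_json` and the sample bytes are made up for illustration, and the standard-library `json` module stands in for simplejson here:

```python
import json

def parse_latin1_json(raw_bytes):
    """Decode Latin-1 bytes and parse them as JSON; return None on failure."""
    try:
        # Every byte value maps to a character in Latin-1.
        text = raw_bytes.decode('latin-1')
        return json.loads(text)
    except ValueError as exc:
        # Malformed JSON raises ValueError (or a subclass of it).
        # Replace this print with real logging in a Django project.
        print("Could not parse API response: %s" % exc)
        return None

# Hypothetical sample payload; \xe9 is "é" in Latin-1.
sample = b'{"person_id": 42, "name": "Jos\xe9"}'
results = parse_latin1_json(sample)
```

With the code from the question, the call would look like `results = parse_latin1_json(response.read())`, followed by a check for `None` before reading `results["person_id"]`.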

rlotun
Thanks. Can I do something like "response_string = response.read() (do the decoding and encoding) results=json.load(response_string)", or will I need to turn response_string back into an object that simplejson can handle? That was partly my question - how to get it back into simplejson, which has very limited documentation.
AP257
`json.loads` simply takes a string (in JSON format). So, for example, if your Python string was `s = '{"a": 1}'` then you could do `d = json.loads(s)` and get back a Python dict. Likewise, `s = json.dumps(d)` would produce a JSON-encoded string. I (and I suspect most people) only use the `dumps` and `loads` methods.
rlotun
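
To illustrate the difference discussed in the comments, here is a short sketch using the standard-library `json` module (simplejson offers the same four functions). `json.loads` parses a string, while `json.load` reads from a file-like object such as the urllib2 response, so a string that has already been read goes to `loads`, not `load`. The `io.StringIO` object is only a stand-in for a real response:

```python
import io
import json

s = '{"a": 1}'
d = json.loads(s)    # JSON text -> Python dict
s2 = json.dumps(d)   # Python dict -> JSON text

# json.load expects an object with a .read() method instead of a string.
fake_response = io.StringIO(u'{"a": 1}')
d_from_file = json.load(fake_response)
```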