I'm working in Django and using urllib2 and simplejson to fetch and parse some information from an API.
The problem is that the API returns its data in the Latin-1 encoding, and every once in a while a response contains a character that makes Django crash with an encoding error. This is my code:
import urllib2
import simplejson as json

# `number` is set earlier in the view
get_person_id_url = ("http://www.domain.com/api/get?"
                     "key=KEY&num=" + urllib2.quote(number) +
                     "&always_return=true&output=js")
request = urllib2.Request(get_person_id_url, None, {'Referer': ''})
response = urllib2.urlopen(request)
results = json.load(response)
person_id = results["person_id"]
I know that I can use something like this to turn Latin-1 strings into UTF-8:
responseString = responseString.decode('latin1').encode('utf-8')
but it seems this only works on strings, and json.load() reads straight from the response object, so I'm not sure how or where to apply it in the code above. What should I decode, and where should I trap any errors before they cause a crash?
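To make the question concrete, this is roughly where I imagine the decode would go: read the raw body, decode it from Latin-1 to unicode, and only then hand it to the JSON parser. It's only a sketch. `parse_latin1_json` is a name I made up, the sample payload is invented, it uses the standard `json` module instead of simplejson, and it assumes the API really does always send Latin-1.

```python
import json

def parse_latin1_json(raw_body):
    """Decode a Latin-1 byte string and parse it as JSON (hypothetical helper)."""
    # Latin-1 maps every byte 0x00-0xFF to a code point, so this decode
    # cannot fail; malformed JSON will still raise ValueError in loads().
    text = raw_body.decode('latin-1')
    return json.loads(text)

# Invented sample payload: "Rene" with an e-acute sent as the single
# Latin-1 byte 0xE9.
sample = b'{"person_id": 42, "name": "Ren\xe9"}'
results = parse_latin1_json(sample)
person_id = results["person_id"]
```

If that's the right idea, I'd presumably replace `json.load(response)` with `parse_latin1_json(response.read())`.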
Unfortunately I don't remember which API call returns a character that will crash Django, so I can't do much testing before this goes live. I'm hoping StackOverflow can help... Thanks!
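One thing I could probably do in the meantime is fake a response locally rather than hunt for the real one. The sketch below assumes the crash comes from a Latin-1 byte that isn't valid UTF-8. The payload and the `try_parse` helper are both made up for illustration, and I haven't confirmed this is the same failure Django hits.

```python
import json

# Invented payload: "Jose" with an e-acute sent as the Latin-1 byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
fake_body = b'{"person_id": 7, "name": "Jos\xe9"}'

def try_parse(raw_body, encoding):
    """Return (parsed, error); exactly one of the two is None."""
    try:
        return json.loads(raw_body.decode(encoding)), None
    except (UnicodeDecodeError, ValueError) as exc:
        # UnicodeDecodeError: the bytes don't match the encoding.
        # ValueError: the text decoded fine but isn't valid JSON.
        return None, exc

# Treating the bytes as UTF-8 fails; treating them as Latin-1 works.
utf8_result, utf8_error = try_parse(fake_body, 'utf-8')
latin1_result, latin1_error = try_parse(fake_body, 'latin-1')
```

Here `utf8_error` ends up holding a `UnicodeDecodeError` while `latin1_result` holds the parsed dict, so wrapping the decode and parse step in a `try`/`except` like this looks like one place to trap the error.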