Hi,

I am having issues trying to convert a UTF-8 string to unicode. I get this error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 73-75: ordinal not in range(128)

I tried wrapping this in a try/except block, but then Google gave me a one-line system administrator error. Can someone suggest how to catch this error and continue?

Cheers, John.

-- FULL ERROR --

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 501, in __call__
    handler.get(*groups)
  File "/Users/johnb/Sites/hurl/hurl.py", line 153, in get
    self.redirect(url.long_url)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 371, in redirect
    self.response.headers['Location'] = str(absolute_url)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 73-75: ordinal not in range(128)
A: 

Try this:

self.response.headers['Location'] = absolute_url.decode("utf-8")
or
self.response.headers['Location'] = unicode(absolute_url, "utf-8")
zdmytriv
Sorry, that didn't work. My current code is self.redirect(url.long_url). Because I am calling self.redirect, the string is getting encoded, which causes the error because in this case the URL actually has an "å" in it. If this error occurs, I write the URL to the page and use a META-REFRESH tag to make the browser do the redirect after a few seconds.
John Ballinger
@zdmytriv: unicode(absolute_url)? Shouldn't UTF-8 get a mention somewhere?
John Machin
Fixed, should work now.
zdmytriv
A: 

Please edit that mess so that it's legible. Hint: use the "code block" (101010 thingy button).

You say that you are "trying to convert an UTF-8 string to unicode" but str(absolute_url) is a strange way of going about it. Are you sure that absolute_url is UTF-8? Try

print type(absolute_url)
print repr(absolute_url)

If it is UTF-8, you need absolute_url.decode('utf8')

John Machin
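As a rough illustration of the distinction above (not part of the original answer): decoding turns UTF-8 bytes into text, encoding turns text into bytes, and it is the implicit encode to ASCII that fails on a character like "å". The example URL and the helper name ascii_encode_fails are made up for illustration.

```python
# Hypothetical example URL containing "å" (U+00E5); not the asker's real data.
url_text = u"http://example.com/\xe5"   # text: unicode in Python 2, str in Python 3
url_bytes = url_text.encode("utf-8")    # text -> UTF-8 bytes

# Decoding goes the other way: UTF-8 bytes -> text.
round_trip = url_bytes.decode("utf-8")


def ascii_encode_fails(text):
    """Return True if the text cannot be represented in ASCII.

    Hypothetical helper: it mirrors what str(text) attempts on a
    unicode object in Python 2, which is the call in the traceback.
    """
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False
```

If ascii_encode_fails(absolute_url) is true, the value is already unicode text, so calling .decode() on it will not help; it needs to be encoded instead.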
+3  A: 

The Location header you are trying to set needs to be a URL, and a URL needs to be ASCII. Since your URL is not an ASCII string, you get the error. Just catching the error won't help, since the Location header won't work with an invalid URL.

When you create absolute_url you need to make sure it is encoded properly, ideally by using urllib.quote and the string's encode() method. You can try this:

self.response.headers['Location'] = urllib.quote(absolute_url.encode('utf-8'))
sth
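One caveat with quoting a whole absolute URL: with its default safe="/", quote() also escapes the ":" in "http://", which would mangle the scheme. A minimal sketch of one way around that, assuming the URL is otherwise well formed; the helper name to_ascii_url and the exact set of safe characters are assumptions, not something from the answers here.

```python
try:
    from urllib import quote          # Python 2
except ImportError:
    from urllib.parse import quote    # Python 3


def to_ascii_url(url):
    """Percent-encode non-ASCII characters, leaving URL delimiters alone.

    Hypothetical helper. The safe set is an assumption: it keeps the
    scheme, path, and query delimiters, and keeps "%" so that escapes
    already present in the URL are not encoded a second time.
    """
    if not isinstance(url, bytes):
        # Text input: encode to UTF-8 bytes before quoting.
        url = url.encode("utf-8")
    return quote(url, safe="/:?=&#%")
```

For example, to_ascii_url(u"http://example.com/\xe5") gives "http://example.com/%C3%A5", which is plain ASCII and can go into the Location header.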
+2  A: 

The correct solution is to do the following:

self.response.headers['Location'] = urllib.quote(absolute_url.encode("utf-8"))
Alex Martelli
self.redirect(absolute_url.encode('utf-8'))
John Ballinger
The above comment is the correct code. Apologies for not quite asking the question correctly; I am not 100% sure exactly what type of data I have, hence the full error output. Thanks for your help, Alex. This should fix the URL error for www.hurl.ws
John Ballinger
@John, well if you wanna redirect then you wanna redirect, I was just showing how to properly encode-and-quote a generic Unicode URL;-). As a curiosity, why did you accept this (thanks!) but not upvote it? That's peculiar by SO etiquette...!-)
Alex Martelli