views:

91

answers:

3

I'm using pylons, and some of my urls contains non-English characters, such as:

http://localhost:5000/article/111/文章标题

At most cases, it won't be a problem, but in my login module, after a user has logging out, I try to get the referer from the request.headers, and redirect to that url.

if user_logout:
    referer = request.headers.get('referer', '/')
    redirect(referer)

Unforunately, if the url contains non-English characters, and with a brower of IE, it will report such an error (Firefox is OK):

  WebError Traceback:
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xd5 in position 140: ordinal not in range(128) 
View as:   Interactive (full)  |  Text (full)  |  XML (full) clear this 
clear this 
URL: http://localhost:5000/users/logout

Module weberror.evalexception:431 in respond          view

There is a way to fix it(but no good), use urllib.quote() to convert the url before redirecting.

referer = quote_path(url) # only quote the path of the url
redirect(referer)

This is not a good solution, because it only works if the brower is IE, and very boring. Is there any good solution?

A: 

Redirect works by raising an exception. This is caught and converted into an HTTP response How about specifying the charset for your response?

response.charset = 'utf8'

pyfunc
The bad news is: I have set `response.charset='utf8'`, even `request.charset='utf8'`
Freewind
That means the url itself is not getting encoded correctly.The fix you have provided can be extended.referer = quote_plus(url.encode('utf8'))This should enode the non-characters and then quoting the url should work.
pyfunc
+1  A: 

Try checking the RFC for non-ascii URLs. They are converted to an ascii equivalent if I remember correctly. You could then redirect to that.

Edit: According to @ssokolov (see comments below):

The specific terms to look up are IDN (Internationalized Domain Names) and Punycode

Daren Thomas
This does this conversion:referer = quote_path(url)
pyfunc
The specific terms to look up are IDN (Internationalized Domain Names) and Punycode.
ssokolow
A: 

At last, I still not find a good solution, and use this code:

referer = urllib.quote(referer, '.:/?=;-%#')

It seems work well now, but I don't feel safe.

Freewind
This is the right way of doing it, you should encode every non ASCII character in URL. It's in HTTP specification.
giolekva