I have a text-box which allows users to enter a word.

The user enters: über

In the backend, I get the word like this:

def form_process(request):
    word = request.GET.get('the_word')
    word = word.encode('utf-8')
    #word = word.decode('utf-8')
    print word

For some reason, I cannot decode or encode this!! It gives me the error:

 UnicodeEncodeError
 ('ascii', u'\ufffd', 0, 1, 'ordinal not in range(128)')

Edit: When I do "repr(word)", this is what I get:

u'\ufffd'
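
The error tuple above can be reproduced in isolation. This is a minimal sketch, assuming Python 3 (the question's code is Python 2). U+FFFD is the Unicode replacement character, which suggests the original "ü" was already lost before the view received the string, so no later encode or decode call can recover it.

```python
# U+FFFD is the Unicode REPLACEMENT CHARACTER; it has no ASCII representation.
word = "\ufffd"

try:
    word.encode("ascii")
    error_args = None
except UnicodeEncodeError as exc:
    # args are (encoding, object, start, end, reason)
    error_args = exc.args

print(error_args)

# Encoding to UTF-8 succeeds, but it only yields the bytes of the
# replacement character, not the bytes of the original "ü".
utf8_bytes = word.encode("utf-8")
print(utf8_bytes)
```

The reported error therefore looks like a symptom of the encoding step, while the data loss probably happened earlier in the request.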
+1  A: 

Did you remember to put:

accept-charset="utf-8"

in the form tag?

EDIT: Is DEFAULT_CHARSET in settings.py set to 'utf-8'?

Gabriel Ross
I added this, but it still has the same issue.
TIMEX
Don't use `accept-charset`. IE doesn't support it properly. You will continue to get strings encoded in the page's specified/guessed encoding; only characters that can't be encoded in that charset end up using UTF-8. And IE doesn't tell you which encoding it has used. If you want reliable UTF-8 form submissions you must specify the charset of the page containing the form to be UTF-8, using a Content-Type parameter or meta tag equivalent.
bobince
A: 

Previous post on this subject

>>> "über".decode('cp1252').encode('utf-8')
'\xc3\xbcber'
sberry2A
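
The one-liner above is Python 2 and relies on the string literal already being cp1252 bytes. A rough Python 3 equivalent, assuming the client really sent Windows-1252 bytes (the `raw` value here is a hypothetical input, not taken from the question):

```python
# Bytes as a Windows-1252 (cp1252) client would send them: 0xFC is "ü".
raw = b"\xfcber"

text = raw.decode("cp1252")   # bytes -> text
utf8 = text.encode("utf-8")   # text -> UTF-8 bytes

print(text)
print(utf8)
```

This only helps while the original bytes are still available. Once they have been decoded to U+FFFD, the information is gone.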
+1  A: 

Solved!

I was calling escape(word) in the JavaScript before passing it to the server.

TIMEX
That would certainly mess things up. For constructing query strings from text you need `encodeURIComponent`. `escape` should never be used.
bobince
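
To illustrate why `escape` breaks this: JavaScript's `escape("über")` yields `%FCber` (a single Latin-1 style byte for "ü"), whereas `encodeURIComponent("über")` yields `%C3%BCber` (the UTF-8 bytes). The sketch below uses Python 3's `urllib.parse` to stand in for a server that decodes query strings as UTF-8 and replaces invalid bytes. That decoding policy is an assumption for illustration, though it matches the `u'\ufffd'` seen in the question.

```python
from urllib.parse import quote, unquote

# What JavaScript's escape("über") produces: one Latin-1 style byte for "ü".
escaped = "%FCber"
# Equivalent of encodeURIComponent("über"): UTF-8 bytes, percent-encoded.
component = quote("über", safe="")

# Simulated server side: decode as UTF-8, replacing invalid byte sequences.
from_escape = unquote(escaped, encoding="utf-8", errors="replace")
from_component = unquote(component, encoding="utf-8", errors="replace")

print(component)
print(repr(from_escape))
print(repr(from_component))
```

The lone byte 0xFC is not valid UTF-8, so it becomes U+FFFD, while the `encodeURIComponent` form round-trips cleanly.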
A: 

Is there any reason to use print word? If not, it should work without those lines.

def form_process(request):
    word = request.GET.get('the_word')
S.Mark