I have this:

a = {'album': u'Metamorphine', 'group': 'monoku', 'name': u'Son Of Venus (Danny\xb4s Song)', 'artist': u'Leandra', 'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651', 'track_number': 8, 'year': '2008', 'genre': 'Darkwave', 'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3', 'user_email': '[email protected]', 'size': 6624104}
data = urllib.urlencode(a)

And that raises an exception:

Traceback (most recent call last):
  File "playkud.py", line 44, in <module>
    main()
  File "playkud.py", line 29, in main
    craw(args, options.user_email, options.group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 76, in craw
    index(root, file, data, user_email, group)
  File "/home/diegueus9/workspace/playku/src/client/playkud/crawler/crawler.py", line 58, in index
    done = add_song(data[mp3file])
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 32, in add_song
    return make_request(URL+'add_song/', data)
  File "/home/diegueus9/workspace/playku/src/client/playkud/service.py", line 14, in make_request
    data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))
  File "/usr/lib/python2.5/urllib.py", line 1250, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

And with IPython (Python 2.5):

In [7]: urllib.urlencode(a)
UnicodeEncodeError                        Traceback (most recent call last)

/home/diegueus9/ in ()

/usr/lib/python2.5/urllib.pyc in urlencode(query, doseq)
   1248         for k, v in query:
   1249             k = quote_plus(str(k))
-> 1250             v = quote_plus(str(v))
   1251             l.append(k + '=' + v)
   1252     else:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb4' in position 19: ordinal not in range(128)

How can I fix it?

+2  A: 

Hello. The problem is that you are converting a unicode string to a byte string (str), but it contains characters that the default ASCII codec cannot represent. I would advise you to find the values that are not plain ASCII and encode them explicitly, as follows:

For example, where v is a unicode string, change:

quote_plus(str(v))

to

quote_plus(str(v.encode("utf-8")))

That should help.
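
To make the failure and the fix concrete, here is a minimal sketch. It assumes that the implicit str(v) call behaves like an explicit encode with the ASCII codec, which is what Python 2 does for unicode objects. The try/except import is only there so the same snippet also runs on Python 3, and the variable names are illustrative.

```python
# Sketch only: import quote_plus from wherever the running Python keeps it.
try:
    from urllib import quote_plus        # Python 2
except ImportError:
    from urllib.parse import quote_plus  # Python 3

name = u'Son Of Venus (Danny\xb4s Song)'

# In Python 2, str(name) implicitly encodes with the ASCII codec.
# Doing that encoding explicitly reproduces the error from the traceback.
try:
    name.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 first gives quote_plus a byte string it can escape.
quoted = quote_plus(name.encode('utf-8'))

print(ascii_ok)  # False
print(quoted)    # Son+Of+Venus+%28Danny%C2%B4s+Song%29
```

The character u'\xb4' becomes the two UTF-8 bytes C2 B4, which is why it shows up as %C2%B4 in the quoted result.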


If you do not have to use Python 2.x, you could switch to Python 3.x, where all strings are Unicode by default. You would have to convert some of your code for that, though (you could automate this partly or fully with 2to3).
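
For comparison, here is a rough sketch of the same call on Python 3, where urlencode lives in urllib.parse and encodes str values as UTF-8 by default. The shortened dictionary is only an illustration, and the output order assumes Python 3.7 or later, where dicts keep insertion order.

```python
# Python 3 only: urlencode is found in urllib.parse here.
from urllib.parse import urlencode

mp3_data = {
    'name': 'Son Of Venus (Danny\xb4s Song)',
    'track_number': 8,
}

# str values are percent-encoded as UTF-8 by default; ints go through str().
query = urlencode(mp3_data)
print(query)  # name=Son+Of+Venus+%28Danny%C2%B4s+Song%29&track_number=8
```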

Joschua
I thought of that, but it's a little nasty because that code is in the core of Python :/
diegueus9
Hmm, I don't think it's urllib's fault. Maybe there's a string somewhere that can't be encoded to ASCII. Can you search for that, or provide more code?
Joschua
If you look at the traceback, you can see that the exception is raised in /usr/lib/python2.5/urllib.pyc
diegueus9
By the way, I can't use Python 3 :(
diegueus9
+3  A: 

The urlencode function expects data in str format and doesn't deal well with Unicode data, since it doesn't provide a way to specify an encoding. Try this instead:

import urllib

mp3_data = {'album': u'Metamorphine',
     'group': 'monoku',
     'name': u'Son Of Venus (Danny\xb4s Song)',
     'artist': u'Leandra',
     'checksum': '2836e33d42baf947e8c8adef48921f2f76fcb37eea9c50b0b59d7651',
     'track_number': 8,
     'year': '2008',
     'genre': 'Darkwave',
     'path': u'/media/data/musik/Leandra/2008. Metamorphine/08. Son Of Venus (Danny\xb4s Song).mp3',
     'user_email': '[email protected]',
     'size': 6624104}

str_mp3_data = {}
for k, v in mp3_data.iteritems():
    str_mp3_data[k] = unicode(v).encode('utf-8')
data = urllib.urlencode(str_mp3_data)

What I did was ensure that all data is encoded into str using UTF-8 before passing the dictionary into the urlencode function.
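
The same idea can be wrapped in a small reusable helper. The sketch below is a variation, not the exact code above: to_utf8 and utf8_urlencode are hypothetical names, byte strings are passed through unchanged on the assumption that they are already UTF-8 or ASCII, and the try/except import is only there so the snippet also runs on Python 3.

```python
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

TEXT_TYPE = type(u'')                   # unicode on Python 2, str on Python 3
BYTES_TYPE = type(u''.encode('utf-8'))  # str on Python 2, bytes on Python 3


def to_utf8(value):
    """Return value as a UTF-8 byte string (hypothetical helper)."""
    if isinstance(value, BYTES_TYPE):
        return value  # assumed to be UTF-8 (or ASCII) already
    if not isinstance(value, TEXT_TYPE):
        value = TEXT_TYPE(value)  # e.g. the int 8 becomes u'8'
    return value.encode('utf-8')


def utf8_urlencode(params):
    """URL-encode a dict after converting keys and values to UTF-8."""
    # Sorting by key only makes the output order predictable.
    pairs = [(to_utf8(k), to_utf8(v)) for k, v in sorted(params.items())]
    return urlencode(pairs)


query = utf8_urlencode({
    'name': u'Son Of Venus (Danny\xb4s Song)',
    'track_number': 8,
    'year': '2008',
})
print(query)
# name=Son+Of+Venus+%28Danny%C2%B4s+Song%29&track_number=8&year=2008
```

Passing a list of pairs instead of a dict is a choice made here for stable ordering; urlencode accepts either form.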

Walter Mundt
+1  A: 

The problem is that some of the values in your mp3_data dict are unicode strings containing characters that can't be represented in ASCII, the default codec that applies when urlencode() calls str() on each value (other values are plain ASCII strings, and still others are integers). You can fix this by encoding those values before passing them to urlencode(). On line 14 of /home/diegueus9/workspace/playku/src/client/playkud/service.py, in make_request(), try changing this:

data = urllib.urlencode(dict([k.encode('utf-8'),v] for k,v in mp3_data.items()))

to this:

data = urllib.urlencode(dict([k.encode('utf-8'),unicode(v).encode('utf-8')] for k,v in mp3_data.items()))
Forest
Not all the values are strings, so this doesn't work.
diegueus9
They don't have to be strings for this to work. Note my use of the unicode() call to convert any integers or plain strings to unicode before encoding as UTF-8. If it really doesn't work, I'd be interested in seeing what the failure looks like. (Are you sure you replied to the right answer?)
Forest