ansaurus

Question

Answer 1

+1 A:

First make sure you're not confused about encodings, etc. Read, for example, this.

Then notice that the main problem isn't with the base64 encoding, but with the fact that you're trying to put a byte string (a normal string in Python 2.x) inside a Unicode string. I believe you can fix this by removing the "u" from the last string in your example code.
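A minimal sketch of that fix, keeping everything as byte strings. It is written in Python 3.5+ syntax, which is an assumption since this thread is about Python 2. The variable `raw_name` is hypothetical, and its value is recovered from the Base64 output quoted in another answer here:

```python
import base64

# Assumed input: the name as UTF-8 bytes ("Desiré")
raw_name = b"Desir\xc3\xa9"

# Format string and arguments are all bytes, so nothing is implicitly
# decoded or encoded along the way (bytes %-formatting needs Python 3.5+)
message = b"Hi, %s! Your code is %s" % (raw_name, raw_name)

# Base64 works on bytes and returns ASCII-only bytes
encoded = base64.b64encode(message)
print(encoded)
```

Mixing the two types is what triggers the error: the byte values above 127 cannot be converted implicitly.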

Amnon 2009-12-15 15:12:12

Thanks for the quick reply! That was a stupid mistake on my part. I changed that, and now the API says I should have used only ISO-8859-1 characters; I updated the question accordingly.

Agos 2009-12-15 15:15:30

You're welcome. But now you made all the previous answers irrelevant to the question.

Amnon 2009-12-15 15:20:09

Yes, I'm sorry about that, the answers were just too fast! +1 for the useful link.

Agos 2009-12-15 22:54:43

Answer 2

+1 A:

base64.b64encode("Hi, %s! Your code is %s" % (data[0].decode('utf8').encode('latin1'), data[0]))
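One thing to watch in the line above: only the first `data[0]` is converted to Latin-1, while the second stays UTF-8, so the message ends up with mixed encodings. A hedged alternative converts the whole message once. It assumes the API wants everything in ISO-8859-1, uses Python 3 syntax rather than the thread's Python 2, and `raw_name` is a hypothetical stand-in for `data[0]`:

```python
import base64

# Assumed input: the name as UTF-8 bytes ("Desiré")
raw_name = b"Desir\xc3\xa9"

# Decode to text, build the message as text
name = raw_name.decode("utf-8")
message = u"Hi, %s! Your code is %s" % (name, name)

# Encode the complete message to Latin-1 once, then Base64-encode it.
# encode() raises UnicodeEncodeError if a character is outside Latin-1.
latin1_bytes = message.encode("latin-1")
encoded = base64.b64encode(latin1_bytes)
print(encoded)
```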

ʞɔıu 2009-12-15 15:13:53

This seems to work (also: duh for me). Another sub-question: it seems that accented characters should also be combined (instead of two entities like the example above). The accepted accented characters (ISO-8859-1 DEC) are 232, 233, 236, 242, 224. How can I convert accented characters in my string to the corresponding (accepted) values? (Also: should I post this as a new question?)

Agos 2009-12-15 15:33:55

I believe that the two escaped values refer to two bytes that comprise a single character in utf8 (DEC 233). Recall that utf8 can use 1-4 bytes to represent a character (in contrast to older encodings like latin1 in which 1 character == 1 byte).
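A quick way to see this for yourself (a small sketch, not taken from the question's code) is to encode the single character é both ways and look at the byte values:

```python
ch = u"\u00e9"  # é, Unicode code point 233

utf8_bytes = ch.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = ch.encode("latin-1")  # one byte in Latin-1

print(list(bytearray(utf8_bytes)))    # [195, 169]
print(list(bytearray(latin1_bytes)))  # [233]
```

The two escaped values are the UTF-8 pair 195, 169. Decoding them as UTF-8 and re-encoding as Latin-1 gives the single byte 233.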

ʞɔıu 2009-12-15 16:05:25

You're right, in fact it gets escaped correctly to DEC 233. Why the XMLRPC still refuses it (since the manual says these codes are ok) is beyond me, and most importantly beyond the scope of this SO question.

Agos 2009-12-15 22:53:48

Answer 3

A:

This seems to work:

...

data = data2
base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0]))
# => 'SGksIERlc2lyw6khIFlvdXIgY29kZSBpcyBEZXNpcsOp'

# I can't test the XMLRPC parts, so this is just a hint ..
for_the_wire = base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0]))
# Base64 output is plain ASCII, so this encode leaves the bytes unchanged
latin_1_encoded = for_the_wire.encode('latin-1')

# send latin_1_encoded over the wire ..

Some python (2.X) unicode readings:

The MYYN 2009-12-15 15:15:16


Encoding utf-8 to base64 with accents
