Can Unicode characters be encoded and decoded with Base64?
I tried to encode the following string: الله, but when I decoded it, all I got was '????'.
views: 1009
answers: 4
+2
A:
Of course they can. It depends on how your language or Base64 routine handles Unicode input. For example, Python's base64 routines expect a byte string, that is, text that has already been encoded (Base64 encodes binary to text, not Unicode code points to text):
Python 2.5.1 (r251:54863, Jul 31 2008, 22:53:39)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 'ûñö'
>>> import base64
>>> base64.b64encode(a)
'w7vDscO2'
>>> base64.b64decode('w7vDscO2')
'\xc3\xbb\xc3\xb1\xc3\xb6'
>>> print '\xc3\xbb\xc3\xb1\xc3\xb6'
ûñö
>>>
>>> u'üñô'
u'\xfc\xf1\xf4'
>>> base64.b64encode(u'\xfc\xf1\xf4')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/base64.py", line 53, in b64encode
    encoded = binascii.b2a_base64(s)[:-1]
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> base64.b64encode(u'\xfc\xf1\xf4'.encode('utf-8'))
'w7zDscO0'
>>> base64.b64decode('w7zDscO0')
'\xc3\xbc\xc3\xb1\xc3\xb4'
>>> print base64.b64decode('w7zDscO0')
üñô
>>> a = 'الله'
>>> a
'\xd8\xa7\xd9\x84\xd9\x84\xd9\x87'
>>> base64.b64encode(a)
'2KfZhNmE2Yc='
>>> b = base64.b64encode(a)
>>> print base64.b64decode(b)
الله
Vinko Vrsalovic
2008-11-20 12:36:20
+1 for examples
2010-01-11 05:06:04
I'd just note that the returned string is not a unicode object; it should be decoded as follows: c = base64.b64decode(b).decode('utf-8')
DanJ
2010-07-06 06:59:38
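For readers on Python 3, where str and bytes are separate types, the same round trip might look like the sketch below. The helper names encode_text and decode_text are hypothetical, and UTF-8 is an assumed default rather than something the thread mandates.

```python
import base64


def encode_text(text: str, encoding: str = "utf-8") -> str:
    """Encode text to bytes, then Base64-encode those bytes."""
    raw = text.encode(encoding)
    # b64encode returns bytes that contain only ASCII characters.
    return base64.b64encode(raw).decode("ascii")


def decode_text(b64_text: str, encoding: str = "utf-8") -> str:
    """Base64-decode to bytes, then decode those bytes back to text."""
    raw = base64.b64decode(b64_text)
    return raw.decode(encoding)


original = "الله"
encoded = encode_text(original)
decoded = decode_text(encoded)
print(encoded)              # 2KfZhNmE2Yc=
print(decoded == original)  # True
```

The final .decode(encoding) call is the step the comment above points out: without it, the result is a byte string rather than text.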
+4
A:
Base64 converts binary to text. If you want to convert text to a base64 format, you'll need to convert the text to binary using some appropriate encoding (e.g. UTF-8, UTF-16) first.
Jon Skeet
2008-11-20 12:40:16
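To illustrate why the choice of encoding matters, here is a small Python 3 sketch (the variable names are illustrative only). The same text gives different Base64 output under UTF-8 and UTF-16, and the decoder has to use the encoding the encoder used.

```python
import base64

text = "ûñö"

# The same text yields different bytes, and so different Base64, per encoding.
utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
utf16_b64 = base64.b64encode(text.encode("utf-16-le")).decode("ascii")
print(utf8_b64)
print(utf16_b64)

# Decoding the UTF-16 bytes as UTF-8 fails, because 0xFB is not valid UTF-8.
raw = base64.b64decode(utf16_b64)
try:
    mismatched = raw.decode("utf-8")
except UnicodeDecodeError:
    mismatched = None
print(mismatched)

# Decoding with the matching encoding recovers the text.
print(raw.decode("utf-16-le") == text)
```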
+1
A:
You didn't specify which language(s) you're using, but try converting the string to a byte array (however that's done in your language of choice) and then base64 encoding that byte array.
joel.neely
2008-11-20 13:04:24
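One plausible cause of the '????' in the question (an assumption, since the asker's language and code are not shown) is that the string was converted to bytes with an encoding that cannot represent Arabic characters, so each character was replaced before Base64 was ever involved. A Python 3 sketch of that failure mode next to a lossless conversion:

```python
import base64

text = "الله"

# Lossy: ASCII cannot represent these characters, so each one becomes "?".
lossy_bytes = text.encode("ascii", errors="replace")
lossy_result = base64.b64decode(base64.b64encode(lossy_bytes)).decode("ascii")
print(lossy_result)  # ????

# Lossless: UTF-8 can represent every Unicode character.
utf8_bytes = text.encode("utf-8")
good_result = base64.b64decode(base64.b64encode(utf8_bytes)).decode("utf-8")
print(good_result == text)  # True
```

In the lossy case Base64 round-trips the bytes faithfully; the information was already gone when the text became bytes.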
A:
In .NET you can try this (encode):
// Encoding.Unicode is UTF-16 (little-endian) in .NET.
byte[] encbuf = System.Text.Encoding.Unicode.GetBytes(input);
string encoded = Convert.ToBase64String(encbuf);
...and to decode:
// Decode the Base64 string produced above, using the same encoding.
byte[] decbuf = Convert.FromBase64String(encoded);
string decoded = System.Text.Encoding.Unicode.GetString(decbuf);
Scott Whitlock
2010-03-28 19:48:50
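If Base64 produced by the .NET snippet above has to be read from another language, the other side must also use UTF-16 little-endian, because that is what Encoding.Unicode means in .NET. A Python 3 sketch of the matching conversion follows; the function names are hypothetical.

```python
import base64

# .NET's Encoding.Unicode is UTF-16 little-endian; GetBytes emits no BOM.
DOTNET_UNICODE = "utf-16-le"


def to_dotnet_base64(text: str) -> str:
    """Produce Base64 that Encoding.Unicode.GetString can decode."""
    return base64.b64encode(text.encode(DOTNET_UNICODE)).decode("ascii")


def from_dotnet_base64(b64_text: str) -> str:
    """Read Base64 produced from Encoding.Unicode.GetBytes."""
    return base64.b64decode(b64_text).decode(DOTNET_UNICODE)


original = "الله"
encoded = to_dotnet_base64(original)
print(encoded)
print(from_dotnet_base64(encoded) == original)  # True
```

Python's plain "utf-16" codec prepends a byte order mark when encoding, which is why the sketch names "utf-16-le" explicitly.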