views:

110

answers:

2

I have some binary data produced as base-256 bytestrings in Python (2.x). I need to read these into JavaScript, preserving the ordinal value of each byte (char) in the string. If you'll allow me to mix languages, I want to encode a string s in Python such that ord(s[i]) == s.charCodeAt(i) after I've read it back into JavaScript.

The cleanest way to do this seems to be to serialize my Python strings to JSON. However, json.dump doesn't like my bytestrings, despite fiddling with the ensure_ascii and encoding parameters. Is there a way to encode bytestrings to Unicode strings that preserves ordinal character values? Otherwise I think I need to encode the characters above the ASCII range into JSON-style \u1234 escapes; but a codec like this does not seem to be among Python's codecs.

Is there an easy way to serialize Python bytestrings to JSON, preserving char values, or do I need to write my own encoder?

A: 

Could you just use Base64? (Python base64 module, Javascript has several implementations, one of which is here.)

No reason to use escaped ASCII or UTF-8 unless your data is almost all text.

Mike DeSimone
+2  A: 

Is there a way to encode bytestrings to Unicode strings that preserves ordinal character values?

The byte -> unicode transformation is called decode, not encode. But yes, decoding with a codec such as iso-8859-1 should indeed "preserve ordinal character values" as you wish.

Alex Martelli
Well I'll be; that was pretty simple. Thanks!
Doctor J