I need to send a JSON packet across the wire with the contents of an arbitrary file. This may be a binary file (like a ZIP file), but most often it will be plain ASCII text.

I'm currently using base64 encoding, which handles all files, but it increases the size of the data by about a third, even if the file is ASCII to begin with. Is there a more efficient way I can encode the data, other than manually checking for any non-ASCII characters and then deciding whether or not to base64-encode it?

I'm currently writing this in Python, but will probably need to do the same in Java, C# and C++, so an easily portable solution would be preferable.

+2  A: 

Use quoted-printable encoding. Any language should support that. http://en.wikipedia.org/wiki/Quoted-printable

moshez
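A minimal Python sketch of this suggestion, assuming the file content travels in a JSON field named `content` (a hypothetical name, as are `qp_encode` and `qp_decode`). It uses the standard library's `binascii.b2a_qp` with `istext=False`, so that CR and LF bytes are escaped like any other non-printable byte instead of being passed through as line endings:

```python
import binascii
import json


def qp_encode(data: bytes) -> str:
    """Quoted-printable-encode arbitrary bytes into an ASCII string."""
    # istext=False: newlines are escaped too (=0D / =0A), which matters
    # when the input may be binary rather than text.
    return binascii.b2a_qp(data, istext=False).decode("ascii")


def qp_decode(text: str) -> bytes:
    """Reverse qp_encode."""
    return binascii.a2b_qp(text.encode("ascii"))


# Round trip through a JSON packet; "content" is an assumed field name.
payload = json.dumps({"content": qp_encode(b"hello, world\n")})
restored = qp_decode(json.loads(payload)["content"])
```

Printable ASCII passes through unchanged, so a plain text file costs almost nothing extra beyond the escaped line endings and the occasional soft line break.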
Interesting idea, thanks. QP definitely has less overhead for text data, but a lot more for binary data, so I'm not too sure. +1, anyway.
Evgeny
Quoted-printable is OK for text with occasional binary, but for a binary file it will really balloon the size, since every non-printable byte becomes three characters. Base64 should be more efficient in that case. I would try to avoid passing the file via JSON at all, and would instead pass a URL to it and let the software at the other end retrieve the URL as a side process after decoding the JSON structure.
Greg
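The size trade-off described here is easy to measure. The sketch below is only an illustration, with `encoded_sizes` as a made-up helper name; it compares the encoded length of the same bytes under Base64 and quoted-printable using Python's standard library:

```python
import base64
import binascii


def encoded_sizes(data: bytes) -> dict:
    """Return the raw length and the length under each encoding."""
    return {
        "raw": len(data),
        "base64": len(base64.b64encode(data)),
        # istext=False so that newlines are escaped like other binary bytes.
        "qp": len(binascii.b2a_qp(data, istext=False)),
    }


text_sizes = encoded_sizes(b"hello world")
# 128 bytes, none of them printable ASCII.
binary_sizes = encoded_sizes(bytes(range(128, 256)))
```

For the short ASCII sample, quoted-printable adds nothing while Base64 pads it out to 16 characters; for the all-high-byte sample, quoted-printable comes out more than twice the size of Base64.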
Another thing you can do (if what Greg suggests fails, for example for network topology reasons) is to use either QP or Base64, and put one character at the beginning to let the other side know which one it is. This is certainly easy enough, on both encoder and decoder...
moshez
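One way the one-character marker could look in Python, as a rough sketch. The tags `"Q"` and `"B"` and the function names are assumptions chosen for illustration, not part of any standard; the encoder simply picks whichever encoding comes out shorter:

```python
import base64
import binascii


def encode_field(data: bytes) -> str:
    """Encode bytes as a tagged ASCII string: 'Q' + QP or 'B' + Base64."""
    qp = binascii.b2a_qp(data, istext=False)
    b64 = base64.b64encode(data)
    if len(qp) <= len(b64):
        return "Q" + qp.decode("ascii")
    return "B" + b64.decode("ascii")


def decode_field(field: str) -> bytes:
    """Decode a string produced by encode_field."""
    tag, body = field[:1], field[1:]
    if tag == "Q":
        return binascii.a2b_qp(body.encode("ascii"))
    if tag == "B":
        return base64.b64decode(body)
    raise ValueError(f"unknown encoding tag: {tag!r}")
```

Because the decoder only needs to inspect the first character, the same scheme should port directly to Java, C# or C++ with whatever Base64 and quoted-printable routines are available there.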