I'd like to include a large compressed string in a json packet, but am having some difficulty.
import json, bz2

myString = "A very large string"
zString = bz2.compress(myString)
json.dumps({'compressedData': zString})
which raises
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-13: invalid data
An obvious solution is bz2'ing the entire json structure, but let's just assume I'm using a black-box API that does the json encoding itself and expects a plain dict.
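For illustration, that whole-structure approach would look something like this (a minimal sketch; the payload dict is made up):

import json, bz2

payload = {'data': 'A very large string'}
# serialize first, then compress the whole json document;
# .encode('utf-8') keeps this working on Python 3 as well
zPayload = bz2.compress(json.dumps(payload).encode('utf-8'))
# the receiver reverses it
payload = json.loads(bz2.decompress(zPayload).decode('utf-8'))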
Also, bz2 is just an example; I don't really care which algorithm is actually used, though I noticed the same behavior with zlib.
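For example (on Python 2; on Python 3 json.dumps refuses the bytes with a TypeError instead):

import json, zlib

zString = zlib.compress("A very large string")
json.dumps({'compressedData': zString})  # same UnicodeDecodeError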
I can understand why these compression libraries don't produce UTF-8-compatible output, but is there any solution that can effectively compress text while keeping the output valid UTF-8? This page seemed like a gold mine (http://unicode.org/faq/compression.html), but I couldn't find any relevant Python information.
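For reference, base64-encoding the compressed bytes does keep the result JSON-safe, but at roughly 33% size overhead, which is why I'm hoping for something better:

import json, bz2, base64

zBytes = bz2.compress("A very large string".encode('utf-8'))
# base64 output is pure ASCII, so it is always JSON-safe -- but ~33% larger
packet = json.dumps({'compressedData': base64.b64encode(zBytes).decode('ascii')})
# receiving side
zBytes = base64.b64decode(json.loads(packet)['compressedData'])
original = bz2.decompress(zBytes).decode('utf-8')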