I am porting a Python application to Android and, at some point, this application has to communicate with a Web Service, sending it compressed data.

To do that, it uses the following method:

def stuff(self, data):
    "Convert into UTF-8 and compress."
    return zlib.compress(simplejson.dumps(data))

I am using the following method to try to emulate this behavior on Android:

private String compressString(String stringToCompress)
{
    Log.i(TAG, "Compressing String " + stringToCompress);
    byte[] input = stringToCompress.getBytes(); 
    // Create the compressor with highest level of compression 
    Deflater compressor = new Deflater(); 
    //compressor.setLevel(Deflater.BEST_COMPRESSION); 
    // Give the compressor the data to compress 
    compressor.setInput(input); 
    compressor.finish(); 
    // Create an expandable byte array to hold the compressed data. 
    // You cannot use an array that's the same size as the original because 
    // there is no guarantee that the compressed data will be smaller than 
    // the uncompressed data. 
    ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length); 
    // Compress the data 
    byte[] buf = new byte[1024]; 
    while (!compressor.finished()) 
    { 
        int count = compressor.deflate(buf); 
        bos.write(buf, 0, count); 
    } 

    try { 
        bos.close(); 
    } catch (IOException e) 
    { 

    } 
    // Get the compressed data 
    byte[] compressedData = bos.toByteArray(); 

    Log.i(TAG, "Finished to compress string " + stringToCompress);

    return new String(compressedData);
}

But the HTTP response from the server is not correct and I guess it is because the result of the compression in Java is not the same as the one in Python.

I ran a little test compressing "a" both with zlib.compress and deflate.

Python, zlib.compress() -> x%9CSJT%02%00%01M%00%A6

Android, Deflater.deflate -> H%EF%BF%BDK%04%00%00b%00b

How should I compress the data in Android to obtain the same value of zlib.compress() in Python?

Any help, guidance or pointer is greatly appreciated!

A: 

Does byte[] input = stringToCompress.getBytes("utf-8"); help? In case your platform's default encoding is not UTF-8, this will force the encoding String -> bytes to use UTF-8. Also, the same goes for the last line of your code where you create a new String - you may want to explicitly specify UTF-8 as the decoding Charset.

Thomas
Thank you for your suggestion! I am going to try it and tell you how it goes. Although I think the default encoding is already UTF-8, it is always good to be cautious.
Edu Zamora
I tested that suggestion, but it did not change the outcome. Thank you anyway!
Edu Zamora
+3  A: 

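compress and deflate produce different output formats, so the results will not be compatible. Both use the same DEFLATE algorithm, but compress wraps the raw deflate stream in a two-byte zlib header and a four-byte Adler-32 checksum. As an example of the difference, here is 'a' compressed both ways via Tcl: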

% binary encode hex [zlib compress a]
789c4b040000620062
% binary encode hex [zlib deflate a]
4b0400
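
The same relationship can be checked from Python. This is only a sketch: split_zlib_stream is a hypothetical helper name, and it assumes a plain zlib stream with the usual 2-byte header and no preset dictionary.

```python
import struct
import zlib

def split_zlib_stream(stream):
    """Split a zlib-format stream into (header, raw_deflate, adler32_trailer).

    Hypothetical helper: assumes a 2-byte header and no preset dictionary.
    """
    return stream[:2], stream[2:-4], stream[-4:]

payload = b"a"
header, body, trailer = split_zlib_stream(zlib.compress(payload))

# The body alone is a raw deflate stream: a negative wbits value tells
# zlib to expect no header and no checksum.
restored = zlib.decompress(body, -zlib.MAX_WBITS)

# The trailer is the Adler-32 checksum of the uncompressed data, big-endian.
expected_trailer = struct.pack(">I", zlib.adler32(payload) & 0xFFFFFFFF)

print(header.hex(), body.hex(), trailer.hex())
```

So the compress output is the deflate output with a header in front and a checksum behind it.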

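Your Python code is indeed doing compress, and the Android code is doing deflate. However, the Android version also contains the bytes \xef\xbf\xbd. These are not a byte order mark and not part of the compressed stream: they are the UTF-8 encoding of the replacement character U+FFFD, which appears when the compressed bytes are converted to a String.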
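
The stray \xef\xbf bytes can be reproduced by treating compressed output as text. A minimal sketch in Python, assuming that decoding with replacement behaves roughly like the Java String constructor does here (an assumption, not something checked on Android):

```python
import zlib

compressed = zlib.compress(b"a")

# Decode the binary data as if it were UTF-8 text, substituting U+FFFD
# for every byte sequence that is not valid UTF-8.
as_text = compressed.decode("utf-8", errors="replace")

# Encoding the text again does not give back the original bytes.
round_tripped = as_text.encode("utf-8")

print(compressed.hex())
print(round_tripped.hex())
```

If that assumption holds, compressed data should stay a byte array all the way to the HTTP request instead of passing through a String.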

You can emit deflate data using python:

def deflate(data):
    zobj = zlib.compressobj(6,zlib.DEFLATED,-zlib.MAX_WBITS,zlib.DEF_MEM_LEVEL,0)
    zdata = zobj.compress(data)
    zdata += zobj.flush()
    return zdata
>>> deflate("a")
'K\x04\x00'
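
Going the other way, a raw deflate stream can be wrapped so that zlib.decompress accepts it. This is a sketch only: raw_deflate and wrap_as_zlib are hypothetical names, and the fixed header bytes assume a 32 KiB window with the default compression flag.

```python
import struct
import zlib

def raw_deflate(data, level=6):
    """Raw deflate stream with no header or checksum (negative wbits)."""
    zobj = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return zobj.compress(data) + zobj.flush()

def wrap_as_zlib(raw, original):
    """Add a zlib header and an Adler-32 trailer around a raw deflate stream."""
    header = b"\x78\x9c"  # assumed: 32 KiB window, default-compression flag
    trailer = struct.pack(">I", zlib.adler32(original) & 0xFFFFFFFF)
    return header + raw + trailer

wrapped = wrap_as_zlib(raw_deflate(b"a"), b"a")
print(zlib.decompress(wrapped))
```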
patthoyts
That is what I suspected, that they were in fact different compression methods. Now I can focus on finding a way to generate the same output as zlib.compress in Java. (I tested your Python code and it indeed deflates data like the Java version, but since I am porting the application to Android and cannot modify the original one, I was looking for the other way around: doing compress in Java.) Anyway, really helpful answer so far! I would vote you up if I could :P
Edu Zamora
+1 (Now that I can)
Edu Zamora