OK, so I've got an open-source Java client/server program that uses packets to communicate. I'm trying to write a Python client for it, but the contents of the packets seem to be compressed. A quick look through the source code suggested gzip as the compression scheme (it was the only compression module I could find imported in the code), but when I saved the data from one of the packets out of Wireshark and tried to do:

import gzip
f = gzip.open('compressed_file')
f.read()

It told me that this wasn't a gzip file because the header was wrong. Can someone advise me what I've done wrong here? Did I change or mess up the format when I saved it out? Do I need to strip away some of the extraneous data from the packet before I try running this block on it?

    if (zipped) {

        // XML encode the data and GZIP it.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        Writer zipOut = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(baos)));
        PacketEncoder.encodeData(packet, zipOut);
        zipOut.close();

        // Base64 encode the compressed data.
        // Please note, I couldn't get anything other than a
        // straight stream-to-stream encoding to work.
        byte[] zipData = baos.toByteArray();
        ByteArrayOutputStream base64 = new ByteArrayOutputStream(
                (4 * zipData.length + 2) / 3);
        Base64.encode(new ByteArrayInputStream(zipData), base64, false);

EDIT: Ok, sorry I have the information requested here. This was gathered using Wireshark to listen in on communication between two running copies of the original program on different computers. To get the hex stream below, I used the "Copy -> Hex (Byte Stream)" option in Wireshark.

001321cdc68ff4ce46e4f00d0800450000832a85400080061e51ac102cceac102cb004f8092a9909b32c10e81cb25018f734823e00000100000000000000521f8b08000000000000005bf39681b59c85818121a0b4884138da272bb12c512f27312f5dcf3f292b35b9c47ac2b988f902c59a394c0c0c150540758c250c5c2ea5b9b9950a2e89258900aa4c201a3f000000

I know this will contain the string "Dummy Data" in it. I believe it should also contain "Jonathanb" (the player name I used to send the message) and the integer 80 (80 is the command # for "Chat" as far as I can gather from the code).

+1  A: 

You could try using the standard library module zlib directly -- that's what gzip uses for the compress/decompress part. If the decompress function doesn't like the whole packet, you can try different values of wbits and/or slicing a few bytes off the packet's front (if you could "reverse engineer" exactly how the Java code is compressing that packet -- even just understand what wbits value it's using, or whether it's putting out any prefix before the compressed data -- that would help immensely, of course).
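The "slice bytes off the front" idea can be sketched in Python 3 as below. This is only an illustration: `find_compressed_payload` and `max_skip` are made-up names, the sample packet is synthetic, and `wbits = MAX_WBITS | 32` is used so that zlib auto-detects either a zlib or a gzip header.

```python
import zlib


def find_compressed_payload(packet, max_skip=128):
    """Return (offset, data) for the first offset that decompresses cleanly.

    Hypothetical helper: skips 0..max_skip-1 leading bytes and asks zlib to
    auto-detect a zlib (RFC 1950) or gzip (RFC 1952) header at each offset.
    """
    for offset in range(min(max_skip, len(packet))):
        decomp = zlib.decompressobj(zlib.MAX_WBITS | 32)
        try:
            data = decomp.decompress(packet[offset:])
        except zlib.error:
            continue  # not a valid stream at this offset, keep looking
        if decomp.eof:  # the stream ended properly (checksum verified)
            return offset, data
    return None


# Synthetic example: 4 junk bytes followed by a gzip stream.
comp = zlib.compressobj(wbits=zlib.MAX_WBITS | 16)  # 16 => gzip container
sample = b"\x00\x01\x02\x03" + comp.compress(b"Dummy Data") + comp.flush()
result = find_compressed_payload(sample)
print(result)
```

Requiring `decomp.eof` guards against offsets where a few bytes happen to look like a header but the stream never completes.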

The only likely "damage" you might have done to the file itself would be, on Windows, if you had written it without specifying 'wb' to use binary mode -- writing it in "text mode" on Windows would make the file unusable. Just saying...!-)

Alex Martelli
A: 

It's likely to be compliant with one of RFC 1950, 1951, or 1952.

Since the name is GZIP, I'd first check 1952. Then I'd try ZLIB (1950). Finally, DEFLATE (1951).

DotNetZip is a .NET library that allows a .NET app to read data streams that comply with any of these formats. If you had a stream that complied with one of the above, you could very quickly determine which one it was, by trying to read the stream with each of DotNetZip's streams in succession; GZipStream, ZlibStream, DeflateStream. One of them will work, and the others will not.
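The same try-each-format approach can be sketched in Python 3 with the standard zlib module, where the `wbits` argument selects the container. The `identify` function and the `FORMATS` table are illustrative names, and the three samples are generated on the spot rather than taken from the packet.

```python
import zlib

# wbits values understood by zlib for each container format.
FORMATS = [
    ("gzip (RFC 1952)", zlib.MAX_WBITS | 16),
    ("zlib (RFC 1950)", zlib.MAX_WBITS),
    ("raw DEFLATE (RFC 1951)", -zlib.MAX_WBITS),
]


def identify(data):
    """Return (format_name, decompressed) for the first format that works."""
    for name, wbits in FORMATS:
        decomp = zlib.decompressobj(wbits)
        try:
            out = decomp.decompress(data)
        except zlib.error:
            continue  # wrong container, try the next one
        if decomp.eof:
            return name, out
    return None, None


def make_sample(wbits, payload=b"Dummy Data"):
    """Compress payload in the container selected by wbits."""
    comp = zlib.compressobj(wbits=wbits)
    return comp.compress(payload) + comp.flush()


names = [identify(make_sample(wbits))[0] for _, wbits in FORMATS]
print(names)
```

Raw DEFLATE is tried last because it has no header or checksum, so it is the format most likely to accept data that is really something else.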

I don't know of a Java library that has those streams. Doesn't mean it doesn't exist. Just that I don't know of one.

DotNetZip is free and works on Windows+Mono, Linux+Mono, as well as Windows+.NET.

Cheeso
+1  A: 

It would help enormously if you divulged:

(0) What leads you to the conclusion that "the contents of the packet seem to be compressed"

(1) The URLs for the (a) source and (b) documentation of the package that is writing the packets

(2) The contents of a sample packet

(a) print repr(open('file_saved_from_wireshark', 'rb').read())

(b) just in case the long trip around via wireshark is muddying the water, insert this in your Python client:

print repr(a_sample_packet)

(3) the exact error message that you got (copy/paste)

Update after OP supplied the hex dump of a packet

This code:

import binascii, sys, cStringIO, gzip, struct, zlib
# guff is allegedly a "packet", formatted as 2 hex characters per byte
guff = "001321cdc68ff4ce46e4f00d0800450000832a85400080061e51ac102cceac102cb004f8092a9909b32c10e81cb25018f734823e00000100000000000000521f8b08000000000000005bf39681b59c85818121a0b4884138da272bb12c512f27312f5dcf3f292b35b9c47ac2b988f902c59a394c0c0c150540758c250c5c2ea5b9b9950a2e89258900aa4c201a3f000000"
guff2 = binascii.unhexlify(guff)
print "raw input: len=%d repr=%r" % (len(guff2), guff2)
# gzip spec: http://www.faqs.org/rfcs/rfc1952.html
GZIP_HDR = "\x1F\x8B\x08"
gzpos = guff2.find(GZIP_HDR)
if gzpos == -1:
    print "Can't find gzip header"
    sys.exit(1)
print gzpos, "bytes before gzipped data"
gzipped = guff2[gzpos:]
packet_crc, packet_orig_len = struct.unpack("<II", gzipped[-8:])
print "packet_crc, packet_orig_len:", hex(packet_crc), packet_orig_len
fobj = cStringIO.StringIO(gzipped)
zf = gzip.GzipFile(fileobj=fobj)
payload = zf.read()
print "payload: len=%d repr=%r" % (len(payload), payload)
print "crc32(payload):", hex(zlib.crc32(payload))

produced this output (wrapped at col 80 by Windows' "Command Prompt" terminal) when run with Python 2.6.4:

raw input: len=145 repr="\x00\x13!\xcd\xc6\x8f\xf4\xceF\xe4\xf0\r\x08\x00E\x00\x
00\x83*\x85@\x00\x80\x06\x1eQ\xac\x10,\xce\xac\x10,\xb0\x04\xf8\t*\x99\t\xb3,\x1
0\xe8\x1c\xb2P\x18\xf74\x82>\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00R\x1f\x8b\x0
8\x00\x00\x00\x00\x00\x00\x00[\xf3\x96\x81\xb5\x9c\x85\x81\x81!\xa0\xb4\x88A8\xd
a'+\xb1,Q/'1/]\xcf?)+5\xb9\xc4z\xc2\xb9\x88\xf9\x02\xc5\x9a9L\x0c\x0c\x15\x05@u\
x8c%\x0c\\.\xa5\xb9\xb9\x95\n.\x89%\x89\x00\xaaL \x1a?\x00\x00\x00"
63 bytes before gzipped data
packet_crc, packet_orig_len: 0x1a204caa 63
payload: len=63 repr='\xac\xed\x00\x05w\x04\x00\x00\x00Pur\x00\x13[Ljava.lang.Ob
ject;\x90\xceX\x9f\x10s)l\x02\x00\x00xp\x00\x00\x00\x01t\x00\nDummy Data'
crc32(payload): 0x1a204caa

Comments/questions:

  1. This packet is 145 bytes long; what happened to the idea that a packet was about 2900 bytes?

  2. The packet is 63 bytes of as-yet-unanalysed data followed by an 82-byte gzip stream which decompresses(!) to 63 bytes. There is no data after the gzip stream -- verified by comparing the last 8 bytes of the packet with calculated gzip values. It contains the expected "Dummy Data", but the userid "Jonathanb" is not there (or is obfuscated or encrypted).

  3. The packet structure doesn't match the code that we guessed was being used (no XML, no base64).

  4. The gunzipped data contains the string "java.lang.Object", which is probably symptomatic of some Java serialisation protocol. Lasciate ogni speranza, voi qu'entrate ("Abandon all hope, ye who enter here").
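The start of the gunzipped payload can be checked against the Java Object Serialization stream format. The following Python 3 sketch assumes the payload bytes exactly as printed in the output above and the standard stream constants (magic 0xACED, version 5, TC_BLOCKDATA 0x77). Treating the first int in the block as the command number is an inference from the question, not something confirmed from the MegaMek source.

```python
import struct

# Gunzipped payload, transcribed from the output above (63 bytes).
PAYLOAD = (
    b"\xac\xed\x00\x05w\x04\x00\x00\x00Pur\x00\x13[Ljava.lang.Object;"
    b"\x90\xceX\x9f\x10s)l\x02\x00\x00xp\x00\x00\x00\x01t\x00\nDummy Data"
)

STREAM_MAGIC = 0xACED   # java.io.ObjectStreamConstants.STREAM_MAGIC
STREAM_VERSION = 5      # java.io.ObjectStreamConstants.STREAM_VERSION
TC_BLOCKDATA = 0x77     # 'w': a short block of primitive data follows

# Stream header: 2-byte magic and 2-byte version, both big-endian.
magic, version = struct.unpack(">HH", PAYLOAD[:4])
is_java_stream = magic == STREAM_MAGIC and version == STREAM_VERSION

first_int = None
if is_java_stream and PAYLOAD[4] == TC_BLOCKDATA:
    block_len = PAYLOAD[5]                  # 1-byte length of the block
    block = PAYLOAD[6:6 + block_len]
    if block_len >= 4:
        # Assumption: the block begins with a big-endian int (writeInt).
        (first_int,) = struct.unpack(">i", block[:4])

print(is_java_stream, first_int)
```

For this payload the first block holds the integer 80, which matches the chat command number mentioned in the question.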

John Machin
0) The source code of the original Java application has a test for whether the packet is compressed or not. I knew the contents of the packet in question (I had sent a message via the chat function of the program), and could not find said content in the data portion of the packet. 1) The program is Megamek; it's available at SourceForge (not sure how to post links in comments). I wanted to take a closer look at how the client worked and thought experimenting with rewriting some of it might be one way to go about it (since I'm not real good at Java). I'll get the answers for 2 and 3 shortly.
Jonathanb
Here is the error message: Traceback (most recent call last): File "<pyshell#2>", line 1, in <module> file_contents = f.read() File "C:\Python26\lib\gzip.py", line 212, in read self._read(readsize) File "C:\Python26\lib\gzip.py", line 255, in _read self._read_gzip_header() File "C:\Python26\lib\gzip.py", line 156, in _read_gzip_header raise IOError, 'Not a gzipped file' IOError: Not a gzipped file
Jonathanb
Sorry, I have no clue how to get line breaks in there. I've tried everything I can think of. Also, I have the data portion of the packet, saved as a hex stream as a file, but it is 2094 characters. How can I attach a file?
Jonathanb
Error message: the general idea is to edit your question so that (1) you can format stuff properly (2) your problem description is in one place. The file: I don't understand "saved as a hex stream as a file" -- which of my 2a or 2b suggestions is that meant to represent? 2094 "characters" is rather larger than expected. How many bytes were in the packet? 2094/2? 2094/3? something else? I suggest that you edit your question to include (a) the first 200 bytes-worth of the "hex stream" (b) the known contents that you sent the app via its chat function.
John Machin
Thank you for your assistance. In response to your questions/observations: 1) I generated a smaller packet so it would be easier to work with. The 2900-byte packets are what happens when the game communicates about the different units in the game. 2) Not entirely sure what to make of your observation here. My suspicion is that the 63 bytes are the TCP header? I'm not up to snuff on that though, so I could be wrong. 3) Yeah, that fact makes writing a pure Python client for this thing more difficult. 4) Guess I'll have to learn Jython to write this. It's my understanding that Jython can manipulate Java objects.
Jonathanb
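On the suspicion that the 63 leading bytes are protocol headers: a rough Python 3 sketch that splits the start of the posted hex dump, assuming the capture is an Ethernet II frame carrying IPv4 and TCP (the leading bytes are consistent with that, but it is an assumption). Only the first 66 bytes of the dump are used, and all variable names are illustrative.

```python
import binascii
import struct

# First 66 bytes of the packet posted in the question (2 hex chars per byte).
HEAD_HEX = (
    "001321cdc68ff4ce46e4f00d0800450000832a85400080061e51ac102cceac10"
    "2cb004f8092a9909b32c10e81cb25018f734823e00000100000000000000521f"
    "8b08"
)
head = binascii.unhexlify(HEAD_HEX)

ETH_LEN = 14                                    # dst MAC, src MAC, EtherType
(ethertype,) = struct.unpack(">H", head[12:14])  # 0x0800 means IPv4

ip = head[ETH_LEN:]
ip_hdr_len = (ip[0] & 0x0F) * 4                 # IHL field, in 32-bit words
(ip_total_len,) = struct.unpack(">H", ip[2:4])  # IP header + TCP + data
protocol = ip[9]                                # 6 means TCP

tcp = ip[ip_hdr_len:]
src_port, dst_port = struct.unpack(">HH", tcp[:4])
tcp_hdr_len = (tcp[12] >> 4) * 4                # data offset, in 32-bit words

payload_start = ETH_LEN + ip_hdr_len + tcp_hdr_len
gzip_start = head.find(b"\x1f\x8b\x08")
app_prefix = head[payload_start:gzip_start]     # bytes the application wrote

print("ethertype=%#06x protocol=%d ports=%d->%d" %
      (ethertype, protocol, src_port, dst_port))
print("headers=%d bytes, frame=%d bytes" %
      (payload_start, ETH_LEN + ip_total_len))
print("application prefix:", binascii.hexlify(app_prefix).decode())
```

Under those assumptions the network headers account for 54 bytes, leaving 9 application bytes ahead of the gzip stream. Their last byte is 0x52 (82), the same as the length of the gzip stream, so reading them as a flag followed by a length field is a plausible guess and nothing more.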