I'm working on a project where a Windows web server running PHP is communicating over a very slow connection with a back end Linux server running an application written in C++. Because the connection between the two machines is so slow, I'd like to compress the traffic moving between them.

I've gotten to where I can compress a string, save it to a file, read the file, and uncompress the string in C++ using Zlib, and likewise in PHP. However, if I try to compress a string in one language and decompress it in the other (as will be happening in the real world), I get errors complaining that the compressed data is corrupted. I've also noticed that the same string compressed in C++ results in a different file than in PHP, which leads me to believe that Zlib is using a different compression algorithm in each language.

I'm using default settings on both sides. The C++ I'm using to do the compression and decompression is

compress((Bytef*)compressed, (uLongf*)&compressedLength, (Bytef*)uncompressed, (uLong)uncompressedLength);
uncompress((Bytef*)uncompressed, (uLongf*)&uncompressedLength, (Bytef*)compressed, (uLong)compressedLength);

while the PHP code is

$compressed = gzcompress($uncompressed);
$uncompressed = gzuncompress($compressed);

Why are these resulting in different compressed strings? Is that what's causing the problems with decompression? What should I be doing to get this to work? Also, I'm not committed to Zlib. Zlib's what my initial research uncovered, but if there's a better way to do this, I'm all ears.

Edit: Actually, after doing a little more testing, it appears that C++ was working with my initial test case, but not universally. I tried it with the input "hellohellohello", and on decompression, it reported a Z_DATA_ERROR and decompressed it to just "hello". I guess that means I'm doing something wrong on the C++ side, which may explain why PHP is unhappy decompressing C++ compressed strings.

Edit 2: I tried out the zpipe.c sample program, and it correctly uncompresses strings compressed by PHP and produces compressed strings PHP can uncompress. Clearly, the problem(s) exist in my C++ code. Either my usage of compress and uncompress is incorrect, or I'm reading and writing the file incorrectly. Neither my compress program nor my decompress program interacts correctly with zpipe.

Update: I've now gotten to where I can compress a string using PHP and read it with either PHP or C++, and I can compress a string with C++ and read it with C++, but attempting to read it with PHP results in PHP Warning: gzuncompress(): data error. What could be different that would cause this combination of working/not working scenarios?

+1  A: 

Zlib's default compression level is 6 - you could try passing that explicitly as the second parameter to gzcompress in PHP.

string gzcompress ( string $data [, int $level = -1 ] )

From the ZLIB manual:

The compression level must be Z_DEFAULT_COMPRESSION, or between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all (the input data is simply copied a block at a time). Z_DEFAULT_COMPRESSION requests a default compromise between speed and compression (currently equivalent to level 6).

Steve Townsend
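
Both gzcompress in PHP and compress() in zlib emit the zlib container format (RFC 1950), so differing bytes alone do not indicate a different algorithm; the level changes the output but not the format. As a rough diagnostic, the two-byte header can be decoded to confirm that a file produced by either side is a zlib stream. The struct and function names below are hypothetical; the bit layout follows RFC 1950.

```cpp
#include <cstddef>

// Fields decoded from the first two bytes (CMF, FLG) of a zlib stream.
struct ZlibHeaderInfo {
    bool valid;             // passes the RFC 1950 header checks
    int method;             // CM field; 8 means deflate
    int windowBits;         // CINFO + 8; at most 15 for deflate
    int levelHint;          // FLEVEL, 0 (fastest) .. 3 (maximum); informational
    bool presetDictionary;  // FDICT flag
};

// Decodes the zlib header at the start of `data`. This inspects the
// header only; it does not validate the compressed payload or checksum.
ZlibHeaderInfo inspectZlibHeader(const unsigned char* data, std::size_t length) {
    ZlibHeaderInfo info = {false, 0, 0, 0, false};
    if (data == nullptr || length < 2) return info;

    unsigned cmf = data[0];
    unsigned flg = data[1];
    info.method = static_cast<int>(cmf & 0x0F);
    info.windowBits = static_cast<int>(cmf >> 4) + 8;
    info.levelHint = static_cast<int>((flg >> 6) & 0x03);
    info.presetDictionary = (flg & 0x20) != 0;

    // RFC 1950: CMF*256 + FLG must be a multiple of 31.
    info.valid = info.method == 8 && info.windowBits <= 15 &&
                 (cmf * 256u + flg) % 31u == 0;
    return info;
}
```

A stream written at the default level typically begins with the bytes 78 9C, while a gzip file begins with 1F 8B and is reported as invalid by this check.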
You may also want to make sure the other settings, and maybe the library versions, are the same or compatible.
peachykeen
Changing the compression level doesn't seem to have an impact on the PHP side. It works regardless of the level I set. The incorrect decompression from the C++ side is different if I vary the compression level, but for at least 0, 6, and 9, it's always wrong.
Warren Pena
Weird. I had to get ZLIB working with C++, C# and Java, and though the files were different, uncompress always worked. Is there some stream or file flush operation you are not doing on either end?
Steve Townsend
When I compress using PHP, the last line of the script is an fclose, which should flush the stream out to the file. I'm reading it in on the C++ side using fread(compressed, sizeof(compressed), 1, in);, and compressed is many times larger than the size of the file, so it should be reading in the whole thing. In fopen on both sides, I'm using b in the mode string to indicate that it's a binary file, but its use (or lack thereof) doesn't seem to make a difference.
Warren Pena
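
In the fread call quoted above, the element size is sizeof(compressed) and the count is 1, so the return value is the number of whole elements read (0 when the file is smaller than the buffer), not the number of bytes. That makes it easy to lose the real compressed length. Below is a small sketch, with hypothetical helper names, that reads and writes binary files while keeping the exact byte count.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Reads an entire file in binary mode and returns its bytes.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readAllBytes(const std::string& path) {
    std::vector<unsigned char> data;
    std::FILE* in = std::fopen(path.c_str(), "rb");
    if (!in) return data;

    unsigned char chunk[4096];
    std::size_t n;
    // With an element size of 1, fread's return value is the byte count.
    while ((n = std::fread(chunk, 1, sizeof(chunk), in)) > 0) {
        data.insert(data.end(), chunk, chunk + n);
    }
    std::fclose(in);
    return data;
}

// Writes exactly data.size() bytes in binary mode.
// Returns true only if every byte was written and the file closed cleanly.
bool writeAllBytes(const std::string& path,
                   const std::vector<unsigned char>& data) {
    std::FILE* out = std::fopen(path.c_str(), "wb");
    if (!out) return false;
    std::size_t written =
        data.empty() ? 0 : std::fwrite(data.data(), 1, data.size(), out);
    bool ok = (written == data.size());
    return std::fclose(out) == 0 && ok;
}
```

The size() of the vector returned by readAllBytes is the length to pass as the source length when decompressing, rather than the capacity of a fixed buffer or the result of a string-length function.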
You could try compress/decompress on the C++ side and the same on the PHP side to rule out writer/reader issues on either end.
Steve Townsend