views: 214
answers: 6

I have an embedded application where an image scanner sends out a stream of 16-bit pixels that are later assembled into a grayscale image. As I need to both save this data locally and forward it to a network interface, I'd like to compress the data stream to reduce the required storage space and network bandwidth.

Is there a simple algorithm that I can use to losslessly compress the pixel data?

I first thought of computing the difference between two consecutive pixels and then encoding this difference with a Huffman code. Unfortunately, the pixels are unsigned 16-bit quantities, so the difference can be anywhere in the range -65535 .. +65535, which leads to potentially huge codeword lengths. If a few really long codewords occur in a row, I'll run into buffer overflow problems.
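
A minimal sketch of the concern (names are illustrative, not from the question): the difference between two unsigned 16-bit pixels needs a 17-bit signed range, so a Huffman code built over all 131071 possible symbols can have very long codewords for the rare extremes.

    #include <stdint.h>

    /* Difference of two unsigned 16-bit pixels: range -65535 .. +65535,
       i.e. 131071 possible symbols in the Huffman alphabet. */
    int32_t pixel_delta(uint16_t prev, uint16_t cur)
    {
        return (int32_t)cur - (int32_t)prev;
    }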

Update: my platform is an FPGA

+3  A: 

How many resources do you have available on your embedded platform?

Could you port zlib and do gzip compression? Even with limited resources, you should be able to port something like LZ77 or LZ78.
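
If zlib can be ported, its one-shot API is enough to check whether deflate gives a useful ratio on your data. A minimal sketch, assuming zlib is available on the target (function and buffer names are illustrative); note that compress2() produces a zlib-wrapped deflate stream, so true gzip framing would need the deflateInit2()/deflate() streaming API instead:

    #include <zlib.h>
    #include <stdint.h>
    #include <stddef.h>

    /* Compress one buffer of 16-bit pixels with zlib's one-shot API.
       On entry *out_len must hold the capacity of 'out' (compressBound()
       of the source length is a safe upper bound); on success it holds
       the compressed size. Returns Z_OK on success. */
    int compress_pixels(const uint16_t *pixels, size_t n_pixels,
                        unsigned char *out, unsigned long *out_len)
    {
        unsigned long src_len = (unsigned long)(n_pixels * sizeof(uint16_t));
        return compress2(out, out_len,
                         (const unsigned char *)pixels, src_len,
                         Z_BEST_SPEED);
    }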

tomlogic
Thanks for your suggestions. Dictionary-based methods like the ones you suggested seem better suited to the task than Huffman coding. Concerning the resources: I want to implement this on an FPGA, so I don't have a processor.
geschema
+7  A: 

PNG provides free, open-source, lossless image compression in a standard format using standard tools. PNG uses zlib as part of its compression, and there is also libpng. Unless your platform is very unusual, it should not be hard to port this code to it.
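
As a rough sketch of what writing a 16-bit grayscale PNG with libpng looks like (error handling via setjmp/png_jmpbuf is omitted, and all names other than the libpng calls are illustrative):

    #include <png.h>
    #include <stdint.h>
    #include <stdio.h>

    int write_gray16_png(const char *path, const uint16_t *pixels,
                         uint32_t width, uint32_t height)
    {
        FILE *fp = fopen(path, "wb");
        if (!fp)
            return -1;

        png_structp png = png_create_write_struct(PNG_LIBPNG_VER_STRING,
                                                  NULL, NULL, NULL);
        png_infop info = png_create_info_struct(png);
        png_init_io(png, fp);
        png_set_IHDR(png, info, width, height, 16, PNG_COLOR_TYPE_GRAY,
                     PNG_INTERLACE_NONE, PNG_COMPRESSION_TYPE_DEFAULT,
                     PNG_FILTER_TYPE_DEFAULT);
        png_write_info(png, info);
        png_set_swap(png);  /* PNG samples are big-endian; swap on LE hosts */

        for (uint32_t y = 0; y < height; y++)
            png_write_row(png, (png_const_bytep)(pixels + (size_t)y * width));

        png_write_end(png, info);
        png_destroy_write_struct(&png, &info);
        fclose(fp);
        return 0;
    }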

Norman Ramsey
wouldn't JPG be more suitable for grey scale scans?
Matthew Lock
@Matthew, the title says "lossless". There's a non-standard lossless mode for JPEG, but I think the better choice would be PNG.
Mark Ransom
Also, use an appropriate PNG filter, because LZ / Huffman alone do not exploit 2-D predictive compression (http://www.w3.org/TR/PNG-Filters.html). Also, allow the device's end user to choose a reduced number of bits per sample, because the least-significant bits are not really compressible and will increase the data size significantly. At the hardware/protocol level, simply zeroing out the unneeded bits at the source will improve compressibility.
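
A tiny sketch of that last point, assuming a user-selected number of low bits to drop (the parameter name is illustrative):

    #include <stdint.h>

    /* Zero the noisy low-order bits before compression; the pixels keep
       their 16-bit width but become far more compressible. */
    static inline uint16_t mask_low_bits(uint16_t pixel, unsigned drop_bits)
    {
        return (uint16_t)(pixel & ~((1u << drop_bits) - 1u));
    }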
rwong
+3  A: 

There are a wide variety of image compression libraries available. For example, this page lists nothing but libraries/toolkits for PNG images. Which format/library works best for you will most likely depend on the particular resource constraints you are working under (in particular, whether or not your embedded system can do floating-point arithmetic).

bta
+2  A: 

The goal with lossless compression is to be able to predict the next pixel based on previous pixels, and then to encode the difference between your prediction and the real value of the pixel. This is what you initially thought of doing, but you were only using the single previous pixel and predicting that the next pixel would be the same.

Keep in mind that if you have all of the previous pixels, you have more relevant information than just the preceding pixel. That is, if you are trying to predict the value of X, you should use the O pixels:

..OOO...
..OX

Also, you would not want to use the previous pixel, B, in the stream to predict X in the following situation:

OO...B <-- End of row
X <- Start of next row

Instead, you would make your prediction based on the Os.
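
A sketch of this idea in C (the answer does not prescribe a specific predictor; the median-edge predictor from JPEG-LS/LOCO-I used here is just one common choice, and the row-major indexing is an assumption):

    #include <stdint.h>

    /* Predict pixel (x, y) from already-decoded neighbours:
       left (a), above (b), above-left (c). Falls back to a single
       neighbour at image/row boundaries instead of the previous
       pixel in the raw stream. */
    static uint16_t predict(const uint16_t *img, uint32_t width,
                            uint32_t x, uint32_t y)
    {
        if (x == 0 && y == 0) return 0;                    /* nothing seen yet */
        if (y == 0)           return img[x - 1];           /* first row: left  */
        if (x == 0)           return img[(y - 1) * width]; /* row start: above */

        uint16_t a = img[y * width + x - 1];
        uint16_t b = img[(y - 1) * width + x];
        uint16_t c = img[(y - 1) * width + x - 1];

        uint16_t mx = a > b ? a : b, mn = a < b ? a : b;
        if (c >= mx) return mn;
        if (c <= mn) return mx;
        return (uint16_t)((int32_t)a + b - c);             /* gradient case */
    }

The encoder then stores the (entropy-coded) difference between the actual pixel and this prediction.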

Dan Hook
+1  A: 

How 'lossless' do you need?
If this is a real scanner, there is a limit to its bandwidth/resolution, so even though it can send +/-64K values it may be physically implausible for adjacent pixels to differ by more than, say, 8 bits.

In that case you can store a starting pixel value for each row and then encode the difference between each pixel and the previous one.

This will smear out peaks, but any peaks larger than 'N' bits may well be noise anyway.
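
A minimal sketch of that scheme, assuming 8-bit deltas (all names and the choice of N = 8 are illustrative):

    #include <stdint.h>

    #define DELTA_BITS 8
    #define DELTA_MAX  ((1 << (DELTA_BITS - 1)) - 1)   /* +127 */
    #define DELTA_MIN  (-(1 << (DELTA_BITS - 1)))      /* -128 */

    /* Store the first pixel of each row in full, then one clamped signed
       delta per remaining pixel. Clamping "smears out" peaks larger than
       DELTA_BITS; tracking the reconstructed value keeps encoder and
       decoder in sync, so the error decays over the following pixels. */
    void encode_row(const uint16_t *row, uint32_t width,
                    uint16_t *start_out, int8_t *deltas_out)
    {
        *start_out = row[0];
        int32_t prev = row[0];
        for (uint32_t x = 1; x < width; x++) {
            int32_t d = (int32_t)row[x] - prev;
            if (d > DELTA_MAX) d = DELTA_MAX;
            if (d < DELTA_MIN) d = DELTA_MIN;
            deltas_out[x - 1] = (int8_t)d;
            prev += d;
        }
    }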

Martin Beckett
That's an interesting observation, but unfortunately I know too little about the image generation physics (this is for electron microscopy) to make assumptions about the difference between adjacent pixels.
geschema
A: 

A good LZ77/RLE hybrid with bells and whistles can get wonderful compression that is fairly quick to decompress. It will also do better on small files, thanks to the lack of library overhead. For a good, but GPL'd, implementation of this, check out PUCrunch.

Michael Dorgan
PUCrunch is under LGPL.
Craig McQueen