I need a FAST decompression routine optimized for restricted resource environments like embedded systems, working on binary (hex) data that has the following characteristics:

  1. Data is 8-bit (byte) oriented (the data bus is 8 bits wide).
  2. Byte values do NOT range uniformly from 0 - 0xFF, but follow a Poisson distribution (roughly bell-shaped) in each dataset.
  3. The dataset is fixed in advance (to be burnt into flash), and each set is rarely larger than 1 - 2 MB.

Compression can take as much time as required, but decompression of a byte should take at most 23 µs in the worst case, with a minimal memory footprint, since it will run in a restricted resource environment like an embedded system (3 MHz - 12 MHz core, 2 KB of RAM).

What would be a good decompression routine?

Basic run-length encoding seems too wasteful - I can immediately see that adding a header section to the compressed data that maps unused byte values to often-repeated patterns would give phenomenal performance!

If I can see that after investing only a few minutes, surely much better algorithms must already exist from people who love this stuff?

I would like to have some "ready to go" examples to try out on a PC so that I can compare the performance vis-a-vis a basic RLE.
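
For reference, the basic RLE baseline I want to compare against could look something like this (the token format here is made up just for illustration: a 0x00 marker byte introduces a run as (0x00, count, value), any other byte is a literal):

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal RLE decoder sketch (hypothetical format, for PC-side comparison):
 * 0x00 introduces a run as (0x00, count, value); any other byte is a
 * literal. This is only a baseline, not the proposed scheme. */
size_t rle_decode(const uint8_t *in, size_t in_len, uint8_t *out, size_t out_cap)
{
    size_t o = 0;
    for (size_t i = 0; i < in_len; ) {
        if (in[i] == 0x00 && i + 2 < in_len) {
            uint8_t count = in[i + 1], value = in[i + 2];
            for (uint8_t k = 0; k < count && o < out_cap; k++)
                out[o++] = value;
            i += 3;
        } else {
            if (o < out_cap) out[o++] = in[i];
            i++;
        }
    }
    return o;  /* number of bytes written */
}
```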

+2  A: 

Well, the main two algorithms that come to mind are Huffman and LZ.

The first assigns short bit codes to frequent byte values via a fixed code table. If you restrict the table's size sufficiently, it should be pretty fast... but don't expect very good compression.

The latter works by adding back-references to repeating portions of the output. It would take very little memory to run, except that you either need file I/O to re-read the back-referenced data or must keep a chunk of recently decoded data in RAM.

I suspect LZ is your best option, if the repeated sections tend to be close to one another. Huffman works from the frequency of individual byte values, which matches the skewed distribution you mentioned.
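
A minimal sketch of LZ-style back-reference decoding, using a made-up token format (0x00 introduces (0x00, offset, length), copying bytes from `offset` bytes back in the output; other bytes are literals). The history it needs is just the recently decoded output, so the RAM cost is whatever window you allow:

```c
#include <stddef.h>
#include <stdint.h>

/* LZ back-reference decoder sketch; the token format is hypothetical.
 * Copies may overlap their own output (off < len), which is how LZ
 * encodes runs. Input is assumed valid: off must be <= bytes decoded. */
size_t lz_decode(const uint8_t *in, size_t in_len, uint8_t *out, size_t out_cap)
{
    size_t o = 0;
    for (size_t i = 0; i < in_len; ) {
        if (in[i] == 0x00 && i + 2 < in_len) {
            uint8_t off = in[i + 1], len = in[i + 2];
            for (uint8_t k = 0; k < len && o < out_cap; k++) {
                out[o] = out[o - off];  /* copy from earlier output */
                o++;
            }
            i += 3;
        } else {
            if (o < out_cap) out[o++] = in[i];
            i++;
        }
    }
    return o;
}
```

Note the overlapping-copy case: "ab" followed by (offset 2, length 4) expands to "ababab".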

Brian
Either one might be viable if PoorLuzer had more RAM and CPU cycles to work with, but if the constraints are really as stated, I think he's asking the impossible.
Jim Lewis
I made some mistakes in "hearing" the values from a friend who will do the actual programming, and have updated the post to reflect more accurate numbers. This kind of stuff is new to me, as I have always programmed for an OS.
PoorLuzer
+3  A: 

If you have a preset distribution of values, meaning the probability of each value is fixed across all datasets, you can create a Huffman encoding with fixed codes (the code tree does not have to be embedded in the data).

Depending on the data, I'd try Huffman with fixed codes or LZ77 (see the links in Brian's answer).
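
A decoder sketch for the fixed-codes idea: the tree is computed offline from the known distribution and burnt into flash as a table, so decoding is a bit-by-bit table walk with no per-dataset state. The toy tree below covers only four symbols (A=0, B=10, C=110, D=111); a real table would cover all 256 byte values:

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-code Huffman decoding sketch. Internal nodes hold child node
 * indices; leaves are marked by the high bit and carry the symbol.
 * This toy tree encodes A=0, B=10, C=110, D=111. */
#define LEAF 0x80

static const uint8_t tree[][2] = {
    /* node 0 */ { LEAF | 'A', 1 },          /* bit 0 -> 'A', bit 1 -> node 1 */
    /* node 1 */ { LEAF | 'B', 2 },
    /* node 2 */ { LEAF | 'C', LEAF | 'D' },
};

size_t huff_decode(const uint8_t *bits, size_t nbits, uint8_t *out, size_t out_cap)
{
    size_t o = 0;
    uint8_t node = 0;
    for (size_t i = 0; i < nbits && o < out_cap; i++) {
        int bit = (bits[i / 8] >> (7 - i % 8)) & 1;  /* MSB-first bitstream */
        uint8_t next = tree[node][bit];
        if (next & LEAF) {
            out[o++] = next & 0x7F;  /* emit symbol... */
            node = 0;                /* ...and restart at the root */
        } else {
            node = next;
        }
    }
    return o;
}
```

For example, the 7-bit stream 0 10 0 111 (byte 0x4E, MSB first) decodes to "ABAD".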

Tobias Langner
A: 

You should try different compression algorithms, either with a compression tool that has command-line switches or with a compression library that lets you try out different algorithms. Use typical data from your application; then you will know which algorithm best fits your needs.
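
For instance, something like this on a PC (using whichever standard tools happen to be installed; `sample.bin` stands in for your real dataset):

```shell
# Generate a stand-in sample; replace with a real dataset from your application.
head -c 65536 /dev/zero | tr '\0' 'A' > sample.bin

# Compare compressed sizes across whichever tools are available.
for tool in gzip bzip2 xz; do
  if command -v "$tool" >/dev/null; then
    size=$("$tool" -9 -c sample.bin | wc -c)
    echo "$tool: $size bytes (original: $(wc -c < sample.bin))"
  fi
done
```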

codymanix
Yes, but the problem is that most compression tools are designed with a PC rather than an embedded system in mind, and thus make different assumptions about the amount of resources available.
PoorLuzer
+2  A: 

Since this seems to be audio, I'd look at either differential PCM or ADPCM, or something similar, which will reduce it to 4 bits/sample without much loss in quality.

With the most basic differential PCM implementation, you just store a 4-bit signed difference between the current sample and an accumulator, add that difference to the accumulator, and move to the next sample. If the difference is outside [-8, 7], you have to clamp it, and it may take several samples for the accumulator to catch up. Decoding is very fast and uses almost no memory: just add each value to the accumulator and output the accumulator as the next sample.

A small improvement over basic DPCM, which helps the accumulator catch up faster when the signal gets louder and higher-pitched, is to use a lookup table that decodes the 4-bit values to a larger non-linear range, where they're still 1 apart near zero but increase in larger increments toward the limits. And/or you could reserve one of the values to toggle a multiplier; deciding when to use it is up to the encoder. With these improvements, you can either achieve better quality or get away with 3 bits per sample instead of 4.
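
A decoder sketch for the table-driven variant (the step values here are illustrative, not tuned; a real table would be fitted to the signal):

```c
#include <stddef.h>
#include <stdint.h>

/* DPCM decoder sketch: each 4-bit code indexes a non-linear step table,
 * the step is added to an accumulator, and the clamped accumulator is
 * the output sample. Step values are illustrative only. */
static const int16_t step[16] = {
    /* codes 0..7: positive steps, 1 apart near zero, widening outward */
    0, 1, 2, 3, 5, 8, 13, 21,
    /* codes 8..15: the corresponding negative steps */
    -1, -2, -3, -5, -8, -13, -21, -34
};

void dpcm_decode(const uint8_t *in, size_t n_codes, uint8_t *out)
{
    int16_t acc = 128;  /* mid-scale for unsigned 8-bit samples */
    for (size_t i = 0; i < n_codes; i++) {
        /* two 4-bit codes per input byte, high nibble first */
        uint8_t code = (i & 1) ? (in[i / 2] & 0x0F) : (in[i / 2] >> 4);
        acc += step[code];
        if (acc < 0)   acc = 0;    /* clamp instead of wrapping */
        if (acc > 255) acc = 255;
        out[i] = (uint8_t)acc;
    }
}
```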

If your device has a non-linear µ-law or A-law ADC, you can get quality comparable to 11-12 bits from 8-bit samples. Or you can probably do the companding yourself in your decoder. http://en.wikipedia.org/wiki/M-law%5Falgorithm

There might be inexpensive chips out there that already do all this for you, depending on what you're making. I haven't looked into any.

David
This is not for audio :-) It's actually to send out a predefined Manchester-encoded dataset using a classic UART by "bit banging" each bit of a byte. This process goes on until no bytes are left.
PoorLuzer
I had assumed audio because of the normal distribution and because 23 µs matches up with 44.1 kHz.
David
+1  A: 

I have used zlib in embedded systems for a bootloader that decompresses the application image to RAM on start-up. The licence is nicely permissive, no GPL nonsense. It does make a single malloc call, but in my case I simply replaced that with a stub returning a pointer to a static block, plus a corresponding free() stub. I sized the block by monitoring zlib's memory allocation usage. If your system can support dynamic memory allocation, then it is much simpler.

http://www.zlib.net/
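
zlib also lets you plug allocators directly into `z_stream`'s `zalloc`/`zfree` fields rather than stubbing malloc. A sketch of a static-pool allocator matching zlib's allocator signature (the typedefs are reproduced here so it compiles without zlib.h; the pool size is a guess you'd replace by measuring real usage, as described above):

```c
#include <stddef.h>
#include <stdint.h>

/* Static-pool allocator sketch matching zlib's zalloc/zfree signatures.
 * POOL_SIZE is an assumption; size it by watching actual allocations. */
typedef void *voidpf;
typedef unsigned int uInt;

#define POOL_SIZE (32 * 1024)
static uint8_t pool[POOL_SIZE];
static size_t pool_used;

static voidpf static_zalloc(voidpf opaque, uInt items, uInt size)
{
    (void)opaque;
    size_t bytes = (size_t)items * size;
    bytes = (bytes + 7) & ~(size_t)7;   /* keep allocations 8-byte aligned */
    if (pool_used + bytes > POOL_SIZE)
        return NULL;                    /* zlib reports this as Z_MEM_ERROR */
    voidpf p = pool + pool_used;
    pool_used += bytes;
    return p;
}

static void static_zfree(voidpf opaque, voidpf address)
{
    (void)opaque;
    (void)address;  /* nothing freed individually; reset pool_used between runs */
}
```

Usage would be setting `strm.zalloc = static_zalloc;` and `strm.zfree = static_zfree;` on the `z_stream` before calling `inflateInit()`.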

Clifford
+4  A: 

The two solutions I use when performance is the only concern:

  • LZO has a GPL license.
  • liblzf has a BSD license.
  • miniLZO.tar.gz is LZO, just repacked into a 'minified' version that is better suited to embedded development.

Both are extremely fast when decompressing. I've found that LZO will create slightly smaller compressed data than liblzf in most cases. You'll need to do your own benchmarks for speeds, but I consider them to be "essentially equal". Both are light-years faster than zlib, though neither compresses as well (as you would expect).

LZO, in particular miniLZO, and liblzf are both excellent for embedded targets.

johne