We have a database encoded with Huffman coding. The aim is to copy it to the GPU along with its associated decoder, then decode the database on the GPU and work on the decoded data without ever copying it back to the CPU.

I am far from being a Huffman specialist, but the little I know suggests that it is an algorithm essentially based on control structures. With the basic algorithm, I am afraid there will be a lot of serialized operations.

My two questions are:

  • do you know whether an efficient GPU version of Huffman coding exists?
  • if not, do you think there is a Huffman-style algorithm that could be adapted to the GPU (i.e. one with fewer control structures)? Or maybe you know of a reference showing that Huffman decoding cannot be made efficient on the GPU.

I see other constraints, but they are not critical:

  • the GPU may not be very efficient at handling trees, but a binary tree can be stored in a classical array (see the sketch below)
  • the workload could be difficult to balance: we'll see about that later
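
On the first point, here is a minimal sketch of what such an array-encoded tree could look like; the `HuffNode` layout and `upload_tree` helper are my own illustrations, not anything from the question:

    // Sketch of a flat, array-encoded Huffman tree (all names here are
    // illustrative assumptions).  Node 0 is the root; a leaf is marked by
    // left == -1 and carries the decoded symbol.
    #include <cuda_runtime.h>
    #include <stdint.h>

    struct HuffNode {
        int16_t left;    // child index followed on bit 0, or -1 for a leaf
        int16_t right;   // child index followed on bit 1, or -1 for a leaf
        uint8_t symbol;  // meaningful only when left == -1
    };

    // The whole tree is one contiguous array, so a single cudaMemcpy moves
    // the decoder onto the device.
    HuffNode* upload_tree(const HuffNode* host_nodes, int count) {
        HuffNode* dev_nodes = nullptr;
        cudaMalloc((void**)&dev_nodes, count * sizeof(HuffNode));
        cudaMemcpy(dev_nodes, host_nodes, count * sizeof(HuffNode),
                   cudaMemcpyHostToDevice);
        return dev_nodes;
    }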

+3  A: 

The problem with Huffman coding is that you can't fast-forward: you have to decode bit by bit, linearly.

As such it's not ideal for parallelism.
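
To make that concrete, here is a sketch of the plain serial decoder over the array-encoded tree above (again, my own illustration, including the LSB-first bit packing): which tree node you are in at bit i depends on every bit before it, so bit i+1 cannot be interpreted until bit i has been.

    // Serial Huffman decode: walk the tree one bit at a time, emit a symbol
    // and rewind to the root whenever a leaf is reached.
    void decode_serial(const HuffNode* tree, const uint8_t* bits,
                       uint64_t n_bits, uint8_t* out) {
        int node = 0;                        // start at the root
        uint64_t w = 0;
        for (uint64_t i = 0; i < n_bits; ++i) {
            int bit = (bits[i >> 3] >> (i & 7)) & 1;  // LSB-first packing
            node = bit ? tree[node].right : tree[node].left;
            if (tree[node].left < 0) {       // leaf: emit and rewind
                out[w++] = tree[node].symbol;
                node = 0;
            }
        }
    }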

If you can decide on the encoding, you could very well encode chunk by chunk so as to be able to decode each chunk independently.
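
A sketch of how that could map onto the GPU, assuming the encoder also emits a table of chunk bit offsets and encodes a fixed number of symbols per chunk so output positions are known in advance (the kernel and all its names are assumptions layered on this answer, not something the thread specifies):

    // One thread per chunk.  chunk_bits has n_chunks + 1 entries; bits in
    // [chunk_bits[c], chunk_bits[c+1]) encode chunk c, and every chunk holds
    // exactly SYMS_PER_CHUNK symbols.  The inner walk is the same serial
    // loop as decode_serial; parallelism comes purely from the chunks being
    // independent.
    #define SYMS_PER_CHUNK 4096

    __global__ void decode_chunks(const HuffNode* tree, const uint8_t* bits,
                                  const uint64_t* chunk_bits,
                                  uint8_t* out, int n_chunks) {
        int c = blockIdx.x * blockDim.x + threadIdx.x;
        if (c >= n_chunks) return;

        int node = 0;
        uint64_t w = (uint64_t)c * SYMS_PER_CHUNK;  // this chunk's output
        for (uint64_t i = chunk_bits[c]; i < chunk_bits[c + 1]; ++i) {
            int bit = (bits[i >> 3] >> (i & 7)) & 1;
            node = bit ? tree[node].right : tree[node].left;
            if (tree[node].left < 0) {
                out[w++] = tree[node].symbol;
                node = 0;
            }
        }
    }

Fixing the symbol count per chunk costs a little compression (each chunk must end on a symbol boundary, possibly with padding bits), but it keeps both the bit offsets and the output offsets trivially computable, which also helps with the load-balancing worry raised in the question.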

Matthieu M.
Why do you think bit-by-bit is not ideal for parallelism? I think reading several independent encoded values bit by bit is not a problem. The problem is performing the decoding of those bits in parallel.
Jérôme
The problem for Huffman is that you don't know how many bits a symbol is encoded in. You read the first bit, check whether it completes a symbol, read the second, check again, read the third: ah, it's a symbol, so you store the symbol and rewind your state machine, then carry on. That is not parallelizable.
Matthieu M.