Parallelizable JPEG-like compression using only the DCT and run-length encoding stages: what sort of compression/performance is possible?


We have to compress a ton of (monochrome) image data and move it quickly. If one were to use just the parallelizable stages of JPEG compression (the DCT and run-length encoding of the quantized results) and run them on a GPU so that each block is compressed in parallel, I am hoping that would be very fast and still yield a very significant compression factor, as full JPEG does.

Does anyone with more GPU / image compression experience have any idea how this would compare, in both compression and performance, with using libjpeg on a CPU? (If it is a stupid idea, feel free to say so; I am an extreme novice when it comes to CUDA and the various stages of JPEG compression.) Certainly it will compress less and hopefully(?) run faster, but I have no idea how significant those factors may be.
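For concreteness, here is a minimal sketch of the per-block pipeline the question describes: level shift, 8x8 DCT, quantization, zigzag scan, run-length encoding. It is plain Python rather than CUDA, uses a naive DCT and a single flat quantization step `q` instead of a real JPEG quantization table, and all function names are made up for illustration. The point is only that each block depends on nothing outside itself, so each call to `compress_block` could in principle become one GPU work item.

```python
import math

N = 8  # JPEG-style block size


def dct_2d(block):
    """Naive orthonormal 2D DCT-II of an N x N block (O(N^4), for clarity only)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, q=16):
    """Assumption: one flat step size q, not a per-frequency JPEG table."""
    return [[int(round(c / q)) for c in row] for row in coeffs]


def zigzag(block):
    """Read the block along anti-diagonals so high frequencies come last."""
    order = sorted(
        ((x, y) for x in range(N) for y in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[x][y] for x, y in order]


def run_length_encode(seq):
    """Collapse runs of equal values into (value, run_length) pairs."""
    pairs = []
    for v in seq:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return [tuple(p) for p in pairs]


def compress_block(block, q=16):
    """Compress one 8x8 block of 0..255 pixels; independent of all other blocks."""
    shifted = [[p - 128 for p in row] for row in block]
    return run_length_encode(zigzag(quantize(dct_2d(shifted), q)))


flat = [[200] * N for _ in range(N)]
print(compress_block(flat))  # [(36, 1), (0, 63)]
```

A flat block collapses to one DC value plus a single run of 63 zeros, which is where the compression comes from. What this sketch leaves out relative to full JPEG is the entropy coding (Huffman) stage and the differential coding of DC values between neighbouring blocks.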

You can hardly get better compression on a GPU; there are just no algorithms complex enough to make use of that MUCH power.

When working with simple algos like JPEG, the computation is so simple that you'll spend most of the time transferring data over the PCI-E bus (which has significant latency, especially when the card does not support DMA transfers).

On the positive side, if the card has DMA, you can free up the CPU for more important stuff and get image compression "for free".

In the best case, you can get about a 10x improvement on a top-end GPU compared to a top-end CPU, provided that both the CPU and GPU code are well optimized.
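The transfer-versus-compute trade-off can be checked with back-of-envelope arithmetic before writing any CUDA. The sketch below is a rough model only: the function names are made up, and every number in the usage line is a placeholder to be replaced with your own measurements, not a benchmark result. It assumes the raw image goes up to the card, the compressed result comes back, and nothing overlaps.

```python
def transfer_seconds(num_bytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Time to move num_bytes over the bus: fixed latency plus size / bandwidth."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def gpu_total_seconds(num_bytes, kernel_s, bandwidth_bytes_per_s,
                      compression_ratio, latency_s=0.0):
    """Upload raw data, run the kernel, download the (smaller) compressed data.

    Assumption: the three steps run back to back with no overlap.
    """
    upload = transfer_seconds(num_bytes, bandwidth_bytes_per_s, latency_s)
    download = transfer_seconds(num_bytes / compression_ratio,
                                bandwidth_bytes_per_s, latency_s)
    return upload + kernel_s + download


def gpu_is_faster(cpu_s, **gpu_kwargs):
    """True if the modelled GPU path beats the measured CPU time."""
    return gpu_total_seconds(**gpu_kwargs) < cpu_s


# Placeholder inputs: measure these on your own hardware.
total = gpu_total_seconds(num_bytes=1_000_000, kernel_s=0.0005,
                          bandwidth_bytes_per_s=1e9, compression_ratio=10)
print(total)
```

If the upload term alone is already close to the CPU time for libjpeg, no amount of kernel optimization will help, which is the point being made about the PCI-E bus.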

BarsMonster 2010-09-28 15:33:41

How much latency do you end up with on PCI-E without DMA?

John Robertson 2010-09-28 17:42:35

BarsMonster 2010-09-28 20:43:26
