views:

98

answers:

4

I am interfacing an embedded device with a camera module that returns a single jpeg compressed frame each time I trigger it.

I would like to take three successive shots (approx 1 frame per 1/4 second) and further compress the images into a single file. The assumption here is that there is a lot of temporal redundancy, therefore lots of room for more compression across the three frames (compared to sending three separate jpeg images).

I will be implementing the solution on an embedded device in C without any libraries and no OS.

The camera will be taking pics in an area with very little movement (no visitors or screens in the background, maybe a tree with swaying branches), so I think my assumption about redundancy is pretty solid.

When the file is finally viewed on a PC/Mac, I don't mind having to write something to extract the three frames (so it can be a nonstandard kludge).

So I guess the actual question is: what is the best way to compress these three images together, given that they are already in JPEG format? (It is a possibility to convert back to a raw image, but only if I don't have to...)

A: 

While I've not studied signals since uni, I think you're looking for a lossless video codec.

Huffyuv is one that's been around, and has source code available. The basic concept is to predict the pixel changes between each frame, and encode (and compress) the difference between predicted and actual changes.

Lagarith is another open source codec.

You'll need to feed the decoded JPEG frames into each of these codecs.

Jeff Meatball Yang
If it's jpeg, it's already lossy, isn't it?
Bill K
Hmm, both Huffyuv and Lagarith are lossless (far too big), and both require me to decode the JPEG files.
michael
A: 

If I were you, I'd use your system to take three pictures by hand right now so you can check your assumptions before going much further.

My guess is that you will need a slight translation even if you don't intend any movement. Vibration of equipment, wind, and even thermal expansion might be enough to throw you off by a pixel or two, which would ruin a straight pixel-to-pixel compression.

Other factors could be light changes due to a cloud passing across the sun, heat shimmer wafting off the ground, or even JPEG compression artifacts.

I'm not saying it's not going to work, just that I'd run one by hand first.

Storage is so cheap, you are going to get much more bang for the buck by adding a larger sim card (or whatever) to your camera.

Bill K
I'd assumed this is something like a fixed security camera, and images must be compressed for GSM uplink or some expensive transfer mechanism... Maybe I'm wrong?
Roddy
I agree that there will be some variation in the pictures no matter what. Therefore the JPEG files will be of slightly different sizes, which means doing a straight delta of the files is off the table (unless I implement some delta calculation function that can take arrays of variable sizes). I was hoping that someone had a suggestion for taking advantage of the redundancy across the pictures, rather than a solution that required reverting the images back to raw bitmaps and then recompressing with something else.
michael
Yeah, the link is expensive (mostly in power). But the power cost of running the embedded system is relatively small compared to the transmit power used by the radio.
michael
+1  A: 

Any decode/modify/recode on the JPEG images may lower the image quality, but as your camera can only capture JPEGs, I'm guessing ultimate image quality is unlikely to be a key requirement...

I can't think of an easy way you can do this in the JPEG frequency domain, but you can decompress the frames and then SUBTRACT image 1 from images 2 and 3 to get delta images. These should compress a lot better, and would be added back to image #1 by the receiver.

It turns out there are some operations you can do in the compressed domain that might help. You'd need to uncompress the Huffman/RLE stages of the jpeg, and then work on the DCT coefficients directly. You could well be able to do image subtraction this way, and it should not introduce further artefacts.
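A minimal sketch of the subtract-and-recombine idea above, assuming the three JPEGs have been decoded to equal-size 8-bit grayscale buffers (function names are illustrative, not from any real library). The 128 bias keeps the usually-small differences in byte range; the round trip is exact whenever the true difference fits in -128..127, and a mostly-static scene yields long runs of 128 that compress very well:

```c
#include <stdint.h>
#include <stddef.h>

/* Encode: delta = frame - key, biased by 128 so it fits in a byte.
   Values are clamped, so extreme differences are lossy. */
void make_delta(const uint8_t *key, const uint8_t *frame,
                uint8_t *delta, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = (int)frame[i] - (int)key[i] + 128;
        if (d < 0)   d = 0;
        if (d > 255) d = 255;
        delta[i] = (uint8_t)d;
    }
}

/* Decode on the receiver: frame = key + delta - 128. */
void apply_delta(const uint8_t *key, const uint8_t *delta,
                 uint8_t *frame, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int v = (int)key[i] + (int)delta[i] - 128;
        if (v < 0)   v = 0;
        if (v > 255) v = 255;
        frame[i] = (uint8_t)v;
    }
}
```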

Roddy
Probably best to take the largest of the three images as the key frame and delta the other two off it...
michael
@michael, yes, understood. See the extra link I added to the answer as well...
Roddy
Maybe the space advantage of storing related images as deltas will be offset by increased visibility of noise artifacts in the recombined images. This would be due to how JPEG compression algorithms usually judge noise with perceptual measures (as opposed to simple numeric deviation), so the noise introduced into the deltas will defy the perceptual models when recombined. Patterns which can be almost invisible in a grey delta can jump out when added to colours in the base image. It's also necessary to halve the deltas' contrast to cope with possible overflows, increasing the artifacts' power.
strainer
+1  A: 

I'm adding this as a second answer because it's VERY different from my first now that I better understand your problem.

I find it HIGHLY unlikely that you will be able to work with the JPEG files directly. In compressed files, a small change tends to propagate across a large portion of the file, causing the two files to differ in many places.

I have two suggestions.

1: Zip the images up. Seems too simple, and you probably already thought of it, but the zip format is well known and freely available and will automatically take advantage of any similarities it can. Again, just get a camera, take three pictures, zip them up, and see how it goes.

2: A little more complex, but you could decompress the three JPEGs into bitmaps, concatenate the bitmaps (line them up one after the other), then re-compress to a single JPEG. The JPEG encoder should take advantage of the similarities in the three images, and the work is pretty minimal from your point of view.
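Suggestion 2 can be sketched as below, assuming the three JPEGs decode to equal-size 8-bit grayscale buffers of `n` pixels each (the function name is illustrative). The combined buffer would then be handed to a JPEG encoder as a single W × 3H image:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stack three decoded frames back-to-back into one buffer of 3*n
   bytes, forming one tall image for a single re-encode. */
void concat_frames(const uint8_t *f1, const uint8_t *f2,
                   const uint8_t *f3, uint8_t *out, size_t n)
{
    memcpy(out,         f1, n);
    memcpy(out + n,     f2, n);
    memcpy(out + 2 * n, f3, n);
}
```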

Bill K
Exactly what I was thinking on all three counts (propagation of small changes through a compressed file, and the two solutions). I am going to try suggestion one (do zip programs take advantage of cross-file similarities when zipping multiple files?). And if it's good enough (dunno what that means yet), that's what I will do; otherwise I will try decoding/recoding (if I can squeeze the operation into 10K of RAM :P )
michael
You need a good understanding of JPEG compression, but it's possible to work on partially decompressed images. JPEG does a lossy DCT encode on each 8x8 pixel block, then a lossless run-length encode followed by a Huffman encode of the result. You can manipulate the lossy-encoded data to some extent, but you need to undo/redo the 'lossless' parts of the algorithm.
Roddy
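A sketch of the compressed-domain idea Roddy describes, assuming the entropy (Huffman/run-length) stages have already been undone and each frame is available as an array of 8x8 blocks of quantized DCT coefficients. The `int16_t` layout and function name are assumptions, not a real JPEG library API, and this only works if both frames were encoded with the same quantization tables:

```c
#include <stdint.h>
#include <stddef.h>

/* Subtract the key frame's quantized DCT coefficients from a later
   frame's, block by block (64 coefficients per 8x8 block). For a
   near-static scene the result is mostly zeros, which entropy-codes
   very compactly, and no extra lossy step is introduced. */
void dct_block_delta(const int16_t *key_coeffs,
                     const int16_t *frame_coeffs,
                     int16_t *delta, size_t nblocks)
{
    for (size_t i = 0; i < nblocks * 64; i++)
        delta[i] = frame_coeffs[i] - key_coeffs[i];
}
```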