I'm curious whether there are approaches or algorithms one might use to downscale an image based on the amount of detail or entropy it contains, such that the new size is a resolution at which most of the detail of the original image is preserved.

For example, an out-of-focus or shaky photo contains less detail, or high-frequency content, than one taken in focus and from a fixed position relative to the scene. The lower-entropy image could be reduced in size significantly and still retain most of its detail when scaled back up to the original size. The more detailed image, however, could not be reduced as much without losing significant detail.

I certainly understand that many lossy image formats, JPEG included, do something similar, in the sense that the amount of data needed to store an image of a given resolution is proportional to the entropy of the image data. But I'm curious, mostly for my own interest, whether there might be a computationally efficient approach for scaling resolution to image content.

+2  A: 

It's possible, and one could argue that most lossy image compression schemes, from JPEG-style DCT methods to fractal compression, are essentially doing this in their own particular ways.

Note that such methods almost always operate on small image chunks rather than on the picture as a whole, so as to maximise compression in lower-detail regions rather than being constrained to apply the same settings everywhere. The latter would likely make for poor compression and/or high loss on most "real" images, which commonly contain a mixture of detail levels, though there are exceptions like your out-of-focus example.
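
To make the chunk idea concrete, here's a rough sketch (my own illustration, not lifted from any particular codec) of a per-block detail measure: grey-level histogram entropy over 16x16 tiles. Both the tile size and the choice of entropy as the measure are arbitrary assumptions on my part.

    import numpy as np

    def blockwise_entropy(img, block=16):
        """Map of grey-level histogram entropy for each block x block tile.

        img is assumed to be a 2D uint8 greyscale array.
        """
        rows, cols = img.shape[0] // block, img.shape[1] // block
        out = np.zeros((rows, cols))
        for r in range(rows):
            for c in range(cols):
                tile = img[r*block:(r+1)*block, c*block:(c+1)*block]
                counts = np.bincount(tile.ravel(), minlength=256)
                p = counts[counts > 0] / tile.size  # drop empty bins
                out[r, c] = -(p * np.log2(p)).sum()
        return out

High-entropy tiles would tolerate less downscaling than low-entropy ones, which is roughly what the block-based schemes exploit.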

You would need to define what constitutes "most of the detail of the original image", since perfect recovery would only be possible for fairly contrived images. You would also need to specify the exact form of rescaling used in each direction, since that has a rather dramatic impact on the quality of the recovery. E.g., simple pixel repetition would better preserve hard edges but ruin smooth gradients, while linear interpolation should reproduce gradients better but could wreak havoc with edges.
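
For instance, one simplistic way to compare those rescaling choices is to round-trip an image through a downscale/upscale pair and score the reconstruction error; this sketch uses scipy.ndimage.zoom and RMS error, both arbitrary choices on my part:

    import numpy as np
    from scipy import ndimage

    def roundtrip_rms(img, factor=4, order=1):
        """Downscale by `factor`, upscale back, return RMS error vs original.

        order=0 is pixel repetition, order=1 linear, order=3 cubic.
        """
        img = img.astype(float)
        small = ndimage.zoom(img, 1.0 / factor, order=order)
        # zoom back per-axis so the shapes match the original exactly
        back = ndimage.zoom(small, (img.shape[0] / small.shape[0],
                                    img.shape[1] / small.shape[1]), order=order)
        return np.sqrt(np.mean((img - back) ** 2))

Comparing roundtrip_rms(img, order=0) against order=1 on an edge-heavy image versus a smooth-gradient one should show exactly the trade-off described above.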

A simplistic, off-the-cuff approach might be to calculate a 2D power spectrum and choose a scaling (perhaps different vertically and horizontally) that preserves the frequencies that contain "most" of the content. Basically this would be equivalent to choosing a low-pass filter that keeps "most" of the detail. Whether such an approach counts as "computationally efficient" may be a moot point...
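
A minimal sketch of that spectral idea, assuming a 95% power threshold and a per-axis marginal-power heuristic (neither of which is canonical):

    import numpy as np

    def spectral_scale(img, keep=0.95):
        """Per-axis scale factors that retain `keep` of the spectral power."""
        power = np.abs(np.fft.fft2(img.astype(float))) ** 2
        power[0, 0] = 0.0                        # ignore the DC term
        scales = []
        for axis in (0, 1):
            freqs = np.abs(np.fft.fftfreq(img.shape[axis]))  # cycles/pixel
            marg = power.sum(axis=1 - axis)      # marginal power on this axis
            order = np.argsort(freqs)
            cum = np.cumsum(marg[order])
            # lowest |frequency| whose cumulative power reaches the target
            f_c = freqs[order][np.searchsorted(cum, keep * cum[-1])]
            scales.append(min(1.0, f_c / 0.5))   # 0.5 cycles/pixel = Nyquist
        return tuple(scales)

Downscaling by the returned factors and interpolating back up should recover most of the retained band, at least for images whose spectra fall off smoothly.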

walkytalky
Thanks for the analysis. There's complexity in both downscaling and upscaling as far as algorithms are concerned. I suppose bicubic might be a decent general choice for downscaling? For upscaling I was figuring one would likely use some sort of interpolation, perhaps bicubic as well?
James Snyder
As for the off-the-cuff approach, what you're suggesting is similar to what I was thinking about after I made the post: something like a down-sampling low-pass filter whose coefficients/resampling resolution would be chosen by looking at where the power drops off across the frequency range.
James Snyder
I expect the savings from this sort of thing might not improve on the data sizes one gets with lossy compression, but I think something like it could be used for automatic scaling, or to give the user a visual indication, while they scale an image, of when they will start losing detail.
James Snyder