Hello!

I am now trying to understand how JPEG encoding works and everything seems fine except the color transformation part.

Before the DCT is applied in the JPEG algorithm, the image is transformed into the YCbCr color space. To me this essentially means that we just (comparing to initial RGB image) take a chunk of color information and dispose it while applying the RGB -> YCbCr transformation.

So our encoding steps look roughly like RGB -> YCbCr -> DCT -> Huffman, and decoding means inverting this process.

My question is: why does the image (for example, one created and exported to JPEG) keep the same colors, even though decoding has to apply the inverse YCbCr -> RGB transform? Where does the disposed part of color information come from or how is it handled?

+1  A: 

RGB to YCbCr is a deterministic, invertible mathematical transformation. Therefore there is no "disposed" part.

Put another way: an RGB pixel has the same information content as a YCbCr pixel, in the same way that "A" and "01000001" are alternate representations of the same information, just with a different coding scheme.
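To make the invertibility concrete, here is a minimal sketch in Python that works on real numbers, so no rounding is involved. The luma weights 0.299 and 0.114 are the usual JFIF/BT.601 values; the function names are made up for this illustration and the +128 offset used in actual files is left out.

```python
# Luma weights (JFIF / BT.601). KG follows from KR + KG + KB = 1.
KR, KB = 0.299, 0.114
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """Forward transform on real numbers (no rounding, no +128 offset)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))  # blue-difference chroma
    cr = (r - y) / (2.0 * (1.0 - KR))  # red-difference chroma
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Algebraic inverse of rgb_to_ycbcr."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG  # solve the luma equation for g
    return r, g, b


original = (200.0, 120.0, 30.0)
restored = ycbcr_to_rgb(*rgb_to_ycbcr(*original))
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(restored, max_error)  # the error is at floating-point noise level
```

As long as the values are kept at full precision, the round trip gives back the original color up to floating-point noise.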

A clarification: it is very common for chroma downsampling to be done between the YCbCr conversion and the DCT, in which case information will be lost. Depending on the algorithm used (quality setting), however, the downsampling step may be "none".

msw
Thank you very much.
HardCoder1986
There is also the quantization stage between the DCT and the Huffman coding, which can be made essentially lossless by using a DQT segment of all 1's.
matja
+2  A: 

To me this essentially means that we just (comparing to initial RGB image) take a chunk of color information and dispose it while applying the RGB -> YCbCr transformation.

No information gets disposed by the transformation itself. The transformation is reversible in a mathematical sense: if you convert a color to YCbCr and transform the result back to RGB, you get the same color back. In a perfect world, at least.

In practice there is a loss of information. Assume that you start with three bytes of RGB. If you convert to YCbCr, you get three values, of which two, namely Cb and Cr, don't fit exactly into 8 bits anymore, so they have to be rounded or clipped. Technically speaking, the RGB and YCbCr representations have different gamuts (http://en.wikipedia.org/wiki/Gamut).

This information loss is fortunately rarely visible. Important side note: this gamut thing is an unwanted side effect and has nothing to do with the choice of using YCbCr in the first place.
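The 8-bit loss described above can be reproduced with a small sketch. It assumes the full-range JFIF convention (Cb and Cr offset by 128) and rounds and clamps every value to 0..255; the helper names are hypothetical.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycbcr8(r, g, b):
    """8-bit RGB -> 8-bit YCbCr (full range, chroma offset by 128)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 + (b - y) / 1.772
    cr = 128.0 + (r - y) / 1.402
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr8_to_rgb(y, cb, cr):
    """8-bit YCbCr -> 8-bit RGB, again rounding and clamping the result."""
    r = y + 1.402 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return clamp8(r), clamp8(g), clamp8(b)


def roundtrip8(rgb):
    return ycbcr8_to_rgb(*rgb_to_ycbcr8(*rgb))


print(roundtrip8((128, 128, 128)))  # mid gray survives: (128, 128, 128)
print(roundtrip8((0, 0, 255)))      # pure blue comes back as (0, 0, 254)
```

For pure blue the exact Cb value is 255.5, which has no 8-bit representation, so the color comes back one step darker in the blue channel. With this particular rounding scheme the error stays within one step per channel, which is why it is rarely visible.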

The point of using YCbCr is that the data stored in Y is the most important: it is the brightness, or gray-scale value. The data in Cb and Cr is the color information with the brightness subtracted, so to speak.

Now, our eyes aren't that good at picking up subtle differences in color, but they are sensitive to shades of intensity. To make use of this, JPEG stores only a low-resolution image of Cb and Cr, while Y is stored at full resolution. There are different ways to do this; the most common one is to leave out every other pixel from Cb and Cr in both x and y. That reduces the space requirements for Cb and Cr by a factor of four.
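As a rough illustration of "leave out every other pixel in x and y", here is a sketch that treats a chroma plane as a list of rows. The function name and the sample values are invented for the example, and real encoders often average each 2x2 block instead of simply dropping samples.

```python
def subsample_420(plane):
    """Keep every other sample in x and y (plain decimation, no filtering)."""
    return [row[::2] for row in plane[::2]]


# A made-up 4x4 Cb plane.
cb = [
    [100, 101, 110, 111],
    [102, 103, 112, 113],
    [120, 121, 130, 131],
    [122, 123, 132, 133],
]

cb_small = subsample_420(cb)
full_count = sum(len(row) for row in cb)
small_count = sum(len(row) for row in cb_small)
print(cb_small)                   # [[100, 110], [120, 130]]
print(full_count // small_count)  # 4
```

Only one of every four chroma samples is kept, which is where the factor of four comes from.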

Where does the disposed part of color information come from or how is it handled?

It does not magically come back. The information is lost forever. However, since the information wasn't that important to begin with, we don't see many artifacts.

In JPEG, the left-out pixels of the Cb and Cr planes are approximated by upscaling the planes again. Some decoders just replicate the missing pixels by picking a neighbour, others do linear interpolation.
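The two upscaling strategies can be sketched as follows. Both helpers are hypothetical and simplified (the linear version handles a single row only); actual decoders differ in sample positioning and filter weights.

```python
def upsample_replicate(plane):
    """Double width and height by repeating each sample (nearest neighbour)."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # repeat the widened row once more
    return out


def upsample_row_linear(row):
    """Double a row's width; each inserted sample is the mean of its neighbours."""
    out = []
    for i, v in enumerate(row):
        nxt = row[i + 1] if i + 1 < len(row) else v  # repeat the last sample at the edge
        out.extend([v, (v + nxt) / 2])
    return out


print(upsample_replicate([[100, 110], [120, 130]]))
print(upsample_row_linear([100, 110, 130]))  # [100, 105.0, 110, 120.0, 130, 130.0]
```

Replication is cheaper but gives blocky color edges, while interpolation smooths them. Neither recovers the samples that were dropped.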

Nils Pipenbrinck