Hi there.

I'm trying to run an integer-to-integer 5/3 lifting wavelet transform on an image of Lena. I've been following the paper at http://www.uwec.edu/walkerjs/media/research_signpost_article.pdf

I'm running into issues, though. The image just doesn't come out quite right. I appear to be overflowing slightly in the green and blue channels, which means that subsequent passes of the wavelet function find high frequencies where there ought not to be any. I'm also pretty sure I'm getting something else wrong, as I am seeing a line of the s0 image at the edges of the high-frequency parts.

My function is as follows:

bool PerformHorizontal( Col24* pPixelsIn, Col24* pPixelsOut, int width, int pixelPitch, int height )
{
    const int widthDiv2 = width / 2;
    int y   = 0;
    while( y < height )
    {
        int x = 0;
        while( x < width )
        {
            const int n     = (x)       + (y * pixelPitch);
            const int n2    = (x / 2)   + (y * pixelPitch);

            const int s     = n2;
            const int d     = n2 + widthDiv2;

            // Non-lifting 5 / 3
            /*pPixelsOut[n2 + widthDiv2].r  = pPixelsIn[n + 2].r - ((pPixelsIn[n + 1].r + pPixelsIn[n + 3].r) / 2) + 128;
            pPixelsOut[n2].r                = ((4 * pPixelsIn[n + 2].r) + (2 * pPixelsIn[n + 2].r) + (2 * (pPixelsIn[n + 1].r + pPixelsIn[n + 3].r)) - (pPixelsIn[n + 0].r + pPixelsIn[n + 4].r)) / 8;

            pPixelsOut[n2   + widthDiv2].g  = pPixelsIn[n + 2].g - ((pPixelsIn[n + 1].g + pPixelsIn[n + 3].g) / 2) + 128;
            pPixelsOut[n2].g                = ((4 * pPixelsIn[n + 2].g) + (2 * pPixelsIn[n + 2].g) + (2 * (pPixelsIn[n + 1].g + pPixelsIn[n + 3].g)) - (pPixelsIn[n + 0].g + pPixelsIn[n + 4].g)) / 8;

            pPixelsOut[n2   + widthDiv2].b  = pPixelsIn[n + 2].b - ((pPixelsIn[n + 1].b + pPixelsIn[n + 3].b) / 2) + 128;
            pPixelsOut[n2].b                = ((4 * pPixelsIn[n + 2].b) + (2 * pPixelsIn[n + 2].b) + (2 * (pPixelsIn[n + 1].b + pPixelsIn[n + 3].b)) - (pPixelsIn[n + 0].b + pPixelsIn[n + 4].b)) / 8;*/

            pPixelsOut[d].r = pPixelsIn[n + 1].r    - (((pPixelsIn[n].r         + pPixelsIn[n + 2].r)   >> 1) + 127);
            pPixelsOut[s].r = pPixelsIn[n].r        + (((pPixelsOut[d - 1].r    + pPixelsOut[d].r)      >> 2) - 64);

            pPixelsOut[d].g = pPixelsIn[n + 1].g    - (((pPixelsIn[n].g         + pPixelsIn[n + 2].g)   >> 1) + 127);
            pPixelsOut[s].g = pPixelsIn[n].g        + (((pPixelsOut[d - 1].g    + pPixelsOut[d].g)      >> 2) - 64);

            pPixelsOut[d].b = pPixelsIn[n + 1].b    - (((pPixelsIn[n].b         + pPixelsIn[n + 2].b)   >> 1) + 127);
            pPixelsOut[s].b = pPixelsIn[n].b        + (((pPixelsOut[d - 1].b    + pPixelsOut[d].b)      >> 2) - 64);

            x += 2;
        }
        y++;
    }
    return true;
}

There is definitely something wrong but I just can't figure it out. Can anyone with slightly more brain than me point out where I am going wrong? It's worth noting that you can see the un-lifted version of the Daub 5/3, commented out above the working code, and this, too, gives me the same artifacts ... I'm very confused as I have had this working once before (it was over 2 years ago and I no longer have that code).

Any help would be much appreciated :)

Edit: I appear to have eliminated my overflow issues by clamping the low pass pixels to the 0 to 255 range. I'm slightly concerned this isn't the right solution though. Can anyone comment on this?

A: 

I'm assuming the data have already been thresholded?

I also don't get why you're adding back in +127 and -64.

John at CashCommons
The data is a standard 24-bit PNG. I add the 128 because that's what the paper says. Mind you, it also says I should be adding 128, not subtracting 64 ... but I get better results subtracting 64, and that's something I remember from the last time I implemented this.
Goz
+1  A: 

You can do some tests with extreme values to see the possibility of overflow. Example:

  pPixelsOut[d].r = pPixelsIn[n + 1].r - (((pPixelsIn[n].r  + pPixelsIn[n + 2].r) >> 1) + 127);

If:
  pPixelsIn[n  ].r == 255
  pPixelsIn[n+1].r == 0
  pPixelsIn[n+2].r == 255

Then:
  pPixelsOut[d].r == -382


But if:
  pPixelsIn[n  ].r == 0
  pPixelsIn[n+1].r == 255
  pPixelsIn[n+2].r == 0

Then:
  pPixelsOut[d].r == 128

You have a range of 511 possible values (-382 .. 128), so, in order to avoid overflow or clamping, you would need one extra bit, some quantization, or another encoding type!
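The same check can also be done by brute force instead of by hand. A minimal sketch, assuming 8-bit channel values; the function name `highPassRange` and its `offset` parameter are made up for illustration and are not part of the code in the question. It scans every input triple for the high-pass expression above and returns the smallest and largest result:

```cpp
#include <algorithm>
#include <climits>
#include <utility>

// Returns {min, max} of  b - (((a + c) >> 1) + offset)
// over all 8-bit values of a (left), b (centre) and c (right).
std::pair<int, int> highPassRange(int offset)
{
    int lo = INT_MAX;
    int hi = INT_MIN;
    for (int a = 0; a < 256; ++a)
    {
        for (int b = 0; b < 256; ++b)
        {
            for (int c = 0; c < 256; ++c)
            {
                const int v = b - (((a + c) >> 1) + offset);
                lo = std::min(lo, v);
                hi = std::max(hi, v);
            }
        }
    }
    return {lo, hi};
}
```

With `offset` set to 127 this gives -382 .. 128, matching the hand calculation. With `offset` set to 0 it gives -255 .. 255, which is still 511 values, so changing the offset moves the range but does not shrink it.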

e.tadeu
I came to the same conclusion myself about 30 minutes before you posted :). I'm going to take a look at doing it as floats and then see if I can convert it back to an integer-to-integer transform. It's definitely not going to be lossless. That said, though, the low pass would, logically, only produce even numbers, which would mean I only need 7 bits to encode s1, which means I ought to be able to hide the sign bit in the bottom bit ... Still, I think I'll accept lossy, because once I've got this working the final compression will definitely be lossy :)
Goz
A: 

OK, I can losslessly run the forward and then the inverse transform as long as I store my post-forward-transform data in a short. Obviously this takes up a little more space than I was hoping for, but it does give me a good starting point for going into the various compression algorithms. You can also, nicely, transform two 4-component pixels at a time using SSE2 instructions. This is the standard C forward transform I came up with:

        const int16_t dr    = (int16_t)pPixelsIn[n + 1].r   - ((((int16_t)pPixelsIn[n].r        + (int16_t)pPixelsIn[n + 2].r)  >> 1));
        const int16_t sr    = (int16_t)pPixelsIn[n].r       + ((((int16_t)pPixelsOut[d - 1].r   + dr)                           >> 2));

        const int16_t dg    = (int16_t)pPixelsIn[n + 1].g   - ((((int16_t)pPixelsIn[n].g        + (int16_t)pPixelsIn[n + 2].g)  >> 1));
        const int16_t sg    = (int16_t)pPixelsIn[n].g       + ((((int16_t)pPixelsOut[d - 1].g   + dg)                           >> 2));

        const int16_t db    = (int16_t)pPixelsIn[n + 1].b   - ((((int16_t)pPixelsIn[n].b        + (int16_t)pPixelsIn[n + 2].b)  >> 1));
        const int16_t sb    = (int16_t)pPixelsIn[n].b       + ((((int16_t)pPixelsOut[d - 1].b   + db)                           >> 2));

        pPixelsOut[d].r = dr;
        pPixelsOut[s].r = sr;

        pPixelsOut[d].g = dg;
        pPixelsOut[s].g = sg;

        pPixelsOut[d].b = db;
        pPixelsOut[s].b = sb;

It is trivial to create the inverse of this (a VERY simple bit of algebra). It's worth noting, by the way, that you need to invert the image from right to left, bottom to top. Next I'll see if I can shunt this data into uint8_t values and lose a bit or two of accuracy. For compression this really isn't a problem.
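For reference, a minimal sketch of a forward/inverse pair for a single row. It assumes an even row length and mirrored samples at both row ends; that boundary handling and the names `forward53` and `inverse53` are assumptions for illustration, not taken from the code above. Each lifting step is undone by repeating the same integer expression with the opposite sign, in the reverse order:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Forward 5/3 lifting on one row of even length.
// Output layout: [s0 .. s(h-1) | d0 .. d(h-1)], stored as int16_t.
// Note: >> on negative ints is arithmetic on common compilers
// (and guaranteed to be from C++20 onwards).
std::vector<int16_t> forward53(const std::vector<uint8_t>& x)
{
    const std::size_t n = x.size();
    const std::size_t half = n / 2;
    std::vector<int16_t> out(n);

    // Predict: d[i] = odd sample - floor(average of its even neighbours).
    for (std::size_t i = 0; i < half; ++i)
    {
        const int left  = x[2 * i];
        const int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[2 * i]; // mirror at right edge (assumption)
        out[half + i] = static_cast<int16_t>(x[2 * i + 1] - ((left + right) >> 1));
    }
    // Update: s[i] = even sample + floor((d[i-1] + d[i]) / 4).
    for (std::size_t i = 0; i < half; ++i)
    {
        const int dPrev = out[half + (i > 0 ? i - 1 : 0)]; // mirror at left edge (assumption)
        const int dCur  = out[half + i];
        out[i] = static_cast<int16_t>(x[2 * i] + ((dPrev + dCur) >> 2));
    }
    return out;
}

// Inverse: undo the update step first, then the predict step.
std::vector<uint8_t> inverse53(const std::vector<int16_t>& c)
{
    const std::size_t n = c.size();
    const std::size_t half = n / 2;
    std::vector<int> x(n);

    // Undo update: recover the even samples from s and d.
    for (std::size_t i = 0; i < half; ++i)
    {
        const int dPrev = c[half + (i > 0 ? i - 1 : 0)];
        const int dCur  = c[half + i];
        x[2 * i] = c[i] - ((dPrev + dCur) >> 2);
    }
    // Undo predict: recover the odd samples from d and the even samples.
    for (std::size_t i = 0; i < half; ++i)
    {
        const int left  = x[2 * i];
        const int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[2 * i];
        x[2 * i + 1] = c[half + i] + ((left + right) >> 1);
    }
    return std::vector<uint8_t>(x.begin(), x.end());
}
```

Because predict and update are done as two separate passes here, the traversal order within each pass does not matter. The round trip is exact only because both directions evaluate identical integer expressions, so any rounding in the shifts cancels out.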

Goz