How can I convert a BYTE buffer (values from 0 to 255) to a float buffer (values from 0.0 to 1.0)? The two values should map linearly, e.g. 0 in the byte buffer becomes 0.0f in the float buffer, 128 becomes roughly 0.5f, and 255 becomes 1.0f.

This is the code I have at the moment:

for (int y=0;y<height;y++) {
 for (int x=0;x<width;x++) {
  float* floatpixel = floatbuffer + (y * width + x) * 4;
  BYTE* bytepixel = (bytebuffer + (y * width + x) * 4);
  floatpixel[0] = bytepixel[0]/255.f;
  floatpixel[1] = bytepixel[1]/255.f;
  floatpixel[2] = bytepixel[2]/255.f;
  floatpixel[3] = 1.0f; // A
 }
}

This runs very slowly. A friend suggested that I use a conversion table, but I wanted to know whether someone could suggest another approach.

Veehmot.

+1  A: 

but I wanted to know if someone else can give me another approach.

Essentially, no. Use the conversion table. It's a cheap, efficient solution.

Konrad Rudolph
+2  A: 

Use a static lookup table for this. When I worked at a computer graphics company, we ended up with a hard-coded lookup table for this that we linked into the project.

Mats Fredriksson
+5  A: 

Whether you choose to use a lookup table or not, your code is doing a lot of work in each loop iteration that it really doesn't need to do, likely enough to overshadow the cost of the convert and multiply.

Declare your pointers restrict (a C99 keyword; most C++ compilers offer it as the __restrict extension), and declare pointers you only read from const. Multiply by 1/255 instead of dividing by 255. Don't recalculate the pointers in each iteration of the inner loop; calculate initial values once and increment them. Unroll the inner loop a few times. Use vector SIMD operations if your target supports them. Don't increment and compare with a maximum; decrement and compare with zero instead.

Something like

float* restrict floatpixel = floatbuffer;
BYTE const* restrict bytepixel = bytebuffer;
for( int size = width*height; size > 0; --size )
{
    floatpixel[0] = bytepixel[0]*(1.f/255.f);
    floatpixel[1] = bytepixel[1]*(1.f/255.f);
    floatpixel[2] = bytepixel[2]*(1.f/255.f);
    floatpixel[3] = 1.0f; // A
    floatpixel += 4;
    bytepixel += 4;
}

would be a start.

moonshadow
Some very good suggestions. But they won't beat a lookup table. ;-)
Konrad Rudolph
Depends on the architecture. Multiply and convert might be cheaper than load, especially if he can use his architecture's SIMD capabilities (MMX, SSE, Altivec or whatever) to do it on the whole pixel in a single instruction. But that decision can be taken independently of all of the above suggestions.
moonshadow
This does more to make the compiler's job easier than to actually improve speed. The exceptions are aligning the pointers and enabling SIMD, which can give a real boost.
ima
I'm accepting this because it's the only answer that didn't mention lookup tables, which I already knew about. I just wanted another approach, and this answer provides one.
Veehmot
+1  A: 

Yes, a lookup table is definitely faster than doing a lot of divisions in a loop. Just generate a table of 256 precomputed float values and use the byte value to index that table.

You can also optimize the loop a little by dropping the index computation and doing something like this:

float *floatpixel = floatbuffer;
BYTE *bytepixel = bytebuffer;

for (...) {
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = 1.0f;
}
laalto
+2  A: 

You need to find out what the bottleneck is:

  • if you iterate over your data tables in the 'wrong' direction, you will constantly hit cache misses. No lookup will ever help you get around that.
  • if your processor is slower at scaling than at looking up, you can boost performance by looking up, provided the lookup table fits in its cache.

Another tip:

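struct Scale {
    // Map one byte (0..255) to a float in 0.0..1.0.
    float operator()( const BYTE b ) const { return b * (1.f/255.f); }
};
// Note: this scales every channel, alpha included.
std::transform( bytebuffer, bytebuffer + itssize, floatbuffer, Scale() );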
xtofl
A: 

Don't calculate 1/255 every time. Don't know if a compiler will be smart enough to remove this. Calculate it once and reapply it every time. Even better, define it as a constant.

Rodyland
Compilers perform constant folding so this is not an issue.
Konrad Rudolph
A: 

A lookup table is the fastest way to convert :) Here you go:

Python code to generate the byte_to_float.h file to include:

#!/usr/bin/env python3

def main():
    # Emit a C array with 256 entries; entry i holds i / 255.
    print("static const float byte_to_float[] = {")

    for ii in range(0, 255):
        print("%sf," % (ii / 255.0))

    # The last entry (255) is written separately so there is no trailing comma.
    print("1.0f };")
    return 0

if __name__ == "__main__":
    main()

And C++ code to get the conversion:

floatpixel[0] = byte_to_float[ bytepixel[0] ];

Simple, isn't it?

Viet