



How can I convert a BYTE buffer (values from 0 to 255) to a float buffer (values from 0.0 to 1.0)? The two values should be proportional, e.g. 0 in the byte buffer becomes 0.f in the float buffer, 128 becomes roughly .5f, and 255 becomes 1.f.

This is the code I have so far:

for (int y=0;y<height;y++) {
 for (int x=0;x<width;x++) {
  float* floatpixel = floatbuffer + (y * width + x) * 4;
  BYTE* bytepixel = (bytebuffer + (y * width + x) * 4);
  floatpixel[0] = bytepixel[0]/255.f;
  floatpixel[1] = bytepixel[1]/255.f;
  floatpixel[2] = bytepixel[2]/255.f;
  floatpixel[3] = 1.0f; // A
 }
}

This runs very slowly. A friend suggested using a conversion table, but I wanted to know whether anyone can suggest another approach.


Essentially, no. Use the conversion table. It's a cheap, efficient solution.

Use a static lookup table for this. When I worked in a computer graphics company we ended up having a hard coded lookup table for this that we linked in with the project.

Whether you choose to use a lookup table or not, your code is doing a lot of work each loop iteration that it really does not need to - likely enough to overshadow the cost of the convert and multiply.

A few things to try:

  • Declare your pointers restrict, and declare pointers you only read from const. (restrict is a C99 keyword; in C++ it is only available as a compiler extension such as __restrict.)
  • Multiply by 1/255 instead of dividing by 255.
  • Don't calculate the pointers in each iteration of the inner loop; calculate the initial values once and increment them.
  • Unroll the inner loop a few times.
  • Use vector SIMD operations if your target supports them.
  • Instead of incrementing and comparing with the maximum, decrement and compare with zero.

Something like

float* restrict floatpixel = floatbuffer;
BYTE const* restrict bytepixel = bytebuffer;
for( int size = width*height; size > 0; --size )
{
    floatpixel[0] = bytepixel[0]*(1.f/255.f);
    floatpixel[1] = bytepixel[1]*(1.f/255.f);
    floatpixel[2] = bytepixel[2]*(1.f/255.f);
    floatpixel[3] = 1.0f; // A
    floatpixel += 4;
    bytepixel += 4;
}

would be a start.
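
One caveat on the multiply-by-reciprocal tip: 1.f/255.f is not exactly representable as a float, so b*(1.f/255.f) may differ from b/255.f in the last bit for some inputs. If bit-exact agreement with the division matters to you, it is cheap to check all 256 inputs. The sketch below assumes BYTE is unsigned char; by_division, by_reciprocal and max_difference are made-up names for illustration.

```cpp
#include <cmath>

typedef unsigned char BYTE;  // assumption: BYTE is an 8-bit unsigned type

// Hypothetical helper: convert by dividing, as in the question.
inline float by_division(BYTE b) { return b / 255.f; }

// Hypothetical helper: convert by multiplying with the constant 1/255.
inline float by_reciprocal(BYTE b) { return b * (1.f / 255.f); }

// Largest absolute difference between the two methods over all 256 inputs.
inline float max_difference() {
    float worst = 0.f;
    for (int i = 0; i < 256; ++i) {
        const BYTE b = static_cast<BYTE>(i);
        const float d = std::fabs(by_division(b) - by_reciprocal(b));
        if (d > worst) worst = d;
    }
    return worst;
}
```

For image data a difference of that size is normally irrelevant, but it is worth knowing about if you compare outputs of the two variants byte for byte.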

Yes, a lookup table is definitely faster than doing a lot of divisions in a loop. Just generate a table of 256 precomputed float values and use the byte value to index that table.
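
As a rough sketch of such a table, assuming BYTE is unsigned char and the buffers hold 4 channels per pixel as in the question, the table can be filled once at startup instead of being hard coded. ByteToFloatTable and convert_rgba are illustrative names, not anything from the original code.

```cpp
#include <cstddef>

typedef unsigned char BYTE;  // assumption: BYTE is an 8-bit unsigned type

// Hypothetical table type: 256 floats, filled once on construction.
struct ByteToFloatTable {
    float value[256];
    ByteToFloatTable() {
        for (int i = 0; i < 256; ++i)
            value[i] = i / 255.f;  // same arithmetic as the original loop
    }
};

// Convert pixel_count RGBA pixels; alpha is forced to 1.0f as in the question.
inline void convert_rgba(const BYTE* src, float* dst, std::size_t pixel_count) {
    static const ByteToFloatTable table;  // built the first time this runs
    for (std::size_t i = 0; i < pixel_count; ++i) {
        dst[0] = table.value[src[0]];
        dst[1] = table.value[src[1]];
        dst[2] = table.value[src[2]];
        dst[3] = 1.0f;  // A
        src += 4;
        dst += 4;
    }
}
```

The table is 1 KB (256 floats), which is small enough to stay in the L1 cache on typical desktop processors.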

You can also optimize the loop a little by removing the index computation and just do something like

float *floatpixel = floatbuffer;
BYTE *bytepixel = bytebuffer;

for (...) {
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = float_table[*bytepixel++];
  *floatpixel++ = 1.0f;
}

You need to find out what the bottleneck is:

  • If you iterate over your data in the 'wrong' direction (for example column by column over a row-major image), you hit a cache miss on almost every access. No lookup table will get you around that.
  • If your processor is slower at scaling than at looking up, you can boost performance with a lookup table, provided the table fits in its cache.
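
The only reliable way to tell which case applies is to measure on your own machine. A minimal timing sketch, assuming a C++11 compiler and that BYTE is unsigned char, could look like the following. All names here (convert_scale, convert_lookup, fill_table, time_seconds, compare) are hypothetical, and an aggressive optimizer may remove work whose result is never read, so treat the numbers as a rough guide only.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

typedef unsigned char BYTE;  // assumption: BYTE is an 8-bit unsigned type

// Variant 1: scale each channel arithmetically.
inline void convert_scale(const BYTE* src, float* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] / 255.f;
}

// Variant 2: look each channel up in a caller-supplied 256-entry table.
inline void convert_lookup(const BYTE* src, float* dst, std::size_t n,
                           const float* table) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = table[src[i]];
}

// Fill a 256-entry table with i / 255.
inline void fill_table(float* table) {
    for (int i = 0; i < 256; ++i) table[i] = i / 255.f;
}

// Wall-clock seconds taken by `repeats` calls of `run`.
template <typename F>
double time_seconds(F run, int repeats) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) run();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Timings { double scale; double lookup; };

// Time both variants on n channel values, repeated `repeats` times each.
inline Timings compare(std::size_t n, int repeats) {
    std::vector<BYTE> src(n);
    for (std::size_t i = 0; i < n; ++i)
        src[i] = static_cast<BYTE>(i * 31u);  // arbitrary fill pattern
    std::vector<float> a(n), b(n);
    float table[256];
    fill_table(table);
    Timings t;
    t.scale = time_seconds([&] { convert_scale(src.data(), a.data(), n); }, repeats);
    t.lookup = time_seconds([&] { convert_lookup(src.data(), b.data(), n, table); }, repeats);
    return t;
}
```

Calling compare(width * height * 4, 100) with your real image size and comparing the two fields gives a first impression; build with the same optimization flags as your real program.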

Another tip:

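struct Scale {
    // Maps a byte in [0, 255] to a float in [0.0, 1.0].
    float operator()( const BYTE b ) const { return b * (1.f/255.f); }
};
// Needs <algorithm>. Converts every channel, alpha included.
std::transform( bytepixel, bytepixel + itssize, floatpixel, Scale() );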

Don't calculate 1/255 every time. I don't know whether a compiler will be smart enough to optimize this away. Calculate it once and reuse it. Even better, define it as a constant.

Compilers perform constant folding, so this is not an issue.

A look-up table is the fastest way to convert :) Here you go:

Python code to generate the byte_to_float.h file to include:

#!/usr/bin/env python3

def main():
    print("static const float byte_to_float[] = {")

    # Entries 0..254; the last entry (255 -> 1.0f) is written with the closing brace.
    for ii in range(0, 255):
        print("%sf," % (ii / 255.0))

    print("1.0f };")
    return 0

if __name__ == "__main__":
    main()

And C++ code to get the conversion:

floatpixel[0] = byte_to_float[ bytepixel[0] ];

Simple, isn't it?
