I am profiling my simple 2D XNA game. I found that 4% of the entire running time is taken by a simple operation: adding two Colors together, one of them first multiplied by a float.

I need to call this method roughly 2000 times per frame (once for each tile on the map), which gives 120,000 calls per second at XNA's 60 fps. Even a minimal speed-up of a single call would have a huge impact, yet I simply do not know how to make this more efficient.

    private void DoColorCalcs(float factor, Color color)
    {
        int mul = (int)Math.Max(Math.Min(factor * 255.0, 255.0), 0.0);
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * mul / 255), 255),
            (byte)Math.Min(tile.Color.G + (color.G * mul / 255), 255),
            (byte)Math.Min(tile.Color.B + (color.B * mul / 255), 255));

    }

EDIT: As suggested by Michael Stum:

    private void DoColorCalcs(float factor, Color color)
    {
        factor= (float)Math.Max(factor, 0.0);
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * factor), 255),
            (byte)Math.Min(tile.Color.G + (color.G * factor), 255),
            (byte)Math.Min(tile.Color.B + (color.B * factor), 255));
    }

This lowered the time usage from 4% to 2.5%.

+1  A: 

The obvious improvement would be to include the division operation (/ 255) in the calculation of mul, to reduce the divisions from 3 to a single division:

    private void DoColorCalcs(float factor, Color color)
    {
        float mul = Math.Max(Math.Min(factor * 255.0f, 255.0f), 0.0f) / 255f;
        tile.Color = new Color(
            (byte)Math.Min(tile.Color.R + (color.R * mul), 255),
            (byte)Math.Min(tile.Color.G + (color.G * mul), 255),
            (byte)Math.Min(tile.Color.B + (color.B * mul), 255));
    }

That being said, since you're replacing tile.Color, it may actually be faster to replace it in place instead of overwriting it (though I'd profile this to see if it helps):

    private void DoColorCalcs(float factor, Color color)
    {
        float mul = Math.Max(Math.Min(factor * 255.0f, 255.0f), 0.0f) / 255f;
        // Color is a struct, so these assignments compile only if
        // tile.Color is a field rather than a property.
        tile.Color.R = (byte)Math.Min(tile.Color.R + (color.R * mul), 255);
        tile.Color.G = (byte)Math.Min(tile.Color.G + (color.G * mul), 255);
        tile.Color.B = (byte)Math.Min(tile.Color.B + (color.B * mul), 255);
    }

This prevents the recalculation of the alpha channel, and may reduce the number of instructions a bit.
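
Clamping `factor * 255` to 0..255 and then dividing by 255 is mathematically the same as clamping `factor` to 0..1 (up to floating-point rounding), so the multiplier can be clamped directly. A minimal sketch of that idea follows. The `Rgb` struct and the `ColorMath.AddScaled` helper are hypothetical names introduced here so that the snippet compiles without XNA; they are not part of XNA or of the original code.

```csharp
using System;

// Hypothetical stand-in for XNA's Color so the sketch compiles on its own.
public struct Rgb
{
    public byte R, G, B;
    public Rgb(byte r, byte g, byte b) { R = r; G = g; B = b; }
}

public static class ColorMath
{
    // Adds `color * factor` to `current`, saturating each channel at 255.
    public static Rgb AddScaled(Rgb current, Rgb color, float factor)
    {
        // Clamp once to [0, 1] instead of scaling to [0, 255] and dividing back.
        float mul = Math.Max(0f, Math.Min(factor, 1f));

        // `current` is a by-value copy of the struct, so its fields can be
        // assigned directly; the caller stores the returned value.
        current.R = (byte)Math.Min(current.R + color.R * mul, 255f);
        current.G = (byte)Math.Min(current.G + color.G * mul, 255f);
        current.B = (byte)Math.Min(current.B + color.B * mul, 255f);
        return current;
    }
}
```

With a real XNA `Color` the call site would look something like `tile.Color = ColorMath.AddScaled(tile.Color, color, factor);`, which works whether `tile.Color` is a field or a property. Whether it is actually faster would still need profiling.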

Reed Copsey
(int)Math.Max(Math.Min(factor * 255.0, 255.0), 0.0) / 255 will result in 0 in most cases. So this doesn't help.
Simon Ottenhaus
Agree with Simon Ottenhaus - you are dividing an integer in 0..255 by 255, which gives you 255 zeros and a single "one".
PiotrK
Ah, I forgot - I tried replacing the creation of a new Color object with setting its fields. It required a bit of a "bad approach", as I had to change tile.Color from a property to a field. Still, it had no speed impact (tested, with the result rounded to 0.1 fps).
PiotrK
Can't you change mul to a float then? Or would that be too inaccurate?
Michael Stum
yeah, I goofed - fixed
Reed Copsey
You can do this, but you have to make mul a float. Since you're doing float-based math in every other computation, there should actually be a slight speed boost due to this.
Reed Copsey
I'd be shocked if the compiler didn't optimize away the division already.
jalf
Keep in mind that float operations take more time than integer calculations. If you use floats, you should replace the division by a multiplication with 1/255 = 0.0039215686f. Most processors can do one single-precision multiplication per clock cycle, but take much longer for a division.
Simon Ottenhaus
@PiotrK: Yeah, it's difficult to do this with structs as properties, but it can make a slight difference in some cases. It may not be enough here to really impact it, though, since it's just saving a single assignment.
Reed Copsey
@jalf: It won't, in his case, because of the casting. The compiler will optimize away certain things, but when you have mixed type math operations, it doesn't typically get optimized.
Reed Copsey
@Simon: The JIT usually makes that optimization for you ;) Constant division and multiplication with floats tends to get optimized out by the compiler and/or JIT (depending on the type)
Reed Copsey
A: 

My first question is, why floating point? If you're just scaling colors to fade them, you don't need a lot of precision. Example:

    int factorTimes256;  // the factor scaled to an integer in 0..256
    tile.Color.R = (byte)Math.Min(255, tile.Color.R + (color.R * factorTimes256) / 256);
    // same for G and B

In other words, represent your factor as an integer from 0 to 256, and do all the calculations on integers. You don't need more precision than 8 bits because the result is only 8 bits.
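
As a rough, self-contained sketch of that integer-only approach: the class and method names below (`FixedPointColor`, `ToFactorTimes256`, `AddScaledChannel`) are hypothetical and only illustrate the arithmetic; they are not XNA APIs.

```csharp
using System;

public static class FixedPointColor
{
    // Converts a float factor to an integer in 0..256 (8 fractional bits).
    // Values outside [0, 1] are clamped.
    public static int ToFactorTimes256(float factor)
    {
        return (int)Math.Max(0f, Math.Min(factor * 256f, 256f));
    }

    // Returns channel + addend * factorTimes256 / 256, saturated at 255.
    // Assumes factorTimes256 is in 0..256.
    public static byte AddScaledChannel(byte channel, byte addend, int factorTimes256)
    {
        // The product is non-negative, so >> 8 is equivalent to / 256 here.
        int sum = channel + ((addend * factorTimes256) >> 8);
        return (byte)Math.Min(sum, 255);
    }
}
```

The conversion to `factorTimes256` happens once per call, and each channel then needs only an integer multiply, a shift, an add and a comparison. Whether this beats the float version on a given machine is something to measure rather than assume.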

My second question is, did you say you went from 4% to 2.5% in this code? That's tiny. People who use profilers that only do instrumentation or sample the program counter are often satisfied with such small improvements. I bet you have other things going on that take a lot more time, that you could attack. Here's an example of what I mean.

Mike Dunlavey