views:

168

answers:

4

I have one strange problem. I have following piece of code:

template<clss index, class policy>
inline int CBase<index,policy>::func(const A& test_in, int* srcPtr ,int* dstPtr)
{
    int width = test_in.width();
    int height = test_in.height();

    double d = 0.0; //here is the problem
    for(int y = 0; y < height; y++)
    {

        //Pointer initializations

        //multiplication involving y
        //ex: int z = someBigNumber*y + someOtherBigNumber;
        for(int x = 0; x < width; x++)
        {
            //multiplication involving x
        //ex: int z = someBigNumber*x + someOtherBigNumber;
            if(soemCondition)
            {
                // floating point calculations
            }
            *dstPtr++ = array[*srcPtr++];
        }
    }
}

The inner loop gets executed nearly 200,000 times and the entire function takes 100 ms for completion. ( profiled using AQTimer)

I found an unused variable double d = 0.0; outside the outer loop and removed the same. After this change, suddenly the method is taking 500ms for the same number of executions. ( 5 times slower).

This behavior is reproducible in different machines with different processor types. (Core2, dualcore processors).

I am using VC6 compiler with optimization level O2. Follwing are the other compiler options used :

 -MD -O2 -Z7 -GR -GX -G5 -X -GF -EHa

I suspected compiler optimizations and removed the compiler optimization /O2. After that function became normal and it is taking 100ms as old code.

Could anyone throw some light on this strange behavior?

Why compiler optimization should slow down performance when I remove unused variable ?

Note: The assembly code (before and after the change) looked same.

+4  A: 

If the assembly code looks the same before and after the change the error is somehow connected to how you time the function.

Andreas Brinck
+1  A: 

Declare width and height as const {unsigned} ints. {The unsigned should be used since heights and widths are never negative.}

const int width = test_in.width();
const int height = test_in.height();

This helps the compiler with optimizing. With the values as const, it can place them in the code or in registers, knowing that they won't change. Also, it relieves the compiler of having to guess whether the variables are changing or not.

I suggest printing out the assembly code of the versions with the unused double and without. This will give you an insight into the compiler's thought process.

Thomas Matthews
+3  A: 

VC6 is buggy as hell. It is known to generate incorrect code in several cases, and its optimizer isn't all that advanced either. The compiler is over a decade old, and hasn't even been supported for many years.

So really, the answer is "you're using a buggy compiler. Expect buggy behavior, especially when optimizations are enabled."

I don't suppose upgrading to a modern compiler (or simply testing the code on one) is an option?

Obviously, the generated assembly can not be the same, or there would be no performance difference.

The only question is where the difference lies. And with a buggy compiler, it may well be some completely unrelated part of the code that suddenly gets compiled differently and breaks. Most likely though, the assembly code generated for this function is not the same, and the differences are just so subtle you didn't notice them.

jalf
Going to take a wild guess and say the double was forcing later floating point routines to be stack aligned at 8 bytes, which can markedly improve performance when using doubles.
Yann Ramin
yeah, that's possible. Alignment issues would probably account for a slowdown of this magnitude
jalf
unfortunately its a huge legacy project. Lot of effort involved in porting to latest VC :(
aJ
@aJ: that sucks. But it might be worth *starting* on the transition (or at least, try to convince your bosses of it). Of course it'll take time, and in the meantime you'll probably have to maintain the VC6 version as well, but sticking with VC6 causes you constant and needless pain as well. Anyway, I feel your pain. ;)
jalf
A: 

You can get VC10 Express for free.

DeadMG