I'm running into an inconsistent optimization behavior with different compilers for the following code:
class tester
{
public:
tester(int* arr_, int sz_)
: arr(arr_), sz(sz_)
{}
int doadd()
{
sm = 0;
for (int n = 0; n < 1000; ++n)
{
for (int i = 0; i < sz; ++i)
{
sm += arr[i];
}
}
return sm;
}
protected:
int* arr;
int sz;
int sm;
};
The doadd
function simulates some intensive access to members (ignore the overflows in addition for this question). Compared with similar code implemented as a function:
int arradd(int* arr, int sz)
{
int sm = 0;
for (int n = 0; n < 1000; ++n)
{
for (int i = 0; i < sz; ++i)
{
sm += arr[i];
}
}
return sm;
}
The doadd
method runs about 1.5 times slower than the arradd
function when compiled in Release mode with Visual C++ 2008. When I modify the doadd
method to be as follows (aliasing all members with locals):
int doadd()
{
int mysm = 0;
int* myarr = arr;
int mysz = sz;
for (int n = 0; n < 1000; ++n)
{
for (int i = 0; i < mysz; ++i)
{
mysm += myarr[i];
}
}
sm = mysm;
return sm;
}
Runtimes become roughly the same. Am I right in concluding that this is a missing optimization by the Visual C++ compiler? g++
seems to do it better and run both the member function and the normal function at the same speed when compiling with -O2
or -O3
.
The benchmarking is done by invoking the doadd
member and arradd
function on some sufficiently large array (a few millions of integers in size).
EDIT: Some fine-grained testing shows that the main culprit is the sm
member. Replacing all others by local versions still makes the runtime long, but once I replace sm
by mysm
the runtime becomes equal to the function version.
Resolution
Dissapointed with the answers (sorry guys), I shaked off my laziness and dove into the disassembly listings for this code. My answer below summarizes the findings. In short: it has nothing to do with aliasing, it has all to do with loop unrolling, and with some strange heuristics MSVC applies when deciding which loop to unroll.