Is there a tool that can do alias analysis on a program and tell you where gcc / g++ are having to generate sub-optimal instruction sequences due to potential pointer aliasing?

+3  A: 

I don't know of anything that gives 100% coverage, but for vectorization (which aliasing often prevents) you can use the -ftree-vectorizer-verbose=n option, where n is an integer between 1 and 6. It prints some information about why a loop could not be vectorized.

For instance, with g++ 4.1, the code

//#define RSTR __restrict__
#define RSTR

void addvec(float* RSTR a, float* b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = a[i] + b[i];
}

results in

$ g++ -ftree-vectorizer-verbose=1 -ftree-vectorize -O3 -c aliastest.cpp

aliastest.cpp:6: note: vectorized 0 loops in function.

Now, switch to the other definition for RSTR and you get

$ g++ -ftree-vectorizer-verbose=1 -ftree-vectorize -O3 -c aliastest.cpp

aliastest.cpp:6: note: LOOP VECTORIZED.
aliastest.cpp:6: note: vectorized 1 loops in function.

Interestingly, if one switches to g++ 4.4, it can vectorize the first non-restrict case by versioning and a runtime check:

$ g++44 -ftree-vectorizer-verbose=1 -O3 -c aliastest.cpp

aliastest.cpp:6: note: created 1 versioning for alias checks.

aliastest.cpp:6: note: LOOP VECTORIZED.
aliastest.cpp:4: note: vectorized 1 loops in function.

And this is done for both of the RSTR definitions.
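
To illustrate what "versioning for alias checks" means, here is a rough hand-written sketch of the idea. It is not GCC's actual output: the name addvec_versioned and the exact overlap test are assumptions made for illustration. The function checks at run time whether the two arrays overlap, then takes either a loop the compiler may vectorize freely or the plain scalar loop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical hand-written equivalent of loop versioning (not GCC's real output).
void addvec_versioned(float* a, const float* b, int n)
{
  assert(n >= 0);
  const std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
  const std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
  const std::uintptr_t bytes = static_cast<std::uintptr_t>(n) * sizeof(float);

  // Runtime alias check: true when the two n-element ranges do not overlap.
  const bool disjoint = (pa + bytes <= pb) || (pb + bytes <= pa);

  if (disjoint) {
    // Version 1: no overlap, so restrict-qualified views are valid here
    // and the compiler is free to vectorize this loop.
    float* __restrict__ ra = a;
    const float* __restrict__ rb = b;
    for (int i = 0; i < n; i++)
      ra[i] = ra[i] + rb[i];
  } else {
    // Version 2: possible overlap, keep the original scalar semantics.
    for (int i = 0; i < n; i++)
      a[i] = a[i] + b[i];
  }
}
```

Both branches compute the same result as the original addvec; the check only decides which version of the loop runs.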

janneb
I tried it on a number of different examples and it didn't seem really helpful. For instance, I tried it on this program which demonstrates the performance benefits of `restrict`: http://stackoverflow.com/questions/1965487/does-the-restrict-keyword-provide-significant-anti-aliasing-benefits-in-gcc-g/1966649#1966649
`gcc -ftree-vectorizer-verbose=1 -ftree-vectorize -O3 -std=c99 -DUSE_RESTRICT restrict.c -o restrict`
`restrict.c:15: note: vectorized 0 loops in function.`
`restrict.c:47: note: vectorized 0 loops in function.`
Robert S. Barnes
@Robert You can crank up the verbosity level if you want more info, or use -fdump-tree-alias to see what the compiler thinks about alias analysis, or -fdump-tree-all for the whole shebang. For the example you quote, cranking up the verbosity shows a "no vectype for stmt:" message, meaning that the target hardware doesn't support a suitable vector type. The solution, as I mentioned in a response to another of your restrict questions, is to specify -march=pentium-m and -mfpmath=sse. That doesn't actually help in this example, though, because its main loop cannot be vectorized, with or without restrict.
janneb
+1  A: 

In the past I've tracked down cases of aliasing slowdowns with some help from a profiler. Some of the game console profilers will highlight parts of the code that are causing lots of load-hit-store penalties. These often occur because the compiler assumes some pointers are aliased and has to generate extra load instructions. Once you know the part of the code where they're occurring, you can backtrack from the assembly to the source to see what might be considered aliased, and add "restrict" as needed (or use other tricks to avoid the extra loads).

I'm not sure if there are any freely available profilers that will let you get into this level of detail, however.

The side benefit of this approach is that you only spend your time examining cases that actually slow your code down.
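
As a small illustration of the "other tricks" mentioned above, here is a minimal sketch. The function names are made up for this example, and whether a given compiler actually emits the extra load depends on its version and flags. In the first function, a compiler that cannot prove out and gain point to different memory may reload *gain after every store; copying the value into a local first removes that dependency.

```cpp
#include <cassert>

// A store to out[i] might modify *gain (both are float), so the compiler
// may have to reload *gain on every iteration.
void scale_reload(float* out, const float* gain, int n)
{
  assert(n >= 0);
  for (int i = 0; i < n; i++)
    out[i] *= *gain;
}

// Reading *gain once into a local tells the compiler the factor cannot
// change inside the loop, so no reload is needed.
void scale_hoisted(float* out, const float* gain, int n)
{
  assert(n >= 0);
  const float g = *gain;
  for (int i = 0; i < n; i++)
    out[i] *= g;
}
```

Note that the two versions only behave identically when gain does not point into out. If it does, the hoisted version uses the original value throughout, so make sure that is what you intend before applying the change.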

celion