Shark told me this:

This instruction is the start of a loop that is not aligned to a 16-byte address boundary. For optimal performance, you should align the start of a hot loop using a compiler directive. With gcc 3.3 or later, use the -falign-loops=16 compiler flag.

for (int i=0; i < 4; i++) { // line with the info
   //...code
}

How would I set that flag, and does it really improve performance?

+3  A: 

The hints from Shark are not always appropriate. Aligning loops doesn't make much difference in most cases. Focus on the bottlenecks in your code and see what you can do at the algorithm and code level before resorting to very minor tweaks such as this.
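
If you still want to try it, the flag goes on the gcc command line next to your other optimization options (in Xcode, the Other C Flags build setting). The simplest way to find out whether it helps is to build the same code with and without it and time both. Here is a minimal sketch of that, assuming a stand-in loop in place of your real one; `hot_loop`, `time_hot_loop` and `BENCH_MAIN` are made-up names for illustration, not anything from Shark or gcc.

```c
/* align_bench.c: hypothetical micro-benchmark (all names are illustrative).
 *
 * Build it twice, then run both binaries and compare the timings:
 *   gcc -O2 -DBENCH_MAIN align_bench.c -o bench_default
 *   gcc -O2 -DBENCH_MAIN -falign-loops=16 align_bench.c -o bench_aligned
 */
#include <stdio.h>
#include <time.h>

#define N 4

/* Stand-in for the hot loop in the question. */
static long hot_loop(const int *data)
{
    long sum = 0;
    for (int i = 0; i < N; i++) {
        sum += data[i];
    }
    return sum;
}

/* Runs the loop `repeats` times, stores the accumulated result in *total,
 * and returns the CPU time used in seconds. */
static double time_hot_loop(long repeats, long *total)
{
    int data[N] = {1, 2, 3, 4};
    long acc = 0;
    clock_t start = clock();
    for (long r = 0; r < repeats; r++) {
        data[r % N] ^= 1;  /* change the input so the call is not hoisted */
        acc += hot_loop(data);
    }
    clock_t end = clock();
    *total = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

#ifdef BENCH_MAIN
int main(void)
{
    long total = 0;
    double seconds = time_hot_loop(100000000L, &total);
    /* Printing the total keeps the compiler from discarding the work. */
    printf("total=%ld time=%.3f s\n", total, seconds);
    return 0;
}
#endif
```

Run each binary a few times and compare. Keep in mind that with optimization on, the compiler may unroll a four-iteration loop completely, in which case there is no loop left to align, so a difference that stays within run-to-run noise is the likely outcome. Measure your real code rather than trusting a toy like this one.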

Paul R