views:

733

answers:

3

Is it better in some sense to vectorize code by hand, using explicit pragmas or to rely on or use auto-vectorization? For optimum performance using auto-vectorization, one would have to monitor the compiler output to ensure that loops are being vectorized or modify them until they are vectorizable.

With hand coding, one is certain that the desired instructions are being emitted, but now the code is likely not portable (either to other architectures or other compilers).

+10  A: 

Auto vectorization never worked out well for me. To me it seems like auto-vectorization only works for very trivial loops at the moment.

I use the pragma/intrinsic approach and take a look at the assembly. If the compiler generates bad code (like spilling SSE registes onto the stack or adding redundant moves) I use inline assembler for the whole loop body.

Portability is btw not a problem. Often you start with a C/C++ loop and optimize it using intrinsics. Just keep the old loop and use it as a unit-test / fallback for your SIMD implementation. Also it's always wise to be able to remove all SIMD code from a project via a compile-time define. Debugging an application is much easier that way. The same define can be used for cross-compilation.

Nils Pipenbrinck
+2  A: 

I would never rely on automatic vectorization from any compiler. With gcc I would be doubly wary because the effects of gcc's optimizations always vary from version to version. Almost everyone I know who relies on special optimizations or gcc extensions has to deal with breakage when a new gcc version is released.

You can usually trust pragmas and intrinsics, but you should keep a sharp eye on release notes for new gcc versions, and you should tell your own users what gcc version is needed to compile your code.

Once or twice when vectorization really mattered, we've added something to the test suite to call objdump and verify that vector instructions are actually being used. It would be nice to be able to detect 'bad vector code' (as Nils describes) automatically as well, but we've never gotten that far.

Norman Ramsey
A: 

I've yet to see an automatic vectorizer that does more good than harm.

Crashworks