ansaurus

Question

implement SIMD in C++

Answer 1

A:

If you're using GCC, see http://gcc.gnu.org/projects/tree-ssa/vectorization.html for how to help the compiler auto-vectorize your code, and examples.
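As a rough illustration of what "helping the compiler" can mean, here is a minimal sketch of a loop shaped the way auto-vectorizers generally like: a simple counted loop, unit-stride access, and no possible aliasing between the arrays. The function name `saxpy` is made up for this example, and `__restrict` is a compiler extension (accepted by GCC, Clang, ICC and MSVC), not standard C++.

```cpp
#include <cstddef>

// Hypothetical example function, not taken from the question.
// __restrict (a compiler extension) promises that x and y do not overlap,
// which removes one common obstacle to auto-vectorization.
void saxpy(float* __restrict y, const float* __restrict x,
           float a, std::size_t n) {
    // Counted loop, unit stride, no branches or function calls in the body.
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC, building with `-O3` (or `-O2 -ftree-vectorize`) enables the vectorizer, and `-fopt-info-vec` should report which loops were vectorized. Whether this particular loop is vectorized depends on the compiler version and target.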

Otherwise, you need to let us know what platform you are using.

Potatoswatter 2010-04-29 16:56:56

This will be run on a Linux box, but using Intel's compiler, I believe. If it helps, I have to run the following commands before I do anything, to make sure the compiler works: `source /opt/intel/Compiler/11.1/064/bin/intel64/iccvars_intel64.csh; source /opt/intel/tbb/2.2/bin/intel64/tbbvars.csh`. Then, to compile, I run `icc -ltbb test.cxx -o test`.

Hristo 2010-04-29 16:59:08

http://www.advogato.org/article/871.html is old but looks quite relevant. Try the flags `-xW -O2 -vec-report3`. Also see `man icc` and search for `vector`.

Potatoswatter 2010-04-29 17:11:33

Answer 2

+1 A:

When you want to use assembly language within a C++ module, you can just put it inside an asm block, and continue to use your variable names from outside the block. The assembly instructions you use within the asm block will specify which register etc. is being operated on, but they will vary by platform.
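A minimal sketch of the idea, assuming an x86 target and GCC-style extended asm (which ICC on Linux also accepts). Note one difference from the description above: in this syntax C++ variables are bound to the instruction through operand constraints rather than named directly inside the asm text; naming variables directly is how MSVC-style `__asm { }` blocks work. The function name `add_asm` is made up for illustration.

```cpp
// Hypothetical example: add two ints with a single x86 `add` instruction.
int add_asm(int a, int b) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result = a;
    // AT&T syntax: "addl src, dst" computes dst += src.
    // "+r"(result): read-write operand kept in a general-purpose register.
    // "r"(b):       read-only input in a register.
    // "cc":         the instruction modifies the condition flags.
    __asm__("addl %1, %0" : "+r"(result) : "r"(b) : "cc");
    return result;
#else
    // Fallback for other compilers or architectures (an assumption made
    // here so that the sketch still builds elsewhere).
    return a + b;
#endif
}
```

Calling `add_asm(2, 3)` returns 5. The compiler picks the actual registers; the asm text only states which instruction to emit.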

jwismar 2010-04-29 16:59:01

Can you show me an example?

Hristo 2010-04-29 16:59:36

Answer 3

+1 A:

Your question reflects some confusion about what is going on. The i, j, k variables are almost certainly held in registers already, assuming you are compiling with optimizations on (which you should do: add "-O2" to your icc invocation).

You can use an asm block, but an easier method, considering you're already using ICC, is to use the SSE intrinsics. Intel's documentation for them is here: http://www.intel.com/software/products/compilers/clin/docs/ug_cpp/comm1019.htm
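To give a feel for the intrinsics, here is a minimal sketch that adds two float arrays four elements at a time. It is not the loop from the question (which depends on the delta function); the function name `add_arrays` is made up for illustration, and the code assumes an x86 target with SSE available, falling back to plain scalar code otherwise.

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, ...
#define EXAMPLE_HAS_SSE 1
#endif

// Hypothetical example function: out[i] = a[i] + b[i] for i in [0, n).
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef EXAMPLE_HAS_SSE
    // Main loop: one SSE register holds 4 packed floats.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb)); // 4 additions at once
    }
#endif
    // Scalar tail for the remaining 0..3 elements (or all of them without SSE).
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The unaligned load/store variants are used so the sketch works on any pointers; if the data is known to be 16-byte aligned, `_mm_load_ps` and `_mm_store_ps` can be used instead.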

It looks like you can SIMD-ize the top-level loop, though it's going to depend greatly on what your delta function is.

Jack Lloyd 2010-04-29 17:08:50

I have no control over the icc invocation. This is a homework assignment so I'm very limited in what I can do. I can't even edit the delta function which can totally be optimized. I'll fiddle around with the asm block idea. Thanks.

Hristo 2010-04-29 17:16:23

@Hristo: Intrinsics should give you less trouble than the asm block. But do look into auto-vectorization. You should be able to find `#pragma` commands that emulate control over command-line flags.

Potatoswatter 2010-04-29 20:24:32

Answer 4

A:

The compiler should be doing this for you. For example, in VC++ you can simply turn on SSE2 code generation (the `/arch:SSE2` option).

DeadMG 2010-04-29 17:42:49
