Hello,

I'm writing a transpose function for eight vectors of 8 x 16-bit elements using SSE2 intrinsics. Since the function takes 8 arguments (an 8x8 matrix of 16-bit values), I can't do anything but pass them by reference. Will the compiler optimize that, i.e. will these __m128i objects be kept in registers instead of going through the stack?

Code snippet:

inline void transpose (__m128i &a0, __m128i &a1, __m128i &a2, __m128i &a3,
                       __m128i &a4, __m128i &a5, __m128i &a6, __m128i &a7) {
    ....
}
+2  A: 

Who can say?

Why not compile it and look at the disassembly? That is the only way to be sure.
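For example, a small test case along these lines could be compiled to assembly (with GCC, something like `g++ -O2 -msse2 -S test.cpp`) and inspected. This is only a sketch: `interleave16` and `interleave_rows` are made-up names standing in for the real transpose, and the thing to look for is whether `interleave_rows` contains any loads or stores to the stack around the inlined helper.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical stand-in for the real function: takes its vectors by
// reference and interleaves their 16-bit lanes in place.
static inline void interleave16(__m128i &a, __m128i &b) {
    const __m128i lo = _mm_unpacklo_epi16(a, b);  // a0 b0 a1 b1 a2 b2 a3 b3
    const __m128i hi = _mm_unpackhi_epi16(a, b);  // a4 b4 a5 b5 a6 b6 a7 b7
    a = lo;
    b = hi;
}

// Non-inline caller, so it appears as its own symbol in the assembly output.
void interleave_rows(std::uint16_t *row0, std::uint16_t *row1) {
    __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i *>(row0));
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i *>(row1));
    interleave16(a, b);  // passed by reference; check whether a/b stay in xmm registers
    _mm_storeu_si128(reinterpret_cast<__m128i *>(row0), a);
    _mm_storeu_si128(reinterpret_cast<__m128i *>(row1), b);
}
```

If the helper was inlined, the body of `interleave_rows` should consist of the two loads, the unpack instructions and the two stores, with no call instruction.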

Zan Lynx
A: 

Note that the limitation on passing this many __m128i arguments by value only applies to Windows and MSVC(++) (you should probably tag your question accordingly).

I haven't tried this with C++ references, but with MSVC and pointer arguments to inline functions like this, the compiler does appear to optimise away the indirection. Presumably the same applies to C++ references, but as another poster pointed out, you should look at the generated code to check.

Paul R
I am not using MSVC
buratinas
@~buratinas: OK, you should be fine then so long as you're using a decent compiler, i.e. gcc or Intel's ICC.
Paul R
+2  A: 

Chances are that they will not be pushed to the stack. If the function is inlined, the compiler places the code of the called function directly into the caller instead of passing the data from the caller to the callee, so the references simply name the caller's own variables, which may well live in registers.

Now, inline is only a hint, so the compiler can decide not to inline the call; in that case you would have to follow Zan's advice and check what the compiled code looks like.
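If you want something stronger than a hint, most compilers have an extension for that: `__forceinline` on MSVC and `__attribute__((always_inline))` on GCC, Clang and ICC. Below is a rough sketch of how that could look. The `FORCE_INLINE` macro and `transpose_in_memory` are names invented for this example, and the function body is just one common unpack-based way to do an 8x8 16-bit transpose, not necessarily what the elided code in the question does.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Assumed helper macro (not from the question): request inlining more firmly.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#else
#  define FORCE_INLINE inline __attribute__((always_inline))
#endif

// Transposes an 8x8 matrix of 16-bit values held in eight registers, in place.
FORCE_INLINE void transpose(__m128i &a0, __m128i &a1, __m128i &a2, __m128i &a3,
                            __m128i &a4, __m128i &a5, __m128i &a6, __m128i &a7) {
    // Step 1: interleave 16-bit lanes of adjacent row pairs.
    const __m128i t0 = _mm_unpacklo_epi16(a0, a1), t4 = _mm_unpackhi_epi16(a0, a1);
    const __m128i t1 = _mm_unpacklo_epi16(a2, a3), t5 = _mm_unpackhi_epi16(a2, a3);
    const __m128i t2 = _mm_unpacklo_epi16(a4, a5), t6 = _mm_unpackhi_epi16(a4, a5);
    const __m128i t3 = _mm_unpacklo_epi16(a6, a7), t7 = _mm_unpackhi_epi16(a6, a7);
    // Step 2: interleave 32-bit lanes, giving four rows' worth of each column.
    const __m128i u0 = _mm_unpacklo_epi32(t0, t1), u1 = _mm_unpackhi_epi32(t0, t1);
    const __m128i u2 = _mm_unpacklo_epi32(t2, t3), u3 = _mm_unpackhi_epi32(t2, t3);
    const __m128i u4 = _mm_unpacklo_epi32(t4, t5), u5 = _mm_unpackhi_epi32(t4, t5);
    const __m128i u6 = _mm_unpacklo_epi32(t6, t7), u7 = _mm_unpackhi_epi32(t6, t7);
    // Step 3: combine 64-bit halves so each register holds one full column.
    a0 = _mm_unpacklo_epi64(u0, u2);  a1 = _mm_unpackhi_epi64(u0, u2);
    a2 = _mm_unpacklo_epi64(u1, u3);  a3 = _mm_unpackhi_epi64(u1, u3);
    a4 = _mm_unpacklo_epi64(u4, u6);  a5 = _mm_unpackhi_epi64(u4, u6);
    a6 = _mm_unpacklo_epi64(u5, u7);  a7 = _mm_unpackhi_epi64(u5, u7);
}

// Hypothetical caller: transposes a row-major 8x8 matrix stored in memory.
void transpose_in_memory(std::uint16_t *m) {
    __m128i r[8];
    for (int i = 0; i < 8; ++i)
        r[i] = _mm_loadu_si128(reinterpret_cast<const __m128i *>(m + 8 * i));
    transpose(r[0], r[1], r[2], r[3], r[4], r[5], r[6], r[7]);
    for (int i = 0; i < 8; ++i)
        _mm_storeu_si128(reinterpret_cast<__m128i *>(m + 8 * i), r[i]);
}
```

Even with a forced inline it is still worth checking the generated code: in 32-bit mode there are only eight xmm registers, so with eight live rows plus temporaries the compiler may have to spill some values to the stack regardless of how the arguments are passed.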

David Rodríguez - dribeas