I'm having some trouble using SSE4.1 intrinsics on hardware that (I think) supports it. Can anyone tell me if I've missed something?

Building the following code on a MacBookPro5,4 (Penryn):

>g++ -msse sse4.cpp -S -o sse4.asm

#include <stdio.h>
#include <smmintrin.h>

int main ()
{
    __m128 a, b;
    const int mask = 0x55;

    a.m128_f32[0] = 1.5;
    a.m128_f32[1] = 10.25;
    a.m128_f32[2] = -11.0625;
    a.m128_f32[3] = 81.0;
    b.m128_f32[0] = -1.5;
    b.m128_f32[1] = 3.125;
    b.m128_f32[2] = -50.5;
    b.m128_f32[3] = 100.0;

    __m128 res = _mm_dp_ps(a, b, mask);

    printf_s("Original a: %f\t%f\t%f\t%f\nOriginal b: %f\t%f\t%f\t%f\n",
                a.m128_f32[0], a.m128_f32[1], a.m128_f32[2], a.m128_f32[3],
                b.m128_f32[0], b.m128_f32[1], b.m128_f32[2], b.m128_f32[3]);
    printf_s("Result res: %f\t%f\t%f\t%f\n",
                res.m128_f32[0], res.m128_f32[1], res.m128_f32[2], res.m128_f32[3]);

    return 0;
}

Generates the following error:

/usr/lib/gcc/i686-apple-darwin10/4.2.1/include/smmintrin.h:35:3: error: #error "SSE4.1 instruction set not enabled"
+2  A: 

`-msse` does not turn on SSE4.1, and `smmintrin.h` refuses to compile unless SSE4.1 is enabled. Change:

g++ -msse sse4.cpp -S -o sse4.asm

to:

g++ -msse4.1 sse4.cpp -S -o sse4.asm

Paul R
Yup. Note there were other errors using the `m128_f32[]` accessor, which appears to be Microsoft-specific.
Justicle
@Justicle: Yes - just use the `_mm_set_XXX` intrinsics to initialise SIMD data types.
Paul R
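
For anyone landing here later, this is a rough sketch of how the question's code might look once the Microsoft-only pieces (`m128_f32[]`, `printf_s`) are swapped out. It is not the original poster's final code. The names `dot_product_demo` and `print_lanes` are made up for illustration, and the scalar fallback is only an assumption about what you might want when `__SSE4_1__` is not defined (that is, when `-msse4.1` was not passed).

```cpp
#include <cstdio>

#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1 intrinsics; only pulled in when SSE4.1 is enabled
#endif

// Hypothetical helper: computes the same thing as _mm_dp_ps(a, b, 0x55)
// for the vectors in the question and writes the four lanes to out.
// Mask 0x55: the high nibble (0101) multiplies lanes 0 and 2,
// the low nibble (0101) stores the sum in lanes 0 and 2 and zeroes the rest.
void dot_product_demo(float out[4]) {
    const float a[4] = {1.5f, 10.25f, -11.0625f, 81.0f};
    const float b[4] = {-1.5f, 3.125f, -50.5f, 100.0f};

#if defined(__SSE4_1__)
    // _mm_setr_ps takes the lanes in memory order (lane 0 first).
    __m128 va = _mm_setr_ps(a[0], a[1], a[2], a[3]);
    __m128 vb = _mm_setr_ps(b[0], b[1], b[2], b[3]);
    __m128 res = _mm_dp_ps(va, vb, 0x55);
    // Read the lanes back through memory instead of res.m128_f32[i].
    _mm_storeu_ps(out, res);
#else
    // Scalar fallback (an assumption, not part of the original question).
    const float sum = a[0] * b[0] + a[2] * b[2];
    out[0] = sum;
    out[1] = 0.0f;
    out[2] = sum;
    out[3] = 0.0f;
#endif
}

// Hypothetical helper: prints the four lanes with plain printf.
void print_lanes(const char* label, const float v[4]) {
    std::printf("%s: %f\t%f\t%f\t%f\n", label, v[0], v[1], v[2], v[3]);
}
```

Calling `dot_product_demo(out)` from `main` and passing `out` to `print_lanes` should be enough to try it. Building with `g++ -msse4.1` takes the intrinsic path; without the flag the `#if` guard skips `smmintrin.h` entirely, so the `#error` from the question never fires.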
+1  A: 

Did you try `g++ -msse4.1`?

Pascal Cuoq