I'm trying out g++ inline assembly with SSE and wrote a first program. It segfaults. Why?

#include <stdio.h>

float s[128*4] __attribute__((aligned(16)));

#define r0 3
#define r1 17
#define r2 110
#define rs0 "3"
#define rs1 "17"
#define rs2 "110"

int main () {
  s[r0*4+0] = 2.0;  s[r0*4+1] = 3.0;  s[r0*4+2] = 4.0;  s[r0*4+3] = 5.0;
  s[r1*4+0] = 3.5;  s[r1*4+1] = 3.5;  s[r1*4+2] = 3.5;  s[r1*4+3] = 3.5;
  asm (
    "\n\t  .intel_syntax noprefix"

    "\n\t  mov     edx,                s"
    "\n\t  movaps  xmm0,               [edx + " rs0 "*16]"
    "\n\t  movaps  xmm1,               [edx + " rs1 "*16]"
    "\n\t  mulps   xmm0,               xmm1"
    "\n\t  movaps  [edx + " rs2 "*16], xmm0"

    "\n\t  .att_syntax"
  );
  printf ("%f %f %f %f\n", s[r2*4+0], s[r2*4+1], s[r2*4+2], s[r2*4+3]);
}

And why doesn't gdb allow me to single-step the assembly instructions? Do I need to write asm ("..") around every line?

+1  A: 

You can use stepi (or si) to step one machine instruction at a time. Several other gdb commands have an instruction-level variant with the i suffix, such as nexti.

Matt Joiner
That's what I tried, but I can't see the assembly instructions in gdb while stepping. It only prints out the final closing ')' bracket of the asm-block.
Thomas
@Thomas: Try the disassemble command.
Matt Joiner
+2  A: 

`mov edx, s` loads the data stored at s[0] into %edx, and you then use that value as a pointer. When you try to access %edx + 0x30, you crash, because s[0] + 48 is not an address that is mapped for your process to read from. (Specifically, since s is a global and therefore zero-initialized, you're trying to load from the address 0x30.)
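
To make the distinction concrete, here is a rough plain C++ analogy (not assembly) of the two things the instruction could mean. The function names `load_contents` and `load_address` are made up for illustration; the first mirrors what the bare `s` operand does here, the second mirrors what was intended.

```cpp
#include <cstdint>
#include <cstring>

float s[128 * 4] __attribute__((aligned(16)));

// Analogy for what the bare operand does: read the first 32 bits
// stored AT s, i.e. the bit pattern of s[0].
// (hypothetical helper, for illustration only)
std::uint32_t load_contents() {
    std::uint32_t bits;
    std::memcpy(&bits, s, sizeof bits);
    return bits;
}

// Analogy for what was intended: take the address OF s as a number.
// (hypothetical helper, for illustration only)
std::uintptr_t load_address() {
    return reinterpret_cast<std::uintptr_t>(s);
}
```

Because s is a zero-initialized global and s[0] is never written, `load_contents()` yields 0, so using it as a base pointer and adding 0x30 lands on an unmapped address. `load_address()` yields the real, 16-byte-aligned location of the array.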

Stephen Canon
Oh. I wanted `mov edx, s` to load the address as an immediate. I'll try to find out the correct instruction or syntax...
Thomas
You probably want to be a bit careful about s. It's a global symbol and might be subject to relocation at link/load time. Having said that, I've never bothered with inline assembler, so I might be talking rubbish.
JeremyP
The correct syntax is: `offset s` - and now it works. Thanks!
Thomas
Oh, it doesn't. It prints `7.000000 10.500000 14.000000 0.000000`. Where is the last value?
Thomas
Solution: `float` -> `volatile float`
Thomas
@Thomas: You should be using [input/output directives / clobber lists](http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.2) rather than declaring the array `volatile`.
caf
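
A minimal sketch of what the last comment suggests, assuming GCC-style extended asm on an x86 target with SSE enabled. It is written in the default AT&T syntax rather than `.intel_syntax`, and the helper name `mul_rows` is invented for this example. The row addresses are passed as input operands, and the clobber list tells the compiler that xmm0 and memory are touched, so the array does not need to be `volatile`. On other targets the sketch falls back to a plain loop.

```cpp
#include <cassert>

float s[128 * 4] __attribute__((aligned(16)));

// Multiply the 4-float rows a and b of s element-wise and store the
// result in row dst. (hypothetical helper, for illustration only)
void mul_rows(int a, int b, int dst) {
    assert(a >= 0 && a < 128 && b >= 0 && b < 128 && dst >= 0 && dst < 128);
#if defined(__SSE__) && (defined(__x86_64__) || defined(__i386__))
    asm volatile(
        "movaps (%[pa]), %%xmm0\n\t"   // xmm0 = row a
        "mulps  (%[pb]), %%xmm0\n\t"   // xmm0 *= row b (16-byte aligned)
        "movaps %%xmm0, (%[pd])\n\t"   // row dst = xmm0
        :                              // no register outputs
        : [pa] "r"(&s[a * 4]),         // inputs: row addresses in registers
          [pb] "r"(&s[b * 4]),
          [pd] "r"(&s[dst * 4])
        : "xmm0", "memory");           // clobbers: xmm0 is overwritten; the
                                       // asm reads and writes memory, so
                                       // stores to s must be done before it
                                       // and values reloaded after it
#else
    // Portable fallback for targets without SSE inline asm.
    for (int i = 0; i < 4; ++i)
        s[dst * 4 + i] = s[a * 4 + i] * s[b * 4 + i];
#endif
}
```

With rows 3 and 17 filled in as in the question, calling `mul_rows(3, 17, 110)` and printing row 110 should give `7.000000 10.500000 14.000000 17.500000`.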