I try to build an application which uses pthreads and __m128 SSE type. According to GCC manual, default stack alignment is 16 bytes. In order to use __m128, the requirement is the 16-byte alignment.
My target CPU supports SSE. I use a GCC compiler which doesn't support runtime stack realignment (e.g. -mstackrealign). I cannot use any other GCC compiler version.
My test application looks like:
#include <xmmintrin.h>
#include <pthread.h>
void *f(void *x){
__m128 y;
...
}
int main(void){
pthread_t p;
pthread_create(&p, NULL, f, NULL);
}
The application generates an exception and exits. After a simple debugging (printf "%p", &y), I found that the variable y is not 16-byte aligned.
My question is: how can I realign the stack properly (16-byte) without using any GCC flags and attributes (they don't help)? Should I use GCC inline Assembler within this thread function f()?