We have a binary compiled with SSE3 optimizations, which ends up using the LDDQU instruction. When this code is executed on a Windows system (single core, XP2) whose CPU supports only SSE and SSE2 (as reported by the CPU-Z tool), the application crashes.

(924.4f0): Invalid lock sequence - code c000001e (first chance)
...
001700a10 f20ff00430 lddqu xmm0,xmmword ptr [eax+esi] ds:0023:1e08d200=270a57364a4a77896db676459d8c40a9
...

Can someone explain what this crash signifies and what the possible fixes are?

A: 

The hardware has encountered an instruction it doesn't know. Just as you can't have a Motorola chip execute x86 code, this processor doesn't recognise the LDDQU instruction.

The CPU will raise an interrupt, which is handled by the OS, and translated to the error message you got.

What can you do? Build your binary for the 'lower level' platform as well. Probably the target "x86" will do; the compiler will then emit x86-compliant code. You may want to release your software in two versions: the 'optimized' and the 'compatible'.

xtofl
A: 

It should raise an exception, generally EXCEPTION_ILLEGAL_INSTRUCTION. From MSDN:

EXCEPTION_ILLEGAL_INSTRUCTION The thread tried to execute an invalid instruction.

However, in your case the CPU couldn't properly interpret the instruction stream and broke it into smaller pieces, leading to undefined behaviour. Here, some instruction got a LOCK prefix added to it from a residual byte of the SSE3 instruction (the f0 byte in the f20ff0 encoding shown in your dump is also the value of the LOCK prefix), but that instruction doesn't support a LOCK prefix, so the CPU signals an exception. There is really nothing to be done other than making an SSE2 version, or testing the CPU's SSE feature flags and branching the code based on what is supported.

Necrolis
+2  A: 

An application is compiled with SSE3 support and crashes when run on a CPU not supporting SSE3. Gee, so strange! Compiler options for choosing an instruction set must be there just because some programmer at Microsoft was bored as hell one day.

You have several options:

  • make a single version of the application using SSE2 instruction set only
  • make different versions of the application compiled with different instruction sets
  • use structured exception handling (SEH) to implement user-mode emulation of unsupported instructions.

The last approach is a bit more time-consuming than the first two and has some performance issues, but those downsides are much smaller than the advantages it gives you. If you choose the third solution, you will also be able to invent your own opcodes! A perfect way of obfuscating program control flow, which is again very useful for hindering reverse-engineering of your program and thus protecting your IP.

zvrba
There's at least one extra option, one that's actively implemented by Intel's compiler: make a single version of the application with both SSE2 and SSE3 code paths and, at runtime, branch depending on the processor.
MSalters
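
For anyone wanting to try the runtime-branching approach mentioned in the answers, here is a minimal sketch in C, not a definitive implementation. It checks the SSE3 feature flag (CPUID leaf 1, ECX bit 0) and picks a code path through a function pointer. The names cpu_has_sse3, copy16_baseline, copy16_sse3 and select_copy16 are made up for illustration, and the "SSE3" routine is only a placeholder: in a real build it would presumably sit in a separate source file compiled with SSE3 enabled, so that the rest of the binary contains no SSE3 instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_MSC_VER)
#include <intrin.h>   /* __cpuid */
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>    /* __get_cpuid (GCC and Clang) */
#endif

/* Returns 1 if the CPU reports SSE3 (CPUID leaf 1, ECX bit 0), else 0. */
static int cpu_has_sse3(void)
{
#if defined(_MSC_VER)
    int regs[4];                       /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 1);
    return (regs[2] & 1) != 0;
#elif defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                      /* leaf 1 not available */
    return (ecx & 1u) != 0;
#else
    return 0;                          /* not an x86 target */
#endif
}

/* Hypothetical routine type: copy one 16-byte block. */
typedef void (*copy16_fn)(void *dst, const void *src);

/* Baseline path: safe on any CPU. */
static void copy16_baseline(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, 16);
}

/* Placeholder for the SSE3 path. A real version would use LDDQU
   (for example via _mm_lddqu_si128) and be built with SSE3 enabled;
   memcpy stands in here so the sketch runs anywhere. */
static void copy16_sse3(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, 16);
}

/* Choose the implementation once, based on what the CPU supports. */
static copy16_fn select_copy16(void)
{
    return cpu_has_sse3() ? copy16_sse3 : copy16_baseline;
}
```

The idea is to call select_copy16() once at startup, keep the returned pointer, and route all calls through it, so the SSE3 path is never reached on a CPU like the one in the question.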