
Question

Answer 1

+5 A:

Check the disassembly. Are you sure the compiler actually emitted the CLZ instruction? The Remarks section of the intrinsic's documentation contains this text:

This function can be implemented by calling a runtime function.

I suspect that's what's happening in your case.
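To give an idea of what such a runtime fallback might cost, here is a rough sketch of a software leading-zero count in C. This is only an illustration: the name clz32_soft is made up, and it is not Microsoft's actual runtime routine, which may well be implemented differently. It just shows that without the CLZ instruction the count takes several compare-and-shift steps plus a call, instead of one instruction.

```c
#include <stdint.h>

/* Hypothetical software fallback (NOT the real runtime function):
 * counts the leading zero bits of a 32-bit value with a binary search.
 * Returns 32 for an input of 0. */
static unsigned clz32_soft(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;

    /* At each step, if the top half of the remaining window is empty,
     * add its width to the count and shift the lower half up. */
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }

    return n;
}
```

If the disassembly shows a branch to a helper along these lines rather than a single CLZ, that would account for the slowdown.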

Note that the CLZ instruction is only available in ARMv5 and later. You need to tell the compiler if you want ARMv5 code:

/QRarch5 ARM5 Architecture
/QRarch5T ARM5T Architecture

(Microsoft incorrectly uses "ARM5" instead of "ARMv5".)

Igor Skochinsky 2010-10-28 16:45:20

I'll check the generated code. If it really is a function call (instead of a true intrinsic), the call overhead would explain why it is so slow.

Lorenzo 2010-10-28 19:02:44

Leading zeros calculation with intrinsic function