




Hello. I've been looking for a way to use BSWAP on the lower 32-bit sub-register of a 64-bit register. For example, RAX holds "0x0123456789abcdef" and I want to change it to "0x01234567efcdab89" with a single instruction (for performance). So I tried the following inline-assembly macro:

#define BSWAP(T) {  \
    __asm__ __volatile__ ( \
      "bswap %k0" /* %k0 = 32-bit name of operand 0 */ \
      : "=q" (T) \
      : "0" (T)); /* "0" ties the input to the output register */ \
}

And the result was "0x00000000efcdab89". TT-TT I don't actually understand why the compiler does this. Does anybody know an efficient solution?

- A Korean boy who hasn't slept for 2 days trying to solve this problem. =_=

Check the assembly output generated by gcc! Compile the code with the gcc -S flag (capital S) to generate asm output.

IIRC, most x86-64 instructions default to a 32-bit operand size unless explicitly directed otherwise, so this may be (part of) the problem.


Thanks for your answer.

Um.. In the asm code, it was using a 32-bit register, namely %ebx, as I intended. (You know, the 'k' modifier in GCC inline assembly selects the 32-bit name of the register.) I ran some experiments with the asm code, but couldn't find a way to change the lower 32-bit sub-register without changing the upper 32 bits of the 64-bit register. TT-TT So I thought that changing a specific part of a register is not possible. However, the interesting thing is that it works with the lower 16-bit sub-register. Um.. I think it's better to explain with examples. (Sorry, my English isn't good enough to talk clearly...-.-;;)

One of my tests used the ror instruction.

Suppose that a 64-bit variable has the value "0x0123456789abcdef" (and it's in the rbx register).

When the asm code was "ror $8, %rbx", the result was, of course, "0xef0123456789abcd".

When the asm code was "ror $8, %ebx", the expected value was "0x01234567ef89abcd", but the result was, (TToTT), "0x00000000ef89abcd". The upper 32-bit value was gone. So I thought I can't change only part of a register.

However, when the asm code was "ror $8, %bx", the result was, (@o@), "0x0123456789abefcd", instead of "0x000000000000efcd". That's why I'm so confused.


Ah, yes, I understand the problem now:

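x86-64 processors implicitly zero-extend the result of any 32-bit operation (on %eax, %ebx, etc.) into the full 64-bit register, so the upper 32 bits are cleared. Operations on the 8-bit and 16-bit sub-registers (%bl, %bx, etc.) leave the remaining bits untouched, which is why your %bx test kept the upper bits. As I understand it, the zero-extension is there so that a 32-bit write does not depend on the old upper half of the register.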
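To make the rule concrete, here is a small model of it in plain C. It is only an illustration of the write rules described above, not real assembly: the helper names (write_r32, write_r16, model_ror8_ebx, model_ror8_bx) are made up for this sketch, and the uint64_t argument stands in for the contents of %rbx.

```c
#include <stdint.h>

/* Models a write to a 32-bit sub-register (e.g. %ebx):
   the upper 32 bits of the 64-bit register become zero. */
static inline uint64_t write_r32(uint64_t reg, uint32_t value) {
    (void)reg;               /* the old contents are discarded entirely */
    return (uint64_t)value;  /* zero-extended to 64 bits */
}

/* Models a write to a 16-bit sub-register (e.g. %bx):
   the other 48 bits of the register are preserved. */
static inline uint64_t write_r16(uint64_t reg, uint16_t value) {
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* Rotate right helpers for 32-bit and 16-bit values. */
static inline uint32_t rotr32(uint32_t v, unsigned n) {
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

static inline uint16_t rotr16(uint16_t v, unsigned n) {
    n &= 15;
    return n ? (uint16_t)(((unsigned)v >> n) | ((unsigned)v << (16 - n))) : v;
}

/* Model of "ror $8, %ebx": 32-bit result, upper half zeroed. */
static inline uint64_t model_ror8_ebx(uint64_t rbx) {
    return write_r32(rbx, rotr32((uint32_t)rbx, 8));
}

/* Model of "ror $8, %bx": 16-bit result, upper 48 bits kept. */
static inline uint64_t model_ror8_bx(uint64_t rbx) {
    return write_r16(rbx, rotr16((uint16_t)rbx, 8));
}
```

With 0x0123456789abcdef as input, the two model functions give the same values you reported for the %ebx and %bx tests.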

So I'm afraid there is no way to do a ror (or a bswap) on just the lower 32 bits of a 64-bit register with a single instruction. You'll have to use a series of several instructions...
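One possible multi-step version, as a minimal sketch in C rather than inline asm: save the upper half with a mask, do the 32-bit operation on the lower half, then OR the two back together. The function names bswap_low32 and ror_low32 are hypothetical, and __builtin_bswap32 assumes GCC or Clang. I have not measured how this compares to hand-written assembly.

```c
#include <stdint.h>

/* Byte-swap only the lower 32 bits of x, keeping the upper 32 bits. */
static inline uint64_t bswap_low32(uint64_t x) {
    uint64_t upper = x & UINT64_C(0xFFFFFFFF00000000); /* save upper half */
    uint32_t lower = __builtin_bswap32((uint32_t)x);   /* swap lower half */
    return upper | lower;                              /* recombine */
}

/* Rotate only the lower 32 bits of x right by n, keeping the upper 32 bits. */
static inline uint64_t ror_low32(uint64_t x, unsigned n) {
    uint32_t lower = (uint32_t)x;
    n &= 31;                       /* rotation count modulo 32 */
    if (n != 0)
        lower = (lower >> n) | (lower << (32 - n));
    return (x & UINT64_C(0xFFFFFFFF00000000)) | lower;
}
```

For the value from the question, bswap_low32(0x0123456789abcdef) gives 0x01234567efcdab89, and ror_low32(0x0123456789abcdef, 8) gives 0x01234567ef89abcd.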


Thanks a lot. Now, everything is clear. (^___^)