As the title states, why would one use "movl $1, %eax" as opposed to, say, "movb $1, %eax"? I was told that movl would zero out the high-order bits of %eax, but isn't %eax a register whose size matches the system's word size? Doesn't that mean movl is in actuality an integer operation (and not a long one)?

I'm clearly a bit confused about it all. Thanks.

+2  A: 

long was originally 32 bits, while int and short were 16. And the names of the opcodes don't change every time someone comes out with a new operating system.

Anon.
What has a release of an operating system got to do with the naming conventions of types?
Negative Acknowledgement
It was a figure of speech. The intent was to indicate that once published, the instruction set specification isn't destructively changed.
Anon.
This is why I prefer Intel syntax to GAS syntax. Intel: "mov dword eax, ecx"; GAS: "movl %ecx, %eax". The "l" size suffix confuses people. More people get that 'dword' means 32 bits.
Jason
Looks all the same to me, a dword (double word) suggests that the word size is 16 bits...
Negative Acknowledgement
The word size *was* 16 bits on the 8086.
Anon.
Uh, isn't x86 equivalent to 8086? Isn't this what we're dealing with, and isn't it a 32-bit architecture?
Negative Acknowledgement
The 8086 processor was quite definitively 16-bit.
Anon.
+2  A: 

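%eax is a 32-bit register. To use a smaller width, you need %ax for 16 bits. %ax can be further divided into %ah for the high byte of %ax, and %al for the low byte. The same goes for %ebx, %ecx and %edx; the remaining 32-bit GPRs have 16-bit forms (%si, %di, %bp, %sp) but no high/low byte halves in 32-bit mode.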
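The aliasing described above can be modelled with a few bit masks. This is only an illustrative Python sketch, not assembly; the helper names `ax`, `ah` and `al` are made up here to mirror the register names.

```python
# Illustrative model: %ax, %ah and %al are views onto parts of %eax.

def ax(eax):
    """Low 16 bits of the 32-bit value (what %ax would hold)."""
    return eax & 0xFFFF

def ah(eax):
    """Bits 8..15 (the high byte of %ax)."""
    return (eax >> 8) & 0xFF

def al(eax):
    """Bits 0..7 (the low byte of %ax)."""
    return eax & 0xFF

eax = 0x12345678
print(hex(ax(eax)), hex(ah(eax)), hex(al(eax)))  # 0x5678 0x56 0x78
```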

Looking at the Intel instruction set reference for the mov instruction, I don't see a variant that can move a single byte into a 32-bit register, so movb $1, %eax cannot be encoded as written; a byte move has to target an 8-bit register such as %al.

Since movl is a 32-bit instruction, an immediate such as $1 is encoded as a full 32-bit value, so the upper bytes written to the register are zeros. If you were moving from memory, you would be moving an entire 32-bit word.

%eax is not zeroed out unless you either movl $0, %eax, or if you xorl %eax, %eax. Otherwise it holds whatever value was previously in there. When you movl $1, %eax, you will end up with 0x00000001 in the register because the 32-bit instruction moves a 32-bit immediate value into the register.

Aaron Klotz
That movl is a 32-bit instruction is partly the answer I was looking for, but the other thing I'm asking is whether %eax is, by default, _not_ zeroed out, or is undefined?
Negative Acknowledgement
%eax is never zeroed out unless you explicitly do so. If you use movb to just write into one of the bytes, the other bytes are unchanged.
Victor Shnayder
+4  A: 

On a 32-bit machine, %eax is a 4-byte (32-bit) register. movl will write into all 4 bytes. In your example, it'll zero out the upper 3 bytes and put 1 in the lowest byte. A movb (into %al) will just change the low-order byte.
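To make the difference concrete, here is a small Python model of the two writes. It is a sketch of the register semantics only; the function names `movl_imm` and `movb_imm_al` are hypothetical, and the byte-sized move is modelled as a write to %al.

```python
MASK32 = 0xFFFFFFFF

def movl_imm(eax, imm):
    """Model of movl $imm, %eax: all 32 bits are replaced."""
    return imm & MASK32

def movb_imm_al(eax, imm):
    """Model of movb $imm, %al: bits 0..7 replaced, bits 8..31 kept."""
    return (eax & 0xFFFFFF00) | (imm & 0xFF)

old = 0xDEADBEEF                 # whatever happened to be in %eax before
print(hex(movl_imm(old, 1)))     # 0x1
print(hex(movb_imm_al(old, 1)))  # 0xdeadbe01
```

The second result shows why a byte write alone is not enough if the register is later read as a 32-bit value: the stale upper bytes are still there.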

Victor Shnayder
So movl is a 4 byte operation?
Negative Acknowledgement
Not sure which assembler this is, but isn't the only difference the size of the constant?
peterchen
Yes. In the IA32 architecture (usually called x86), "long" means 4 bytes, and movl stands for "move long".
Victor Shnayder
@Negative Acknowledgement: You're asking the wrong question. x86 does have native 8-bit and 16-bit operations, but a byte write such as `movb` leaves the rest of the register intact, so the hardware has to merge the new byte with the old upper bits. On some microarchitectures that merge shows up as a false dependency or a partial-register stall. You generally want to work on full registers unless you specifically need the narrower write (at the cost of larger code or data, of course).
Blindy
A: 

%eax is 32 bits on 32-bit machines. %ax is 16 bits, and %ah and %al are its 8-bit high and low constituents.

Therefore movl is perfectly valid here. Efficiency-wise, movl will be as fast as movb, and zeroing out the high 3 bytes of %eax is often a desirable property. You might want to use it as a 32-bit value later, so movb isn't a good way to move a byte there.

Eli Bendersky
+2  A: 

opposed to, say, "movb $1, %eax"

This instruction is invalid. You can't use eax with the movb instruction. You would instead use an 8-bit register. For example:

movb $1, %al

but isn't %eax a register that's equivalent to the size of the system's wordsize?

No. EAX will always be a 32-bit value, regardless of the system's register size.

You are confusing C variable sizes with register sizes. C variable sizes may change depending on your system and compiler.

Assembly is simpler than C. In GAS assembly, instructions are suffixed with the letters "b", "s", "w", "l", "q" or "t" to determine what size operand is being manipulated.

* b = byte (8 bit)
* s = short (16 bit integer) or single (32-bit floating point)
* w = word (16 bit)
* l = long (32 bit integer or 64-bit floating point)
* q = quad (64 bit)
* t = ten bytes (80-bit floating point)

These sizes are constant. They will never be changed. al is always 8 bits and eax is always 32 bits.
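As a quick reference, the integer suffixes from the list can be put in a lookup table. This Python sketch is illustrative only: `operand_bits` is a made-up helper, it covers just the integer suffixes b, w, l and q, and it naively reads the last letter of the mnemonic, which is not how a real assembler parses instructions.

```python
# Integer operand widths for the GAS suffixes listed above.
SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}

def operand_bits(mnemonic):
    """Return the operand width in bits implied by a suffixed mnemonic."""
    return SUFFIX_BITS[mnemonic[-1]]

for m in ("movb", "movw", "movl", "movq"):
    print(m, operand_bits(m))
```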

Jason
Nice answer, thanks!
Negative Acknowledgement
+1  A: 

Your second choice will just produce an error; x86 doesn't have that instruction. x86 is somewhat unusual with respect to loading bytes into certain registers. On most instruction set architectures the operand is zero- or sign-extended, but x86 allows you to write just the lower byte or lower 16 bits of some registers.

There are certainly other choices, like clearing the register and then incrementing it, but here are three initially reasonable-looking choices you have:

   0:   b8 01 00 00 00          movl   $0x1,%eax

   5:   31 c0                   xorl   %eax,%eax
   7:   b0 01                   movb   $0x1,%al

   9:   b0 01                   movb   $0x1,%al
   b:   0f b6 c0                movzbl %al,%eax

The first is 5 bytes, the second 4, the third 5. So the second is the best choice if optimizing for space; otherwise I suppose the one most likely to run fast is the first one. x86 is deeply pipelined these days, so in the two-instruction sequences the second instruction depends on the result of the first, and the machine may stall for a while depending on details of the pipeline hardware.
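The byte counts can be checked mechanically from the hex dumps in the listing. This is a small Python sketch; the dictionary labels and the `encoded_size` helper are just names chosen here, and the hex strings are copied from the disassembly above.

```python
# Machine-code bytes for each candidate sequence, copied from the listing.
sequences = {
    "movl": ["b8 01 00 00 00"],
    "xorl + movb": ["31 c0", "b0 01"],
    "movb + movzbl": ["b0 01", "0f b6 c0"],
}

def encoded_size(instructions):
    """Total number of bytes across the instructions of one sequence."""
    return sum(len(bytes.fromhex(insn)) for insn in instructions)

for name, insns in sequences.items():
    print(name, encoded_size(insns))
```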

Of course, these x86 ops are being translated in CPU-specific ways into CPU micro-ops, and so who knows what will happen.

DigitalRoss
Very good answer, very explanatory. Thanks
Negative Acknowledgement