In assembly language if we use

mov eax, dword ptr[ebx]

then it means copy the value pointed to by ebx (ebx contains an address, not the actual value; this instruction copies the value stored at that address)?

If we use

mov eax, dword ptr[some_variable]

then it means copy the value of the variable "some_variable" itself to eax, not the value pointed to by "some_variable"?

Is my understanding correct?

If yes, I'm confused about why the same assembly instruction has two different meanings: in the first case there is a level of indirection, but in the second there is no additional level of indirection.

Any comment?

EDIT:

Not every [] is without effect. For example, the xchg instruction below takes a level of indirection: it loads the value pointed to by edx.

The whole source code can be found at

http://www.codeproject.com/KB/threads/spinlocks.aspx

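#ifdef WIN32
inline int CPP_SpinLock::TestAndSet(int* pTargetAddress, int nValue)
{
    __asm {
        mov edx, dword ptr [pTargetAddress]  ; edx = the pointer stored in pTargetAddress
        mov eax, nValue                      ; eax = the new value
        lock xchg eax, dword ptr [edx]       ; swap eax with the int at the address in edx
    }
    // The old value is left in eax, which MSVC uses as the int return value.
}
#endif // WIN32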
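
For comparison, the same test-and-set can be sketched in portable C++11 with std::atomic. This is only an illustration, not code from the CodeProject article: the name test_and_set and the use of std::atomic<int> in place of a raw int* are assumptions made for the example. It shows the same two steps: the pointer argument supplies an address, and the exchange then operates on the memory at that address.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical portable counterpart of TestAndSet (not from the original article).
// Atomically stores new_value into *target and returns the value that was there before.
inline int test_and_set(std::atomic<int>* target, int new_value)
{
    // Roughly corresponds to "lock xchg eax, dword ptr [edx]": target plays the
    // role of edx (it holds an address), and exchange() reads and writes the
    // memory at that address in one atomic step.
    return target->exchange(new_value);
}
```

A caller would treat a return value of 0 as "the lock was free and is now taken", assuming 0 means unlocked.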
+2  A: 

In both cases you ask the processor to move the value from a specified address. It's one level of indirection. In the first case you ask it to take the address from a specified register. In the second case you specify an offset directly.

x86 processors don't support double indirection in a single instruction, so it's not possible to request a load from an address that is itself stored somewhere in memory - you have to load the address into a register first.

Under a number of assemblers (MASM and the inline assembler built into VC++, for example) you could just as well write

mov eax, dword ptr some_variable

without brackets, it would mean the same.

You could write

mov eax, dword ptr [variable][ebx]

This tells the processor to take the address of "variable", add the value of ebx, and use the sum as the address from which to load a value. This is often used for accessing array elements by index.

In all these cases the processor would do the same - load a value from a specified address. It's one level of indirection each time.
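
To make the "one level of indirection each time" point concrete, here is a toy model in C++. It is an illustration only, not how any assembler or CPU is implemented: memory is modeled as a flat byte array, and all function names are made up for the example. The three addressing forms differ only in how the address is computed; the load itself is the same operation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model (an illustration, not real hardware): memory is a flat byte array.
using Memory = std::vector<std::uint8_t>;

// The single operation behind every "dword ptr [...]" source operand:
// read 4 bytes starting at the given address.
inline std::uint32_t load_dword(const Memory& mem, std::uint32_t address)
{
    std::uint32_t value = 0;
    std::memcpy(&value, mem.data() + address, sizeof value);
    return value;
}

// Helper used to set up the toy memory: write 4 bytes at the given address.
inline void store_dword(Memory& mem, std::uint32_t address, std::uint32_t value)
{
    std::memcpy(mem.data() + address, &value, sizeof value);
}

// mov eax, dword ptr [ebx]: the address comes from a register.
inline std::uint32_t mov_register_indirect(const Memory& mem, std::uint32_t ebx)
{
    return load_dword(mem, ebx);
}

// mov eax, dword ptr [some_variable]: the address is a constant
// (the variable's offset) encoded in the instruction.
inline std::uint32_t mov_direct(const Memory& mem, std::uint32_t offset_of_variable)
{
    return load_dword(mem, offset_of_variable);
}

// mov eax, dword ptr [variable][ebx]: the address is constant + register.
inline std::uint32_t mov_indexed(const Memory& mem,
                                 std::uint32_t offset_of_variable,
                                 std::uint32_t ebx)
{
    return load_dword(mem, offset_of_variable + ebx);
}
```

If a dword holding 0x100 is stored at address 16, then both mov_register_indirect(mem, 16) and mov_direct(mem, 16) return 0x100; only the source of the number 16 differs.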

sharptooth
"without brackets, it would mean the same." - that depends on the assembler used. MASM allows this; nasm/yasm/fasm don't (and imho with good reason - you *are* doing indirection when you read a variable, best to be explicit about it. That's what assembly is about, isn't it?)
snemarch
Yes, sure. One misunderstanding often means a long time spent debugging.
sharptooth
Hi sharptooth, if the effect is the same with and without [], why would people type the additional []? I think most developers are lazy. :-)
George2
As snemarch mentions in the comments above, brackets make the code cleaner and more explicit. Without brackets, someone may later change the "some_variable" identifier to a constant ("equ"), and the assembler will then produce different results. Those who maintain the code suffer a lot of pain from that moment on.
sharptooth
@George2> also, assemblers that *insist* on indirection brackets often use the lack of brackets to mean the offset of a variable. I.e., "mov eax, myvariable" in {n,f}asm is equivalent to "mov eax, offset myvariable" in masm.
snemarch
Didn't know this particular detail. But this means code without brackets will assemble differently on assemblers that interpret it differently. That can lead to numerous bugs and many, many hours of debugging.
sharptooth
@sharptooth, not every [] does nothing. The [] in xchg takes an additional level of indirection. Please check the newly edited part of my original post. Any comments?
George2
I am on the Windows platform, using Visual Studio. I am talking in the context of the assembly used on this platform.
George2
xchg doesn't take any additional level of indirection. First a value from a specified address is loaded into edx, then this value is itself used as an address in xchg. It's one level of indirection each time - the first one when doing 'mov', the second one when doing 'xchg'.
sharptooth
It's one level of indirection each time.
sharptooth
@sharptooth, I think the rule is: if dword ptr is followed by a register, then the register's value is treated as an address, and there is one level of indirection (the value at that address is loaded); if dword ptr is followed by a variable, there is no such level of indirection, and the variable's value is loaded
George2
(continued), you can refer to http://msdn.microsoft.com/en-us/library/56638b75(VS.71).aspx DWORD PTR [bp] means *(unsigned long *) ebp, there is an additional * at the beginning. So, in short, I think it depends on whether a register or a variable is used in [] whether there is
George2
(continued) an additional level of indirection. By indirection I mean whether it loads the value in [] itself or the value pointed to by the value in []. Any comments?
George2
It does the same thing both times - loads a value from some address. You're right that you can specify a register there, but that's not an additional level of indirection, since you just specify which register of a small set to use.
sharptooth
It doesn't load "the value itself". Which value? The offset is obtained with the offset keyword and it's a compile-time constant.
sharptooth
@sharptooth, let me confirm with you. In the case of mov eax, dword ptr [some_variable], suppose the value of some_variable is 0x100; then 0x100 is loaded into eax, so where is the additional level of indirection? But in the case of mov eax, dword ptr[edx], suppose edx is 0x100, then (continued)
George2
then the value will be loaded from address 0x100 and assigned to eax, not 0x100 itself. That is what I mean by one additional level of indirection when a register is used in [], but no such level of indirection when a variable is used in []. Any comments?
George2
It's the same for the processor. You could write "mov eax, 100" and the processor would copy the value "100". When you use dword ptr you tell it to load the value from an address. This is where the indirection is.
sharptooth
@sharptooth, let me explain my understanding to clear up the confusion. In the example of mov eax, dword ptr[some_variable], if the actual value of "some_variable" is 0x100, the indirection you mean is: find the address of "some_variable", then load from that address to get 0x100, and assign it to eax,
George2
(continued) is my understanding of what you mean by one level of indirection in your sample correct?
George2
Yep, the indirection is in "get the value from the address" versus "use this value". Compare how dword ptr [variable] works exactly the same way as dword ptr[variable][edx] - both times an address is computed and a value is read/written from/to that address.
sharptooth
@sharptooth, this is what confused me before: in the sample mov eax, dword ptr[edx], it is clear that edx contains the address, but in the sample mov eax, dword ptr[some_variable], what contains the address (i.e. the address of some_variable)?
George2
some_variable is the beginning of a block somewhere in program memory. The compiler knows the distance between the beginning of program memory and the start of this block. It's called the offset of the variable. In this case the offset is the value that serves as the address.
sharptooth
The offset is a compile-time known constant, that's true, but its value is used to access memory. This is indirection.
sharptooth
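
The same point can be sketched in C++, which is where the confusion comes from: naming a variable in C++ already implies "load from its address". The global and both function names below are hypothetical, chosen only for illustration, and the assembly shown in the comments is the rough correspondence, not guaranteed compiler output.

```cpp
#include <cassert>

// Hypothetical global, standing in for "some_variable" from the question.
int some_variable = 0x100;

// Like "mov eax, dword ptr [some_variable]": the address of some_variable is a
// constant known to the toolchain, and the value is loaded from that address.
inline int load_via_name()
{
    return some_variable;
}

// Like "mov ebx, offset some_variable" followed by "mov eax, dword ptr [ebx]":
// the same address is first placed in a pointer (the "register"), then one
// load is performed through it.
inline int load_via_pointer()
{
    int* ebx = &some_variable;  // ebx holds the address, not 0x100
    return *ebx;                // one level of indirection, same as above
}
```

Both functions return 0x100; the only difference is where the address comes from.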
@sharptooth, you are a real guru! I am too stuck in C. I think the key difference between a variable in C and in assembly is this: when we write a variable in C, it stands for its value; when we write a variable in assembly language, it is an "offset" and can be viewed as a pointer to (the address of) the variable
George2
(continued), so in assembly language, if it is not wrapped in [], the address itself will be used. For example, mov eax, some_variable will copy the address of some_variable, not the variable's value, correct?
George2
I've tried it in VC7 - it's treated as if "dword ptr" were used when the variable is 4 bytes; otherwise it doesn't compile and complains about an operand size mismatch, which is reasonable if no brackets implies "dword ptr".
sharptooth
So the answer is "it depends". VC7 thinks that no brackets means "dword ptr"; as snemarch mentions, there are assemblers that treat no brackets as the offset. That's why brackets are good.
sharptooth
@sharptooth, final confirmation: suppose I have an int variable named foo whose value is 0x100, and I want to put it in eax as the return value. Then I should use the statement mov eax, dword ptr[foo], and eax will contain the value 0x100?
George2
I've just tried it in VC7 - it works exactly as you say. If you use Visual Studio you can just put a breakpoint and see it.
sharptooth
Is there any way to access the address of the variable? I am confused because there seems to be no way to get the address (i.e. the "offset") of the variable: in VC7, for example, whether we wrap it in [] or not, the value is returned rather than the address of the variable.
George2
Looks like it's a limitation of VS that you can't use 'offset'. But you can use lea instruction.
sharptooth
@sharptooth, question answered. You are a powerful man in assembly language! Have a good weekend! :-)
George2
You can't use "offset" for local variables, since their address isn't fixed at compile/assemble-time. You have to use LEA (Load Effective Address). Behind the scenes, locals are referenced as [ESP+xx] or [EBP+xx] :)
snemarch
Completely forgot about that. This is why VS refuses to use "offset" with local variables but is completely happy with "offset" of globals.
sharptooth
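
The local-versus-global distinction has a rough analogue in standard C++, sketched below. The names are hypothetical, and this illustrates the idea rather than what the VC++ inline assembler does internally: the address of a global is a constant expression (like "offset" of a global), while the address of a local only exists at run time and has to be computed, which is the job LEA does in assembly.

```cpp
#include <cassert>

// Hypothetical global: it has static storage, so its address is fixed at
// link time and counts as a constant expression.
int global_counter = 7;

// Compiles: analogous to "offset global_counter" being accepted for globals.
constexpr int* address_of_global = &global_counter;

inline int read_local_through_address()
{
    int local_value = 42;

    // constexpr int* p = &local_value;  // would not compile: the address of a
    //                                   // local is stack-relative and only
    //                                   // known at run time.

    // The address is computed at run time, as "lea eax, local_value" would do.
    int* p = &local_value;
    return *p;
}
```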