Can someone please explain what the following x86 assembler instruction does?
call dword ptr ds:[00923030h]
It's an indirect call I suspect but exactly how does it compute the address to call?
Thanks
Marek
[EDIT] Updated
Whenever you see a memory operand that looks something like ds:0x00923030, that's a segment-relative addressing mode. The address being referred to is at offset 0x00923030 relative to the base address of the segment selected by the ds segment register.
Memory segmentation in the x86 architecture is somewhat confusing, and I think Wikipedia does a good job of explaining it.
Basically, x86 has a number of special segment registers: cs (code segment), ds (data segment), es, fs, gs, and ss (stack segment). Every memory access is associated with a certain segment register. Normally, you don't specify the segment register, and depending on how the memory is accessed, a default segment register is used. For example, the cs register is used for reading instructions.
Each segment register has a certain base address and a limit. The base address determines the linear address that offset 0x00000000 within the segment corresponds to, and the limit determines the maximum allowable offset for that segment. For example, if the base address were 0x00040000 and the limit were 0x0000FFFF, then the only valid offsets would be 0x00000000 to 0x0000FFFF, and the corresponding linear addresses would be 0x00040000 to 0x0004FFFF. (With paging disabled, the linear address is also the physical address.)
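The base-and-limit arithmetic above can be sketched in a few lines of Python. This is only an illustrative model: the function name `segment_translate` is made up for this example, and it ignores details such as descriptor granularity and expand-down segments.

```python
def segment_translate(base, limit, offset):
    """Model of segment translation: return base + offset if offset is within the limit.

    Hypothetical helper for illustration only; real hardware raises a fault
    rather than a Python exception.
    """
    if not 0 <= offset <= limit:
        raise ValueError(
            f"offset {offset:#010x} is outside the segment limit {limit:#010x}"
        )
    # Wrap to 32 bits, as a 32-bit x86 processor would.
    return (base + offset) & 0xFFFFFFFF


# The example from the text: base 0x00040000, limit 0x0000FFFF.
print(hex(segment_translate(0x00040000, 0x0000FFFF, 0x0000)))  # 0x40000
print(hex(segment_translate(0x00040000, 0x0000FFFF, 0xFFFF)))  # 0x4ffff
```

An offset of 0x10000 would be rejected here, because it lies one byte past the limit.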
Thus, the location that the instruction refers to is given by the base address of the ds segment, plus 0x00923030. But we're not done yet: that location is not the subroutine itself. The operand is a memory operand (dword ptr [...]), which adds an extra level of indirection, so the actual target of the call is the 32-bit address stored at the location ds:0x00923030.
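To make the extra level of indirection concrete, here is a small simulation. It is a sketch under stated assumptions: the memory image is a plain Python dict, the ds base is taken to be 0, and the stored target 0x00401000 is an invented value, not something taken from the question.

```python
import struct

# Hypothetical memory image: address -> byte. Only the four bytes of the
# pointer slot are populated; 0x00401000 is a made-up target address.
SLOT = 0x00923030
memory = dict(zip(range(SLOT, SLOT + 4), struct.pack("<I", 0x00401000)))


def read_dword(mem, address):
    """Read a little-endian 32-bit value from the byte-addressed dict `mem`."""
    raw = bytes(mem[address + i] for i in range(4))
    return struct.unpack("<I", raw)[0]


def indirect_call_target(mem, ds_base, offset):
    """Return the address that `call dword ptr ds:[offset]` would transfer to."""
    pointer_address = (ds_base + offset) & 0xFFFFFFFF  # where the pointer is stored
    return read_dword(mem, pointer_address)            # the value stored there


target = indirect_call_target(memory, ds_base=0, offset=SLOT)
print(hex(target))  # 0x401000
```

Note that 0x00923030 never becomes the call target; it only tells the processor where to find the target.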
In AT&T syntax (accepted by the GNU assembler), the instruction would be written as follows:
call *%ds:0x00923030
For the full gory details of what the instruction does, see the 80386 reference manual. Since the operand is a dword, this particular variant of the instruction is "CALL r/m32" (call near register indirect/memory indirect).
IIRC, it takes the value of the DS register, shifts it left 4 bits, adds the offset given in the instruction, and fetches a dword from the resulting memory location; that dword becomes the address to call. (EDIT: this holds true for 16-bit real mode, for protected mode see the other answers.)
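For the real-mode case, the shift-and-add can be written out as below. The function name and the segment:offset pair are made up for illustration; real-mode offsets are 16 bits wide, so the example does not reuse the 32-bit offset from the question.

```python
def real_mode_linear(segment, offset):
    """Real-mode address: 16-bit segment value shifted left 4 bits, plus a 16-bit offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


# Hypothetical segment:offset pair 1234h:0010h.
print(hex(real_mode_linear(0x1234, 0x0010)))  # 0x12350
```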
This specific opcode makes a call through the virtual address (32-bit here) residing at the location pointed to by the logical address ds:[00923030h].
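A logical address is made of two components: a segment selector (here ds) and an offset (here 00923030h). The processor adds the offset to the base address of the selected segment to obtain the address that is accessed.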
Please note that the above calculation yields a linear address, not a physical one (see the Intel manuals, volume 3A, figure 2.2). The linear address is then translated via the standard mechanism for 4 KB paging, i.e. the address consists of an index into the page directory, an index into the page table, and an offset into the selected page. Keep in mind, though, that all mainstream operating systems use the so-called flat memory model, which means that the segments used for code, data, and stack all have base address 0x00000000 with the limit set to 0xFFFFFFFF. This is the reason why you can cast between these segments, and it ultimately leads to (easy) exploitation of buffer overflows.
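As a rough illustration of how a linear address is carved up for classic 32-bit 4 KB paging (10 bits of page directory index, 10 bits of page table index, 12 bits of page offset), consider the sketch below. It assumes non-PAE paging and a flat segment, so the linear address equals the offset 0x00923030; the function name is invented for this example.

```python
def split_linear_address(linear):
    """Split a 32-bit linear address for non-PAE 4 KB paging (10 + 10 + 12 bits)."""
    directory_index = (linear >> 22) & 0x3FF  # index into the page directory
    table_index = (linear >> 12) & 0x3FF      # index into the page table
    page_offset = linear & 0xFFF              # offset into the 4 KB page
    return directory_index, table_index, page_offset


pde, pte, off = split_linear_address(0x00923030)
print(pde, hex(pte), hex(off))  # 2 0x123 0x30
```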
The assembler instruction you have given is very likely a call through the Import Address Table (see this great article for more details) of an executable file, i.e. it's pretty unlikely that this is an ordinary subroutine call.
Code like this is emitted by compilers because the final virtual address of a function imported from an external DLL cannot, in general, be known at compile time (due to rebasing of DLLs). With such a calling construct, the OS loader can write the correct virtual address into the location pointed to by the logical address, and the compiler doesn't need to care which address the function finally ends up at.
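The division of labour between compiler and loader can be mimicked with a toy model. Everything here is hypothetical: the dict stands in for the import table, `imported_function` stands in for a DLL export, and `load` stands in for the OS loader; none of it reflects the actual PE loading code.

```python
# Hypothetical import table: slot address -> callable. The slot is empty until
# the "loader" patches it, once the location of the imported function is known.
import_table = {0x00923030: None}


def imported_function():
    """Stand-in for a function exported by an external DLL."""
    return "hello from the DLL"


def load(table, slot, resolved):
    """Model the OS loader writing the resolved address into the slot."""
    table[slot] = resolved


def call_through(table, slot):
    """Model `call dword ptr ds:[slot]`: fetch the slot's content, then call it."""
    return table[slot]()


load(import_table, 0x00923030, imported_function)
print(call_through(import_table, 0x00923030))  # hello from the DLL
```

The calling code only ever mentions the slot, so it stays the same no matter where the imported function ends up.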