ansaurus

Question

Answer 1

+10 A:

[EDIT] Updated

Whenever you see a memory operand that looks something like ds:0x00923030, that's a segment-relative addressing mode. The actual address being referred to is at linear address 0x00923030 relative to the base address of the ds segment register.

Memory segmentation in the x86 architecture is somewhat confusing, and I think Wikipedia does a good job of explaining it.

Basically, x86 has a number of special segment registers: cs (code segment), ds (data segment), es, fs, gs, and ss (stack segment). Every memory access is associated with a certain segment register. Normally, you don't specify the segment register, and depending on how the memory is accessed, a default segment register is used. For example, the cs register is used for reading instructions.

Each segment register has a certain base address and a limit. The base address determines the physical address that linear address 0x00000000 corresponds to, and the limit determines the maximum allowable linear address for that segment. For example, if the base address were 0x00040000 and the limit was 0x0000FFFF, then the only valid linear addresses would be 0x00000000 to 0x0000FFFF, and the corresponding physical addresses would be 0x00040000 to 0x0004FFFF.
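As a rough illustration of the base/limit example above, here is a minimal Python sketch. It uses the numbers from that paragraph; the function name `translate` is made up for this sketch, and a real CPU raises a fault rather than a Python exception.

```python
def translate(base, limit, offset):
    """Map a segment-relative address to base + offset, enforcing the limit."""
    if offset > limit:
        # Stand-in for the fault a real CPU would raise on a limit violation.
        raise ValueError(f"offset {offset:#010x} exceeds limit {limit:#010x}")
    return base + offset

# Numbers from the example above: base 0x00040000, limit 0x0000FFFF.
print(hex(translate(0x00040000, 0x0000FFFF, 0x0000)))  # 0x40000
print(hex(translate(0x00040000, 0x0000FFFF, 0xFFFF)))  # 0x4ffff
```

Any address above the limit, such as 0x00010000 here, is rejected instead of being translated.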

Thus, the physical address at which the subroutine being called resides is given by the base address stored in the ds segment register, plus 0x00923030. But we're not done yet -- the instruction has the word ptr in it. This adds an extra level of indirection, so the actual target of the subroutine is the address stored at the location ds:0x00923030.
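The extra level of indirection can be sketched in Python as well. Everything here is hypothetical: the memory image is tiny, the slot offset 0x30 stands in for 0x00923030, the segment base is assumed to be zero, and the stored target 0x00401000 is a made-up subroutine address.

```python
import struct

def read_dword(memory, address):
    """Read a little-endian 32-bit value from a bytes-like memory image."""
    return struct.unpack_from("<I", memory, address)[0]

# Hypothetical memory image; in the real program the slot is at ds:0x00923030.
memory = bytearray(0x100)
ds_base = 0x00   # assumed segment base
slot = 0x30      # stand-in for the offset 0x00923030
struct.pack_into("<I", memory, slot, 0x00401000)  # made-up subroutine address

# The call does not jump to the slot itself; it jumps to the value stored there.
target = read_dword(memory, ds_base + slot)
print(hex(target))  # 0x401000
```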

In AT&T syntax (accepted by the GNU assembler), the instruction would be written as follows:

call *%ds:0x00923030

For the full gory details of what the instruction does, see the 80386 reference manual. This particular variant of the instruction is "CALL r/m16" (call near register indirect/memory indirect).

Adam Rosenfield 2009-02-05 22:38:28

Not quite, I think; there's an indirection involved. Thus it should be: The physical address at which the subroutine being called resides is given by the value at the base address stored in the ds segment register plus 0x00923030.

Knut Arne Vedaa 2009-02-05 22:58:50

A segment selector doesn't point to a physical address like you say, but to a linear address. The physical address instead is obtained in the final step, when the logical address has already been resolved to a linear address.

jn_ 2009-02-06 01:28:21

Thanks. It's great that people are willing to share their expertise in a particular area with others. You've saved me a great deal of time.

2009-02-06 14:19:24

Answer 2

+2 A:

IIRC, it takes the value of the DS register (and shifts it left 4 bits), adds to that the immediate value given, fetches a dword value from the resulting memory location, which becomes the address to call. (EDIT: this holds true for 16-bit real mode, for protected mode see the other answers.)
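For the 16-bit real mode case described above, the arithmetic looks roughly like this in Python. The segment and offset values are made up for illustration (a real-mode offset is only 16 bits wide, so 0x00923030 itself would not fit), and the function name is hypothetical.

```python
def real_mode_address(segment, offset):
    """Real mode: shift the segment register left 4 bits, then add the offset."""
    return (segment << 4) + offset

# Illustrative values only: DS = 0x1234, offset = 0x0010.
print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
```

The dword stored at the resulting address is then fetched and used as the call target.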

Knut Arne Vedaa 2009-02-05 22:41:34

This is wrong: the offset is not added to the value of the ds register. ds:offset is actually a logical address.

jn_ 2009-02-06 12:07:01

You're right, I was thinking about real mode.

Knut Arne Vedaa 2009-02-06 16:07:21

Answer 3

+5 A:

This specific opcode makes a call through the virtual address (32-bit here) residing at the location pointed to by the logical address ds:[00923030h].
A logical address is made of two components:

1. A 16-bit segment selector, ds in this case, which is basically an index into the (global / local) descriptor table managed by the operating system. The selector also carries a requested privilege level (RPL); the access rights themselves are stored in the descriptor and are checked against the current privilege level (CPL) upon access.
2. A 32-bit offset

The final address is then computed as follows: base address fetched from the descriptor that the selector refers to + offset
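To make the two components concrete, here is a small Python sketch of how a selector is split into its fields and how the base is combined with the offset. The selector value 0x0023, the one-entry descriptor table and the function names are assumptions chosen for illustration; only the field layout (index in bits 3 to 15, table indicator in bit 2, RPL in bits 0 to 1) is taken from the architecture.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,                       # descriptor table index
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator bit
        "rpl": selector & 0x3,                        # requested privilege level
    }

def linear_address(descriptor_table, selector, offset):
    """Linear address = base from the selected descriptor + offset."""
    fields = decode_selector(selector)
    base = descriptor_table[fields["index"]]["base"]
    return (base + offset) & 0xFFFFFFFF

# Hypothetical flat data segment at index 4 of a made-up descriptor table.
gdt = {4: {"base": 0x00000000, "limit": 0xFFFFFFFF}}
print(decode_selector(0x0023))
print(hex(linear_address(gdt, 0x0023, 0x00923030)))  # 0x923030
```

With a base of zero the linear address equals the offset, which is why the segment part can usually be ignored when reading such disassembly.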

Please note that the above calculation yields a linear address, not a physical one (see Intel manuals volume 3A, figure 2.2), which is then translated via the standard mechanism for 4KB paging, i.e. the linear address consists of an index into the page directory, an index into the page table and an offset into the selected page. Keep in mind, though, that all mainstream operating systems use the so-called flat memory model, which means that all segment selectors point to address 0x00000000 with the limit set to 0xFFFFFFFF. That is the reason why you can cast between all segments, and it ultimately leads to (easy) exploitation of buffer overflows.
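The paging split mentioned above can be sketched as follows. This assumes classic 32-bit paging with 4KB pages (10-bit directory index, 10-bit table index, 12-bit page offset) and no PAE; the function name is made up for this example.

```python
def split_linear(linear):
    """Split a 32-bit linear address for 4KB paging (non-PAE assumed)."""
    directory = (linear >> 22) & 0x3FF  # index into the page directory
    table = (linear >> 12) & 0x3FF      # index into the page table
    offset = linear & 0xFFF             # offset into the 4KB page
    return directory, table, offset

directory, table, offset = split_linear(0x00923030)
print(directory, hex(table), hex(offset))  # 2 0x123 0x30
```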

The assembler instruction you have given is very likely a call through the Import Address Table (see this great article for more details) of an executable file, i.e. it's pretty unlikely that this is an ordinary subroutine call.
Code like this is emitted by compilers because the final virtual address of a function imported from an external DLL cannot in general be known at compile time (due to rebasing of DLLs). By using such a calling construct, the OS loader can insert the correct virtual address at the location pointed to by the logical address, and the compiler doesn't need to care which address the function finally has.
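A loose Python analogy of this mechanism, with the import table modelled as a dictionary of callables. All names (`import_table`, `loader_resolve`, `message_box`, the "MessageBoxA" key) are made up for this sketch; a real loader patches raw addresses into the table rather than Python objects.

```python
def message_box(text):
    """Stand-in for a function exported by an external DLL."""
    return f"MessageBox: {text}"

# The slot is empty at "compile time"; the call site only knows where the slot is.
import_table = {"MessageBoxA": None}

def call_site(text):
    # Analogue of the indirect call: fetch the pointer from the slot, then call it.
    return import_table["MessageBoxA"](text)

def loader_resolve(table, exports):
    """Fill every slot with the address the export ended up at."""
    for name in table:
        table[name] = exports[name]

loader_resolve(import_table, {"MessageBoxA": message_box})
print(call_site("hello"))  # MessageBox: hello
```

The call site never changes; only the contents of the slot do, which is what makes rebasing the DLL harmless.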

jn_ 2009-02-06 01:18:37

Thanks. This is indeed the code generated by a compiler for a virtual function call. Your reply has helped a great deal and saved me much time.

2009-02-06 14:21:41


meaning of x86 assembler instruction
