Hi

I have the following basic questions:

  • When should we involve disassembly in debugging?

  • How do we interpret disassembly? For example, what does each segment in the listing below stand for?

00637CE3 8B 55 08             mov         edx,dword ptr [arItem]
00637CE6 52                   push        edx
00637CE7 6A 00                push        0
00637CE9 8B 45 EC             mov         eax,dword ptr [result]
00637CEC 50                   push        eax
00637CED E8 3E E3 FF FF       call        getRequiredFields (00636030)
00637CF2 83 C4 0C             add

Language : C++

Platform : Windows

+3  A: 

It's quite useful for estimating how efficient the code emitted by the compiler is.

For example, if you use std::vector::operator[] in a loop, it's quite hard to guess without disassembly that each call to operator[] can in fact require two memory accesses, whereas using an iterator for the same loop would require only one.
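As a minimal sketch of the two loop styles being compared (the function names sumByIndex and sumByIterator are made up for illustration), something like the following could be compiled and inspected in the disassembly window. Whether the indexed version really reloads the vector's base pointer on every iteration depends on the compiler and the optimization settings, so treat the comments as what to look for rather than a guaranteed result.

```cpp
#include <cstddef>
#include <vector>

// Indexed loop: each v[i] conceptually loads the vector's internal
// data pointer and then the element at that pointer plus i.
long sumByIndex(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Iterator loop: the iterator already holds the element address,
// so each step conceptually needs only the element load.
long sumByIterator(const std::vector<int>& v) {
    long total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}
```

Both functions return the same sum; the interesting difference, if any, shows up only in the instructions generated for the loop bodies.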

In your example:

mov         edx,dword ptr [arItem] // the value stored at address "arItem" is loaded into the edx register
push        edx // that register is pushed onto the stack
push        0 // zero is pushed onto the stack
mov         eax,dword ptr [result] // the value stored at address "result" is loaded into the eax register
push        eax // that register is pushed onto the stack
call        getRequiredFields (00636030) // getRequiredFields function is called

This is a typical sequence for calling a function: parameters are pushed onto the stack, and then control is transferred to the function's code (the call instruction).

Disassembly is also quite useful when participating in arguments about "how it works after compilation", as caf points out in his answer to this question.
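As for the columns in the original listing: the first is the address of the instruction, the second is the raw machine-code bytes, and the rest is the mnemonic with its operands. The sketch below checks this against the call line, assuming the standard x86 encoding of a near relative call (opcode E8 followed by a 32-bit little-endian displacement measured from the next instruction). The helper name callTarget is made up for illustration.

```cpp
#include <cstdint>

// Target of a near relative call: the 5-byte instruction is E8 plus a
// 32-bit little-endian displacement, and the displacement is relative
// to the address of the next instruction (instructionAddress + 5).
constexpr std::uint32_t callTarget(std::uint32_t instructionAddress,
                                   std::uint8_t d0, std::uint8_t d1,
                                   std::uint8_t d2, std::uint8_t d3) {
    return instructionAddress + 5u +
           (static_cast<std::uint32_t>(d0) |
            (static_cast<std::uint32_t>(d1) << 8) |
            (static_cast<std::uint32_t>(d2) << 16) |
            (static_cast<std::uint32_t>(d3) << 24));
}

// Line from the listing:
// 00637CED E8 3E E3 FF FF       call        getRequiredFields (00636030)
static_assert(callTarget(0x00637CEDu, 0x3E, 0xE3, 0xFF, 0xFF) == 0x00636030u,
              "displacement should resolve to the address shown in parentheses");
```

The displacement bytes 3E E3 FF FF form a negative offset, so the call jumps backwards to 00636030, which is the address the debugger prints next to the function name.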

sharptooth
Why would `vector<T>::operator[]` in a loop require two memory accesses? Load the base T* in a register, and use a single indexed memory access.
MSalters
If you call operator[] in a loop, the compiler can have a hard time seeing that the buffer start is unchanged, so it may reload the base T* on each iteration.
sharptooth
+1  A: 

When you should involve disassembly: When you want to know exactly what the CPU is doing when it's executing your program, or when you don't have the source code in whatever higher-level language the program was written in (C++ in your case).

How to interpret assembly code: Learn assembly language. You can find an exhaustive reference on Intel x86 CPU instructions in Intel's processor manuals.

The piece of code that you posted prepares arguments for a function call (by getting and pushing some values on the stack and putting a value in the register eax), and then calls the function getRequiredFields.

Jesper
+3  A: 

1 - We should (or at least I do) involve disassembly in debugging only as a last resort. Generally, an optimizing compiler generates code that is not trivial for a human to understand. Instructions are reordered, dead code is eliminated, some code is inlined, and so on. So it is rarely necessary to understand disassembled code, and not easy when it is necessary. For example, I sometimes look at the disassembly to see whether constants are encoded as part of the instruction or are stored in const variables.

2 - That piece of code calls a function like getRequiredFields(result, 0, arItem). You have to learn assembly language for the processor you want. For x86, go to www.intel.com and get the IA-32 manuals.
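A rough sketch of what the source around that call might look like is below. The types Item and Result, the parameter names, the function body, and the wrapper callSite are all hypothetical: the listing only shows that three 4-byte values are pushed, not what their real types are. The bytes 83 C4 0C on the last line of the listing encode add esp,0Ch, which removes 12 bytes (three arguments) from the stack after the call and is consistent with a caller-cleans-up convention such as cdecl.

```cpp
// Hypothetical types and signature, for illustration only.
struct Item { int id; };
struct Result { int fieldCount; };

// Placeholder body; the real function's behavior is not shown in the listing.
int getRequiredFields(Result* result, int flags, const Item* arItem) {
    result->fieldCount = (arItem != nullptr) ? arItem->id + flags : 0;
    return result->fieldCount;
}

int callSite(Result* result, const Item* arItem) {
    // In the listing the arguments are pushed right to left:
    // arItem first, then 0, then result, followed by the call.
    return getRequiredFields(result, 0, arItem);
}
```

Reading the pushes in reverse order is how the argument list result, 0, arItem is recovered from the disassembly.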

Malkocoglu
When debugging optimized code is *exactly* when you want to use disassembly. The execution of the source code will not make any sense exactly because of all the inlining and reordering. The only code that will make sense is the disassembly.
Zan Lynx
@Zan Lynx: On second thought, you are correct. My point was that the disassembly listing would be very different from the "seen" C++ source code and code flow, so it would confuse him rather than help.
Malkocoglu