I've been messing around with the free Digital Mars Compiler at work (naughty I know), and created some code to inspect compiled functions and look at the byte code for learning purposes, seeing if I can learn anything valuable from how the compiler builds its functions. However, recreating the same method in MSVC++ has failed miserably and the results I am getting are quite confusing. I have a function like this:

unsigned int __stdcall test()
{
  return 42;
}

Then later I do:

unsigned char* testCode = (unsigned char*)test;

I can't seem to get the C++ static_cast to work in this case (it produces a compiler error), hence the C-style cast, but that's beside the point. I've also tried taking the address explicitly with &test, but that doesn't help.

Now, when I examine the contents of the memory pointed to by testCode I am confused because what I see doesn't even look like valid code, and even has a debug breakpoint stuck in there... it looks like this (target is IA-32):

0xe9, 0xbc, 0x18, 0x00, 0x00, 0xcc...

This is clearly wrong: 0xe9 is a relative jump instruction, and looking 0xbc bytes away I see this:

0xcc, 0xcc, 0xcc...

i.e. memory initialised to the debug breakpoint opcode as expected for unallocated or unused memory.

Whereas what I would expect from a function returning 42 would be something like:

0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3

or at least some flavour of mov followed by a ret (0xc2, 0xc3, 0xca or 0xcb) a little further down.

Is MSVC++ taking steps to prevent me from doing this sort of thing for security reasons, or am I doing something stupid and not realising it? This method seems to work fine using DMC as the compiler...

I'm also having trouble going the other way (executing bytes), but I suspect that the underlying cause is the same.

Any help or tips would be greatly appreciated.

+1  A: 

If you want to look at assembly and machine code for a given compiled function, it'll be easier to supply the /FAcs command line option to the compiler and look at the ensuing .asm file.

I'm not sure what the defined behavior is for casting a function pointer to a byte stream (it may not even work properly), but one possible source of additional confusion is that x86 instructions are variable-length and their multi-byte operands are stored little-endian.

Crashworks
+1  A: 

If this is with incremental linking turned on, then what you're seeing is a jmp [destination]. You can run the debugger and see what the disassembly is to verify as well.
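To verify this from the bytes themselves, here is a minimal sketch of decoding such a stub. It assumes the stub is a 5-byte near jump (opcode 0xE9 followed by a 32-bit little-endian displacement that is relative to the end of the instruction). The helper names jumpTargetOffset and followJumpThunk are made up for illustration and are not part of any MSVC API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: if `code` starts with an E9 rel32 near jump, return the
// distance from `code` to the jump target; otherwise return 0 (stay put).
std::int64_t jumpTargetOffset(const unsigned char* code)
{
    assert(code != nullptr);
    if (code[0] != 0xE9) {
        return 0; // not a near jump, so there is nothing to follow
    }
    // The displacement is stored little-endian: least significant byte first.
    const std::uint32_t raw = static_cast<std::uint32_t>(code[1])
                            | static_cast<std::uint32_t>(code[2]) << 8
                            | static_cast<std::uint32_t>(code[3]) << 16
                            | static_cast<std::uint32_t>(code[4]) << 24;
    // Reinterpret as signed; assumes two's complement, as on x86.
    const std::int32_t rel = static_cast<std::int32_t>(raw);
    // The displacement counts from the end of the 5-byte instruction.
    return 5 + static_cast<std::int64_t>(rel);
}

// Hypothetical helper: address the stub jumps to, or `code` itself if the
// bytes do not start with E9.
const unsigned char* followJumpThunk(const unsigned char* code)
{
    return code + static_cast<std::ptrdiff_t>(jumpTargetOffset(code));
}
```

For the bytes in the question (0xe9, 0xbc, 0x18, 0x00, 0x00), the displacement is 0x000018bc rather than 0xbc, so the real function body would be 0x18bc + 5 bytes past the start of the stub.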

MSN
+2  A: 

I can only guess, but I'm pretty sure you are inspecting a debug build. In debug mode MSVC++ routes all function calls through jump stubs. This means that the address you get for a function points to a jump to the real function, which is exactly what you are seeing here.
The surrounding 0xCC bytes are indeed breakpoint instructions. They are there to trigger an attached debugger if execution ever ends up somewhere it shouldn't.
Try the same with a release build. That should work as expected.

Edit: This is actually affected by the linker setting /INCREMENTAL. The reason that the effect you're describing doesn't show up in release builds is that these jump stubs are simply optimized away if any kind of optimization is turned on (which is of course usually the case for release builds).

jn_
+2  A: 

For your cast you want:

unsigned char* testCode = reinterpret_cast<unsigned char*>( test );

Switch Debug Information Format from 'Program Database for Edit & Continue (/ZI)' to 'Program Database (/Zi)' in Project -> Properties -> C/C++ -> General. I believe it's that setting which causes the compiler to insert jump code so the debugger can rebuild a function and hot patch it in while the program is running. Probably turn off 'Enable Minimal Rebuild' also.

A much simpler way of inspecting the code in MSVC is to set a breakpoint and inspect the disassembly (right-click on the line and select 'Go To Disassembly' from the pop-up menu). It annotates the disassembly with the source code so you can see what each line is compiled to.
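If the bytes are needed at runtime rather than in the debugger, a small dump routine is enough. This is only a sketch: hexDump is a hypothetical helper name, and it simply formats whatever byte range it is given in the same style as the listing in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format `count` bytes starting at `bytes` as
// "0xe9, 0xbc, ..." (lower-case hex, comma separated).
std::string hexDump(const unsigned char* bytes, std::size_t count)
{
    assert(bytes != nullptr || count == 0);
    std::string out;
    char buf[8]; // room for "0x" + two hex digits + terminator
    for (std::size_t i = 0; i < count; ++i) {
        std::snprintf(buf, sizeof buf, "0x%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) {
            out += ", ";
        }
        out += buf;
    }
    return out;
}
```

With the cast above, something like hexDump(testCode, 16) would print the first 16 bytes at the function's address. The count is a guess, since the length of a compiled function is not available at runtime, and converting a function pointer to an object pointer is only conditionally supported by the C++ standard, although MSVC accepts it.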

Rob K
Thanks for the useful information, this fixed the problem with the C++ casting... I wrongly assumed reinterpret_cast was only for working around issues with inheritance. I am aware of 'Go To Disassembly', but I am looking for something at runtime. Thanks very much though. Good answer. :)
jheriko