A colleague of mine had a problem with some C++ code today. He was debugging the weird behaviour of an object's virtual method. Whenever the method was called (under the debugger in Visual Studio 2005), everything went wrong: the debugger wouldn't step into that method, but into the object's destructor instead! Also, the object's virtual table listed only its destructor and no other methods.

I haven't seen this behaviour before, and a runtime error was printed, saying something about the ESP register. I wish I could give you the right error message, but I don't remember it correctly now.

Anyway, have any of you guys ever encountered that? What could cause such behaviour? How would that be fixed? We tried rebuilding the project many times and restarting the IDE, but nothing helped. We also called the _CrtCheckMemory function before that method call to make sure the memory was in a good state, and it returned true (which means OK). I have no more ideas. Do you?

+2  A: 

I've seen that before. Generally it occurs because I'm using a class from a Release-built .LIB file while I'm in Debug mode. Someone else has probably seen a better example, and I'd defer to their answer.

wheaties
I'm not sure this is the case. We rebuilt the whole thing, in debug mode.
Geo
Reboot your machine and do a clean rebuild?
wheaties
I'll try that in the morning.
Geo
It didn't help. We made sure nothing from a configuration was used in another. Still nothing.
Geo
I'd look at what others have suggested. Particularly Greg's answer.
wheaties
+1  A: 

Maybe you use C-style casts where a static_cast<> is required? This may result in the kind of error you report, whenever multiple inheritance is involved, e.g.:

class Base1 {};
class Base2 {};
class Derived : public Base1, public Base2 {};

Derived *d = new Derived;
Base2* b2_1 = (Base2*)d; // wrong!
Base2* b2_2 = static_cast<Base2*>(d); // correct
assert( b2_1 == b2_2 ); // assertion may fail, because b2_1 != b2_2

Note that this may not always be the case; it depends on the compiler and on the declarations of all the classes involved (it probably happens when all classes have virtual methods, but I do not have the exact rules at hand).
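One way to check this on your own compiler is the sketch below. It is a minimal illustration with hypothetical classes, not the asker's code, and the "different address" part assumes a typical ABI where a non-empty first base sits at offset 0. As far as the language rules go, a C-style cast between related class types whose definitions are visible is resolved as a static_cast, so the pointer is adjusted. An unadjusted pointer more typically comes from reinterpret_cast, or from a C-style cast written where Derived is only forward-declared.

```cpp
#include <cassert>

// Hypothetical classes for illustration. Each base has a data member so
// that the Base2 subobject cannot share the address of the whole object.
struct Base1 { int a = 1; virtual ~Base1() {} };
struct Base2 { int b = 2; virtual ~Base2() {} };
struct Derived : Base1, Base2 {};

// With complete class definitions in scope, the C-style cast is resolved
// as a static_cast, so both expressions yield the same adjusted pointer.
bool c_style_matches_static_cast(Derived* d) {
    return (Base2*)d == static_cast<Base2*>(d);
}

// reinterpret_cast keeps the address of d and skips the base-class offset,
// so the result does not point at the Base2 subobject.
bool reinterpret_cast_skips_adjustment(Derived* d) {
    void* adjusted = static_cast<Base2*>(d);
    void* raw = reinterpret_cast<Base2*>(d);
    return raw == static_cast<void*>(d) && raw != adjusted;
}

// Runs both checks on a local object.
void check_casts() {
    Derived d;
    assert(c_style_matches_static_cast(&d));
    assert(reinterpret_cast_skips_adjustment(&d));
}
```

Calling check_casts() from main should pass silently on mainstream compilers. If the first assertion fails, the cast is most likely being compiled without the full definition of Derived in scope.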

OR: A completely different part of your code is going wild and is overwriting memory. Try to isolate the error and check if it still occurs. _CrtCheckMemory will find only a few cases where you overwrite memory (e.g. when you write into specially marked heap management locations).

frunsi
Well, I agree using `static_cast` over C-style casts is preferred, however there's nothing inherently wrong with the C-style casts in your code - it will handle the inheritance just fine.
sbk
No, the C-style cast is wrong, here is a more detailed explanation: http://en.wikipedia.org/wiki/Virtual_method_table#Multiple_inheritance_and_thunks - interestingly, but incidentally, the example looks similar to mine =)
frunsi
Well, at least the VS2008 compiler did not fix the pointer. Though another Wikipedia article suggests that a C-style cast should do this: http://en.wikipedia.org/wiki/Thunk#Thunks_in_object-oriented_programming - I am confused; either Wikipedia is wrong, or it is a bug in the VS2008 compiler.
frunsi
+1  A: 

If you ever call a function with the wrong number of parameters, this can easily end up trashing your stack and producing undefined behaviour. I seem to recall that certain errors when using MFC could easily cause this, for example if you use the dispatch macros to point a message at a method that doesn't have the right number or type of parameters (I seem to recall that those function pointers aren't strongly checked for type). It's been probably a decade since I last encountered that particular problem, so my memory is hazy.
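To illustrate the type-checking point, here is a small sketch of how a handler's signature could be verified at compile time instead of being hidden behind a cast. All names (Window, ResizeHandler, has_signature) are hypothetical, and this is not how MFC's dispatch macros are actually implemented; it only shows the kind of check that a weakly typed function pointer skips.

```cpp
#include <type_traits>

// Hypothetical window class with two handlers of different signatures.
struct Window {
    void OnResize(int, int) {}
    void OnClose() {}
};

// Signature that the (hypothetical) dispatcher will call through.
using ResizeHandler = void (Window::*)(int, int);

// True only when the handler's type is exactly the expected one, so a
// mismatch can be rejected by static_assert rather than masked by a cast.
template <typename Expected, typename Actual>
constexpr bool has_signature(Actual) {
    return std::is_same<Expected, Actual>::value;
}

// OnResize matches the dispatcher's expectation.
static_assert(has_signature<ResizeHandler>(&Window::OnResize),
              "OnResize has the expected signature");
// OnClose does not; casting it to ResizeHandler and calling it with two
// arguments would be undefined behaviour.
static_assert(!has_signature<ResizeHandler>(&Window::OnClose),
              "OnClose must not be registered as a resize handler");
```

The design choice here is to fail the build on a mismatch, since a mismatched call through a cast pointer is only noticed at run time, if at all.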

Greg Hewgill
+1 for mentioning MFC's odd behavior when parameter counts don't match.
Mark Ransom
+1  A: 

The value of ESP was not properly saved across a function call.

This sort of behaviour is usually indicative of the calling code having been compiled with a different definition of a class or function than the code that created the particular class in question.

Is it possible that a different version of a component dll is being loaded instead of the freshly built one? This can happen if you copy things as part of a post-build step or if the process is run from a different directory or changes its dll search path before doing a LoadLibrary or equivalent.

I've encountered it most often in complex projects where a class definition is changed to add, remove or change the signature of a virtual function and then an incremental build is done and not all the code that needs to be recompiled is actually recompiled. Theoretically, it could happen if some part of the program is overwriting the vptr or vtables of some polymorphic objects, but I've always found that a bad partial build is a much more likely cause.

This may be 'user error', where a developer deliberately tells the compiler to build only one project when others should be rebuilt, or it can be caused by having multiple solutions, or multiple projects in a solution, where the dependencies are not correctly set up.

Very occasionally, Visual Studio can slip up and not get the generated dependencies correct even when the projects in a solution are correctly linked. This happens less often than Visual Studio is blamed for it.

Expunging all intermediate build files and rebuilding everything from source usually fixes the problem. Obviously for very large projects this can be a severe penalty.
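When a full rebuild is too expensive to do routinely, one possible safeguard against loading a stale component is a version handshake at startup. The sketch below is only an illustration under assumptions: kInterfaceVersion, component_interface_version and the other names are hypothetical, and the "exported" functions are simulated in a single file rather than loaded from a real dll.

```cpp
#include <cassert>

// Hypothetical constant, imagined as living in the shared header. It is
// bumped whenever a virtual function is added, removed, or re-signed.
constexpr int kInterfaceVersion = 3;

// Imagined as exported by a freshly built component: it reports the
// header version the component was compiled against.
int component_interface_version() { return kInterfaceVersion; }

// Simulates a stale component that was built against an older header.
int stale_component_version() { return 2; }

// The host compares its own compiled-in version with what the loaded
// component reports, before calling any virtual function on its objects.
bool component_is_compatible(int host_version, int (*query)()) {
    return query != nullptr && query() == host_version;
}

// Example startup check: stops a debug build if the versions disagree.
void check_component(int (*query)()) {
    assert(component_is_compatible(kInterfaceVersion, query));
}
```

A mismatch then shows up as an explicit failure at load time rather than as a call that lands in the wrong vtable slot.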

Charles Bailey
+1  A: 

Since it's guess-fest anyway, here's one from me:

Your stack is messed up and _CrtCheckMemory doesn't check for that. As to why the stack is corrupted:

  • good old stack overflow
  • calling convention mismatches, which was already mentioned (I don't know, like passing a callback in the wrong calling convention to a WinAPI function; what static or dynamic libraries are you linking with?)
  • a line like printf("%d");
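For the last item, a minimal sketch of the correct pattern: every conversion specifier in the format string gets a matching argument. The helper name format_count is hypothetical. GCC and Clang can usually catch the missing-argument case at compile time with -Wformat (enabled by -Wall).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a count, passing one int for the one %d.
std::string format_count(int count) {
    char buffer[32];
    // printf("%d") with no argument has undefined behaviour: it reads a
    // value that was never passed. Here the argument list matches.
    int written = std::snprintf(buffer, sizeof buffer, "count=%d", count);
    // snprintf returns the length it needed; check that it fit the buffer.
    assert(written > 0 && written < static_cast<int>(sizeof buffer));
    return std::string(buffer);
}
```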
sbk