Hello,

This guy:

virtual phTreeClass* GetTreeClass() const { return (phTreeClass*)m_entity_class; }

When called, it crashed the program with an access violation, even after a full recompile. All other member functions and virtual member functions had correct memory addresses (I hovered the mouse over the methods in debug mode), but this function had a bad memory address: 0xfffffffc.

Everything looked okay, including the 'this' pointer, and everything works fine up until this function call. This function is also pretty old and I haven't changed it for a long time. The problem suddenly popped up after some work; I commented all of that work out to see what was causing it, without success.

So I removed the virtual, compiled, and it worked fine. I added virtual back, compiled, and it still worked fine! I had basically changed nothing, and remember that I had done a full recompile earlier and still had the error back then.

I wasn't able to reproduce the problem. But now it is back. I didn't change anything. Removing virtual fixes the problem.

Sincerely,

Antoon

A: 

Compilers and linkers are pieces of software written by humans like any other, and thus inherently cannot be error-free.

We occasionally run into such inexplicable issues and fixes too. There's a myth going around here that deleting the ncb file once fixed a build.

Ofek Shilon
True, but blaming the compiler and/or linker almost always turns out to be wrong. :)
Troubadour
@Ofek It's actually more likely that rebuilding caused a previously un-rebuilt file to get recompiled, thus fixing the problem. A compiler that behaves incorrectly all the time is unlikely, but one that works sometimes and fails sometimes is even less likely.
Mark B
You're probably right. I'm just frustrated because I *am* experiencing positive linker issues at the moment (incredibuild/local builds compatibility, don't ask).
Ofek Shilon
+1  A: 

Don't ever use C-style casts with polymorphic types unless you're absolutely sure of what you're doing. The overwhelming probability is that you cast it to a type that it wasn't. Pointers convert implicitly only when the target is a base class, which is safe; if yours don't convert implicitly, then you're doing it wrong.
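To illustrate the point about casts, here is a minimal sketch of a checked downcast. The question does not show the declared type of `m_entity_class`, so the base class `phEntityClass`, the sibling `phRockClass`, the owner `phEntity`, and the helper `RequireTreeClass` are all hypothetical names. The idea is that `dynamic_cast` returns a null pointer when the object is not actually a `phTreeClass`, where a C-style cast would silently hand back a pointer of the wrong type.

```cpp
#include <cassert>

// Hypothetical stand-ins: the real class hierarchy is not shown in the question.
struct phEntityClass {
    virtual ~phEntityClass() = default;  // polymorphic base, required for dynamic_cast
};
struct phTreeClass : phEntityClass {
    int leaf_count = 0;
};
struct phRockClass : phEntityClass {};

struct phEntity {
    explicit phEntity(phEntityClass* c) : m_entity_class(c) {}
    virtual ~phEntity() = default;

    // Checked downcast: yields nullptr when m_entity_class is not a phTreeClass.
    virtual phTreeClass* GetTreeClass() const {
        return dynamic_cast<phTreeClass*>(m_entity_class);
    }

    // Debug-build variant: fail at the bad cast instead of crashing later.
    phTreeClass& RequireTreeClass() const {
        phTreeClass* tree = GetTreeClass();
        assert(tree != nullptr && "m_entity_class is not a phTreeClass");
        return *tree;
    }

    phEntityClass* m_entity_class;  // assumed to be stored as a base pointer
};
```

If the cast is known to be correct and the run-time check is too expensive, `static_cast<phTreeClass*>` at least documents the intent and refuses to compile between unrelated types, which a C-style cast does not.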

DeadMG
Even then, don't use them: one of the C++ casts will work, and using it will show what you mean.
Mark
Thanks, I was still working on getting C++ style casts in there, and now changed it inside these methods. I only use base to child casting in this case because I do not want to waste memory on saving a pointer for every child class.
Xilliah
Your bug likely exists because it isn't the child class. If you have a child*, store it as a child*. If you have a base*, don't downcast it to a child*.
DeadMG
A: 

Given that recompiling originally fixed the problem, try doing a full clean and rebuild first.

If that fails, then it looks extremely likely that even though your this pointer appears correct to you, the object has in fact been deleted/destroyed and the pointer refers to garbage memory that just happens to look like the real object that was there before. If you're using gdb to debug, the first word at the object's address will be the vtable pointer. If you do an x/16xw <addr> (for example) memory dump at that location, gdb will tell you what sort of object's vtable resides there. If it's the parent-most type then the object is definitely gone.

Alternatively, if the this pointer is the same every time, you can put a breakpoint in the class destructor with the condition that this == known_addr.
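When a conditional breakpoint is not practical, the same question ("has this object already been destroyed?") can be asked in code. The following is only a sketch of one possible debug aid, not something from the original program: `LiveTracker`, `TrackedEntity`, and `checked` are hypothetical names. Constructors register the object's address, the destructor removes it, and `checked` asserts that a pointer still refers to a registered object before it is used.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical debug helper: remembers which addresses currently hold a live object.
class LiveTracker {
public:
    static void add(const void* p) { live().insert(p); }
    static void remove(const void* p) { live().erase(p); }
    static bool is_alive(const void* p) { return live().count(p) != 0; }

private:
    static std::unordered_set<const void*>& live() {
        static std::unordered_set<const void*> addresses;
        return addresses;
    }
};

// Hypothetical class standing in for the one that owns GetTreeClass().
struct TrackedEntity {
    TrackedEntity() { LiveTracker::add(this); }
    TrackedEntity(const TrackedEntity&) { LiveTracker::add(this); }
    virtual ~TrackedEntity() { LiveTracker::remove(this); }  // also a good breakpoint location

    virtual int GetKind() const { return 1; }
};

// Check at the call site, before the virtual call dereferences the vtable pointer.
template <typename T>
T* checked(T* p) {
    assert(LiveTracker::is_alive(p) && "pointer refers to a destroyed object");
    return p;
}
```

A call would then read `checked(entity)->GetKind()`. This has a blind spot: if a new object is later constructed at the same address, the stale pointer passes the check, so treat a passing check as a hint rather than proof.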

Mark B
Thanks, especially for the last tip. I turned off memory freeing and it was still happening.
Xilliah