I have a class, C. C has a member variable declared as: bool markerStart;

From within C a call to sizeof(*this) gives a value of 0x216 bytes.

Elsewhere within C, I do: markerStart = false;

Rather than setting markerStart to false, this call is actually clobbering the start of the next class in memory!

Looking at the disassembled code, I find:

markerStart = false;
06FB6B7F mov            eax, dword ptr [this]
06FB6B78 mov            byte ptr [eax+218h], 0

The second move instruction is setting a byte at this + 0x218 to zero, but since the class is only 0x216 bytes long, this is clobbering memory!

In response to a comment, it definitely is the markerStart = false instruction. I can watch it happening in the disassembler view and in the memory view (and using Windbg, by using a data breakpoint). The first byte of the next class gets set to zero, which messes up its vftbl pointer.

Note: taking the address of markerStart and subtracting this from it yields 0x211!

Can anyone give me a clue on where to start looking to resolve this problem?

Update: Thanks for all the help. Without code, it was next to impossible for any of you to solve the problem. What I was looking for were hints as to where to start looking. Most of you provided excellent hints, so thank you!

I finally found the problem. In this case alignment had been set in one class, and not been correctly reset following the critical block of code. The class with the faulty alignment happened to get compiled immediately before the declaration of class C - hence that's where the problem showed up.

A: 

The class may have only 0x216 bytes, but the next object is of course 0x218 bytes after the start of your first object. Your objects are apparently aligned at 4 byte memory boundaries, which is the default.

You have to look elsewhere to find out where your memory gets clobbered. It's definitely not the 'markerStart = false' instruction.

Stefan
What about generated code [eax+218]? Unless the objects are aligned on 16 byte boundaries, that would be the next object being overwritten.
Mike
Doesn't make sense anyway. There is no padding between objects: (char*)(a+n) == (char*)(a) + n*sizeof(*a), by definition. If sizeof(C) is 0x216, the compiler can place C objects at any even address. It's quite common for the alignment of a class to be the strictest (largest) of its member alignments. 2 seems OK.
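To make the spacing rule in this comment concrete, here is a minimal sketch. Blob is a hypothetical stand-in for class C (just 0x216 bytes of char, so it has no internal padding), and the helper names are made up for illustration. It only shows that adjacent array elements sit exactly sizeof(Blob) bytes apart, whatever that size is.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for class C: 0x216 one-byte members.
struct Blob {
    char bytes[0x216];
};

// Byte distance between element 0 and element n of an array of Blob.
std::ptrdiff_t byte_distance(const Blob* a, std::size_t n) {
    return reinterpret_cast<const char*>(a + n) -
           reinterpret_cast<const char*>(a);
}

// Adjacent elements are exactly sizeof(Blob) apart: the array itself
// adds no padding, any padding is already counted inside sizeof.
void check_spacing() {
    Blob pair[2];
    assert(byte_distance(pair, 1) ==
           static_cast<std::ptrdiff_t>(sizeof(Blob)));
}
```

So if a store lands at this + 0x218 while one view of the class says it is 0x216 bytes, the two views of the layout disagree; the gap is not array padding.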
MSalters
+7  A: 

You need to post more code -- even better would be if you could prune it down to the minimum class definition where your anomaly occurs. That in itself will probably help you identify what's happening.

Some possibilities that occur to me:

  1. You are referencing another markerStart variable which shadows the member variable you are interested in.
  2. You calculate the sizeof in the method of a base class of C. sizeof() only measures the static type, not the dynamic type.
  3. You have broken the One Definition Rule somewhere, and have two different versions of the class C (perhaps through some #ifdef in a header file which is interpreted differently in two translation units).

Without more information, I'd go for the ODR violation. Those can be insidious, and impossible to detect when compiling or linking.
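A small sketch of possibility 2, in case it helps. Base and Derived are hypothetical names, not the poster's classes. sizeof(*this) is resolved at compile time against the class that contains the method, so a base-class method reports the base size even when the object is really a derived one.

```cpp
#include <cassert>
#include <cstddef>

struct Base {
    char header[8];
    virtual ~Base() {}
    // Evaluated against the static type Base, regardless of the
    // dynamic type of the object this is called on.
    std::size_t size_seen_in_base() const { return sizeof(*this); }
};

struct Derived : Base {
    char payload[64];
    // Same expression, but here the static type of *this is Derived.
    std::size_t size_seen_in_derived() const { return sizeof(*this); }
};

void check_static_sizeof() {
    Derived d;
    assert(d.size_seen_in_base() == sizeof(Base));
    assert(d.size_seen_in_derived() == sizeof(Derived));
}
```

The exact numbers depend on the compiler and platform; the point is only that the two calls on the same object can disagree.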

Pontus Gagge
You are right about option 2. I am calculating sizeof(*this) directly in the VS watch window to get 0x216. If I put code in a method of C, sizeof(*this) gives 0x220. This still doesn't explain why [eax+218] is clobbering the next class! Thanks, Rob
Rob
That sounds like the IDE also is confused about what's going on. I'd still take a long hard look at possible ODR violations (option 3) if I were you. Multiple .h files? Direct #ifdefs? Indirect #ifdef dependencies through the declaration of member variables? Other macro trickery?
Pontus Gagge
Similar to ODR violations, I've seen a rare case of make improperly failing to rebuild one of the object files, leading to internally mismatched classes in the final executable.
Mike
Two different values for sizeof(C) suggest inconsistent definitions of class C in two translation units. See my answer below for another possibility.
Stephen C. Steel
A: 

Could this be an instance of the "Class slicing" problem?

Paul Mitchell
+1  A: 

Does your class have virtual methods? Or does it derive from a class with virtual methods? Or does it use multiple inheritance?

The answer depends on which compiler you are using: the compiler can store pointer(s) to the virtual table(s) in each object. This is an implementation detail; as long as the program behaves as the standard requires, the compiler can store any kind of data with each object.

Ismael
+1  A: 

Don't you have some weird pointer casting problem in the code? Something similar to this?

struct A
{
  int i;
};

struct B : public A
{
  int j;
  void f() { j=0; }
};

int main()
{
  A x;
  A* p=&x;
  ((B*)p)->f();
  return 0;
}

Can you check this actually points at an instance of C at the line that's clobbering your memory? Can you print typeid(*this).name() at that point (assuming the class has some virtual functions)?
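A rough sketch of that check, assuming the classes are polymorphic. Here A gets a virtual destructor, which the example above does not have, and is_exactly is a hypothetical helper. The text returned by typeid(*this).name() is implementation-defined, so the sketch compares type_info objects instead of strings.

```cpp
#include <cassert>
#include <typeinfo>

struct A {
    int i;
    virtual ~A() {}  // makes A polymorphic so typeid sees the dynamic type
    // True only when the dynamic type of *this is exactly T.
    template <typename T>
    bool is_exactly() const { return typeid(*this) == typeid(T); }
};

struct B : A {
    int j;
};

void check_dynamic_type() {
    A plain;
    B derived;
    A* p = &derived;
    assert(p->is_exactly<B>());      // really a B behind the A*
    assert(!plain.is_exactly<B>());  // a cast to B* here would be wrong
}
```

Putting an assertion like this at the top of the method that clobbers memory would show whether this really refers to a C at that point.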

jpalecek
+1  A: 

The problem you are encountering is probably due to a dependency error rather than anything wrong with the compiler (compilers are used by hundreds of thousands of developers, and a problem like this would have been found by now).

Consider a simple project with the following two files. File a.cpp:

#include <iostream>
using namespace std;

// This translation unit sees C as a single int.
class C
{
public:
  C () : m_value (42) { }
  void Print () { cout << "C::m_value = " << m_value << endl; }
private:
  int m_value;
};

void DoSomethingWithC (C &c);

int main ()
{
  C array_of_c [2];
  DoSomethingWithC (array_of_c [0]);
  array_of_c [0].Print ();
  array_of_c [1].Print ();
  return 0;
}

and file b.cpp:

class C
{
public:
  int a,b;
};

void DoSomethingWithC (C &c)
{
  c.b = 666;
}

If you compile the above two files and link them you won't get any errors or warnings. However, when you run the application, you'll find that DoSomethingWithC clobbers array_of_c [1] even though its argument is array_of_c [0].

So your problem could be that one source file sees the class one way and another file sees it a different way. This can happen if the dependency checking fails.

Try forcing a rebuild all. If that works, then you'll need to see why the dependencies have failed (DevStudio, for example, can get it wrong sometimes).

Skizz
+5  A: 

As pointed out by Pontus, you have probably broken the One Definition Rule somehow. You've probably included the header file with the definition of class C in two translation units where other code, often in a preceding header file, changes the way the definition of class C is interpreted, so that it has a different size in the two translation units.

One possibility is that you've accidentally changed the default member alignment (either through a command-line argument to the compiler or via a #pragma in the source), so that two translation units think the structure has a different size because of differing amounts of padding (most x86 compilers default to alignment on 4-byte boundaries, but allow you to request alignment on 1-byte boundaries as needed). Look in the other header files for a #pragma changing the default alignment that is missing a following #pragma to restore the previous value (you don't specify which compiler, so I can't give specifics).
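To illustrate the effect, here is a hedged sketch that puts the two views side by side in one file. The struct names and members are invented for the example, and #pragma pack(push/pop) is a compiler extension (accepted by MSVC, GCC and Clang) rather than standard C++. In the real bug both views would carry the same class name and live in different translation units; the sketch uses two names to stay well-defined.

```cpp
#include <cassert>
#include <cstddef>

// Layout as seen by a translation unit with default member alignment.
struct DefaultLayout {
    char tag;
    int  value;
    bool markerStart;
};

// Layout as seen when an earlier header left 1-byte packing switched on.
#pragma pack(push, 1)
struct PackedLayout {
    char tag;
    int  value;
    bool markerStart;
};
#pragma pack(pop)  // the restore that the faulty header would be missing

std::size_t marker_offset_default() { return offsetof(DefaultLayout, markerStart); }
std::size_t marker_offset_packed()  { return offsetof(PackedLayout, markerStart); }

void check_layouts_differ() {
    // Same members, different padding: both the size and the offset of
    // markerStart change, so code built against one view writes to the
    // wrong place in an object laid out by the other.
    assert(sizeof(PackedLayout) < sizeof(DefaultLayout));
    assert(marker_offset_packed() < marker_offset_default());
}
```

Printing sizeof and the member offset from a method in each suspect translation unit is one way to spot which files disagree.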

Stephen C. Steel
Nice one Stephen. I had spotted the problem before reading your answer, but you hit the nail on the head. Cheers!
Rob
+1  A: 

Have your class definitions or project settings changed at all recently? I've seen problems like this, with absolutely absurd behavior, where the cause was stale object files being linked: the object files no longer matched the source files. Try a clean and complete rebuild.

Rob K
A: 

Check that *this really points to the start of the class rather than a call via a bad pointer.

Joshua