This is driving me nuts. I am using some 3rd-party code in a Windows .lib that, in debug mode, is causing an error similar to the following:

Run-Time Check Failure #2 - Stack around the variable 'foo' was corrupted.

The error is thrown when the object either goes out of scope or is deleted. Simply allocating one of these objects and then deleting it will throw the error. I therefore think the problem is in one of the many constructors or destructors, but despite stepping through every line of code I cannot find it.

However, this only happens when creating one of these objects in a static library. If I create one in my EXE application, the error does not appear. The 3rd-party code itself lives in a static lib. For example, this fails:

**3RDPARTY.LIB**

class Foo : public Base
{
    ...
};

**MY.LIB**

void Test()
{
    Foo* foo = new Foo;
    delete foo; // CRASH!
}

**MY.EXE**

void Func()
{
    Test();
}

But this will work:

**3RDPARTY.LIB**

class Foo : public Base
{
    ...
};

**MY.EXE**

void Func()
{
    Foo* foo = new Foo;
    delete foo; // NO ERROR
}

So, cutting out the 'middle' .lib file makes the problem go away, and it is this weirdness that is driving me mad. The EXE and both libs all use the same CRT library. There are no errors linking. The 3rd-party code uses inheritance and there are 5 base classes. I've commented out as much code as I can whilst still getting it to build and I just can't see what's up.

So if anyone knows why code in a .lib would act differently to the same code in a .exe, I would love to hear it. Ditto any tips for tracking down memory overwrites! I am using Visual Studio 2008.

A: 

Is your .lib file linked against the library's .lib? I assume from your example that you are including the header that declares the destructor; without it, deleting a pointer to such an incomplete type still compiles but can result in UB (a bizarre exception to the general rule that something must be defined before it is used). If the .lib files aren't linked together, it's possible that a custom operator delete or destructor is hitting some weird linking issue. That shouldn't happen, but you can never be quite sure it won't.
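One way to rule out the incomplete-type case is the well-known "checked delete" idiom, sketched below. `CheckedDelete`, `Widget` and `CountDestructorRuns` are hypothetical names used only for illustration, not part of the code in the question. The sketch sticks to constructs that a pre-C++11 compiler should accept.

```cpp
#include <cassert>

// Hypothetical helper modelled on the "checked delete" idiom.
template <typename T>
void CheckedDelete(T* p)
{
    // sizeof(T) is ill-formed for an incomplete type, so this typedef fails
    // to compile unless the full class definition is visible at this point.
    typedef char TypeMustBeComplete[sizeof(T) ? 1 : -1];
    (void)sizeof(TypeMustBeComplete);
    delete p;
}

// Hypothetical stand-in for the third-party class: its destructor bumps a
// counter so that we can observe whether it actually ran.
struct Widget
{
    explicit Widget(int* counter) : destroyed(counter) {}
    ~Widget() { ++*destroyed; }
    int* destroyed;
};

// Returns how many times ~Widget ran for one allocate/delete round trip.
int CountDestructorRuns()
{
    int runs = 0;
    Widget* w = new Widget(&runs);
    assert(runs == 0); // nothing has been destroyed yet
    CheckedDelete(w);
    return runs;
}
```

If `Test()` in MY.LIB only saw a forward declaration of the class, replacing `delete foo;` with a helper like this would turn the silent UB into a compile error.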

coppro
+2  A: 

One possibility is that it's a calling convention mismatch - make sure that your libraries and executables are all set to use the same default calling convention (usually __cdecl). To set that, open up your project properties and go to Configuration Properties > C/C++ > Advanced and look at the Calling Convention option. If you call a function with the wrong calling convention, you'll completely mess up the stack.
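As a rough illustration of keeping conventions in agreement, a header can spell the convention out on each exported declaration so that it no longer depends on each project's default setting. `MYLIB_CALL`, `AddValues` and `CallThroughPointer` are made-up names for this sketch, and the macro is assumed to expand to nothing on non-Microsoft compilers.

```cpp
#include <cassert>

// __cdecl is a Microsoft keyword; on other compilers the macro is left empty.
#if defined(_MSC_VER)
#  define MYLIB_CALL __cdecl
#else
#  define MYLIB_CALL
#endif

// Hypothetical library function with its convention stated explicitly, so
// the caller and the callee cannot disagree about who cleans up the stack.
int MYLIB_CALL AddValues(int a, int b)
{
    return a + b;
}

// The convention is part of the function pointer type as well.
typedef int (MYLIB_CALL *AddValuesFn)(int, int);

// Calls through a pointer whose type carries the same convention.
int CallThroughPointer(AddValuesFn fn, int a, int b)
{
    assert(fn != 0);
    return fn(a, b);
}
```

With the convention written into the declaration, changing the Calling Convention option in one project cannot silently change how the others call into it.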

Adam Rosenfield
All the projects are using __cdecl. The crash only occurs when creating a particular type of object - other objects work OK.
Rob
Well, that was just a guess - it's hard to psychically debug someone else's problem without more information.
Adam Rosenfield
Thanks for the pointer Adam. This problem is driving me nuts. I am close to abandoning the 3rd party code and trying something else.
Rob
A: 

Without seeing more code, it's hard to give you a firm answer. However, for tracking down memory overwrites, I recommend using WinDbg (free from Microsoft, search for "Debugging Tools for Windows").

When you have it attached to your process, you can have it set breakpoints for memory access (read, write, or execute). It's really powerful overall, but it should especially help you with this.

Brian
A: 

The error is thrown when either the object goes out of scope or is deleted.

Whenever I've run into this it had to do with the compiled library using a different version of the C++ runtime than the rest of the application.

Max Lybbert
+2  A: 

OK, I tracked the problem down and it's a cracker, if anyone's interested. Basically, my .LIB, which exhibited the problem, had defined _WIN32_WINNT as 0x0501 (Windows 2000 and greater), but my EXE and the 3rd-party LIB had it defined as 0x0600 (Vista). Now, one of the headers included by the 3rd-party lib is sspi.h, which defines a structure called SecurityFunctionTable that includes the following snippet:

#if OSVER(NTDDI_VERSION) > NTDDI_WIN2K
    // Fields below this are available in OSes after w2k
    SET_CONTEXT_ATTRIBUTES_FN_W         SetContextAttributesW;
#endif // greater thean 2K

To cut a long story short, this meant the LIBs disagreed on the size of the objects, and that mismatch was causing the Run-Time Check Failure.
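To make the size mismatch concrete, here is a minimal sketch. All struct and field names are hypothetical stand-ins rather than the real sspi.h declarations, and it assumes the affected class embeds the table by value. The two "views" are written out by hand because a single translation unit can only ever see one setting of the version macro.

```cpp
#include <cstddef>

// Hypothetical stand-in for the function-pointer fields in the table.
typedef void (*FnPtr)();

// Layout seen by code built with the older target version: the
// conditionally compiled field is preprocessed away.
struct TableOldView
{
    unsigned long version;
    FnPtr queryContextAttributes;
};

// Layout seen by code built with the newer target version: one extra field.
struct TableNewView
{
    unsigned long version;
    FnPtr queryContextAttributes;
    FnPtr setContextAttributes; // only present in the newer view
};

// Hypothetical class that holds the table by value, as each side sees it.
struct FooOldView { TableOldView table; int state; };
struct FooNewView { TableNewView table; int state; };

// Bytes by which the two libraries disagree about the size of one object.
enum { kOverrunBytes = sizeof(FooNewView) - sizeof(FooOldView) };
```

If code compiled against the smaller view reserves the storage and a constructor or destructor compiled against the larger view then runs on it, the extra `kOverrunBytes` land outside the object, which is the kind of overwrite the run-time check reports. Defining _WIN32_WINNT identically in every project, for example in one shared header, should keep the layouts in step.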

Class!

Rob