I have a small VC++ application in two pieces. The first piece contains the main functionality and is compiled as a static library. The second piece is a Windows service that links to the library from the first piece.

I'm seeing some odd behavior caused by memory corruption. By setting data breakpoints and the like, I was able to determine that a static variable in the service piece is corrupted every time certain members of one of the library objects are written to. Conversely, members of the library object are corrupted when locations pointed to by the static variable are written. Could there be object overlap?

EDIT: I forgot to mention that the instance of BlahHelper is at global scope in my service code. This supports the overlap theory, since BlahHelper and ServiceBase::m_service should both be in the global data area of the exe.

EDIT2: By looking at the raw memory and checking the addresses of all the relevant objects, I've confirmed that the BlahHelper object overlaps the ServiceBase::m_service pointer. Why might this be the case?

Here are the class definitions of interest:

// This is the basis of my service.  I derive from this and override 
// the start() and  stop() methods to implement the service.
class ServiceBase
{
public:

    virtual ~ServiceBase();

    static void Run(ServiceBase& service);

protected:

    ServiceBase(DWORD controlsAccepted = SERVICE_ACCEPT_PAUSE_CONTINUE | 
                                         SERVICE_ACCEPT_STOP | 
                                         SERVICE_ACCEPT_SHUTDOWN);

    virtual void Start(DWORD control) = 0;
    virtual void Stop(DWORD control) = 0;

    void UpdateState(DWORD state,
                     HRESULT errorCode = S_OK);

    const std::wstring& ServiceName() const;

private:

    void SetServiceStatus();

    static void WINAPI ServiceMain(DWORD argumentCount,
                                   PWSTR* arguments);

    static void WINAPI Handler(DWORD control);

    static ServiceBase* m_service;  // This is being corrupted
    SERVICE_STATUS_HANDLE m_handle;
    ServiceStatus m_status;
    std::wstring m_serviceName;

};

This is one of the classes in the library. When I link the library into my service exe and instantiate a BlahHelper object, I see some weird issues with memory corruption.

// Writing to _blah2Open or _blah1Open causes corruption of ServiceBase::m_service
class BlahHelper
{
    // Names changed to protect the innocent
public:
    BlahHelper();
    ~BlahHelper();

    HRESULT GetSomeInfo();
    HRESULT GetSomeStatus(LPWORD statPosition);

    void Init(char blah1Sp[], char blah2Sp[], HWND messageWindow);
    bool Blah1ConnectionOpen(){return _blah2Open;};
    bool Blah2ConnectionOpen(){return _blah1Open;};
    hash_map<string,short> GetSomeJunk(){return _someJunk;};
    void Refreshblah1Config();
    bool HasItemsTakenSensor(){return _blah1HasItemsTakenSensor;};
    void Enterblah2();
    void blah2Exited();
    void Ackblah2ExitReq();
    void Cleanup();
    void Initblah1();
    void Initblah2();

private:

    LPWFSRESULT OpenSession(char* spName, HSERVICE* handle);
    LPWFSRESULT Getblah1Caps();
    void Cleanupblah1();
    void Cleanupblah2();
    void Closeblah1();
    void Closeblah2();
    void Openblah1();
    void Openblah2();
    void Registerblah1();
    void Registerblah2();
    void Checkblah1Caps();
    void CheckSomeJunk();
    void Getblah1Config();
    void LogMessage(string message, int logLevel);

    char* _SpName1;
    char* _SpName2;           
    HWND _messageWindow;      
    HSERVICE _Handle1;      
    HSERVICE _Handle2;      
    bool _blah2Open;              // writing to this causes corruption of ServiceBase::m_service
    bool _blah1Open;              // writing to this causes corruption of ServiceBase::m_service
    const string _logSource;
    const int _logMsgId;
    bool _blah1HasItemsTakenSensor;

    hash_map<string, short> _someJunk;
};

As I said, a data breakpoint revealed that writing _blah1Open or _blah2Open corrupts ServiceBase::m_service. As further confirmation, I commented out every line of BlahHelper's implementation that wrote to these values, and the corruption disappeared.

If I change the order of declaration of BlahHelper's members, I still see memory corruption issues, but the symptoms change.

If I include the library code directly in the service, I no longer see the issue. I can't do this for anything other than diagnostic purposes, but it does indicate that something weird is happening in the linking process.

One other thing to note is that the library is compiled with the Multi-Byte character set, while the service application that links the library is compiled with Unicode. This will be difficult to change.

Can anyone suggest possible reasons why this might be happening, or approaches to diagnose the issue? When I realized I had memory corruption, I was hoping for a simple cause (like buffer overflow). But, I have no idea why one object might step on another like this.

A: 

Do you mean that m_service itself changes or *m_service changes?

If you examine &_blah2Open and m_service do the addresses match?

(Remember that _blah2Open is a variable and m_service is a pointer, so you want the address of _blah2Open and the value of m_service)

The address of _blah2Open is bound to be on the heap or stack, allocated as part of BlahHelper, so it's quite unlikely that this address points to the contents of m_service.

If it's the contents pointed to by m_service that change, one possible scenario is this: m_service is initialized somewhere in the code, but for some reason static initializers are not being called in your case, so m_service points somewhere random and the contents get overwritten when some object is allocated there. You need to put a data breakpoint on it at the very start of the program run and track each time m_service changes.

On the other hand, if you are saying that m_service and _blah1Open have the same address at the point of failure: two distinct variables (not within a union) always have distinct addresses, so there is no logical scenario in which m_service and _blah2Open share the same address in memory. That would mean a fundamental failure in the compiler's generated code, which is highly unlikely.
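For the case where m_service itself changes, the comparison that matters is between the storage of the pointer and the byte range occupied by the helper object. Here is a minimal sketch of that check. Helper, Service, g_helper and both functions are hypothetical stand-ins, not the real classes from the question:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for BlahHelper and ServiceBase from the question.
struct Helper {
    char* spName1;
    char* spName2;
    bool blah2Open;
    bool blah1Open;
};

struct Service {
    static Service* instance;  // plays the role of ServiceBase::m_service
};
Service* Service::instance = nullptr;

Helper g_helper{};  // global instance, like the BlahHelper in the service

// True when the byte ranges [a, a + aSize) and [b, b + bSize) share a byte.
bool RangesOverlap(const void* a, std::size_t aSize,
                   const void* b, std::size_t bSize) {
    const auto a0 = reinterpret_cast<std::uintptr_t>(a);
    const auto b0 = reinterpret_cast<std::uintptr_t>(b);
    return a0 < b0 + bSize && b0 < a0 + aSize;
}

// Prints both address ranges and asserts (in debug builds) that they are disjoint.
void CheckHelperDoesNotCoverServicePointer() {
    std::printf("helper:  %p, %zu bytes\n",
                static_cast<const void*>(&g_helper), sizeof(g_helper));
    std::printf("pointer: %p, %zu bytes\n",
                static_cast<const void*>(&Service::instance),
                sizeof(Service::instance));
    assert(!RangesOverlap(&g_helper, sizeof(g_helper),
                          &Service::instance, sizeof(Service::instance)));
}
```

In a correctly linked program the assertion should never fire. If it does, the module that placed the object and the module that uses it probably disagree about how large the object is.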

rep_movsd
Odrade
A: 

Since the two pieces are compiled with different options, did you check that sizeof(BlahHelper) is the same from the perspective of each module? It's possible that the compilation options are causing the structure layout to differ, which would allow the memory overlap. Otherwise I can't see any way for a member of BlahHelper to overlay your static pointer.
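A rough sketch of how that comparison could be done from inside the program rather than from the debugger. It goes one step further than sizeof and also records member offsets, since two layouts can have the same total size and still place members differently. HelperLayout, LayoutInfo and the three functions are hypothetical names for illustration. The idea assumes you compile one copy of DescribeLayout into the library and one into the service (under different names) and compare the results at startup:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for BlahHelper; the real class has more members.
struct HelperLayout {
    char* spName1;
    char* spName2;
    bool blah2Open;
    bool blah1Open;
    int logMsgId;
};

// Snapshot of the layout as seen by the translation unit that fills it in.
struct LayoutInfo {
    std::size_t size;
    std::size_t offsetBlah2Open;
    std::size_t offsetBlah1Open;
    std::size_t offsetLogMsgId;
};

// Records sizeof and member offsets under the current compile options.
LayoutInfo DescribeLayout() {
    return { sizeof(HelperLayout),
             offsetof(HelperLayout, blah2Open),
             offsetof(HelperLayout, blah1Open),
             offsetof(HelperLayout, logMsgId) };
}

// True when both modules agree on the size and on every recorded offset.
bool SameLayout(const LayoutInfo& a, const LayoutInfo& b) {
    return a.size == b.size &&
           a.offsetBlah2Open == b.offsetBlah2Open &&
           a.offsetBlah1Open == b.offsetBlah1Open &&
           a.offsetLogMsgId == b.offsetLogMsgId;
}

// Prints one module's view so the two views can be compared by eye.
void PrintLayout(const char* module, const LayoutInfo& info) {
    std::printf("%s: size=%zu blah2Open@%zu blah1Open@%zu logMsgId@%zu\n",
                module, info.size, info.offsetBlah2Open,
                info.offsetBlah1Open, info.offsetLogMsgId);
}
```

Note that offsetof is only guaranteed to work for standard-layout types, so for a class with std::string or hash_map members this may produce warnings or need a different approach, such as comparing member addresses against the object's base address at run time.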

Mark B
I was starting to suspect that this might be the case. I'll check into it.
Odrade
The sizes are the same from both modules, as far as I can tell. I may have done the determination incorrectly.
Odrade
I debugged into the service, then executed sizeof(XfsHelper) from the Immediate window. I changed the .lib to compile as an .exe, then did the same from a main function there. Then, as an extra check, I added a GetBlahHelperSize method to the .lib and called it from the service. All gave the same result.
Odrade