Hi all,

Come across what looks at first sight like an MT-issue, but I'm trying to understand in detail the STA model used by COM+.

Effectively, I have a legacy COM+ component, written in VB6, that calls into a native (i.e., not-COM) Win32 DLL written in C++.

Having had some intermittent (and impossible to reproduce in testing) problems with it, I added some debugging code to find out what was going on and found that, when the problems occur, the log messages are interleaved in the file, which implied that the DLL was being called by two threads at once.

Now the logging goes to a per-thread file named from _getpid() and GetCurrentThreadId(), so it seems that when the code in the C++ DLL is called, it's getting called twice on the same thread at the same time. My understanding of STA suggests this could be the case, as COM marshals the individual instances of objects onto a single thread and suspends and resumes execution at will.

Unfortunately I'm not sure where to go from here. I'm reading that I should be calling CoInitializeEx() in DllMain() to tell COM that this is an STA DLL, but other places say this is only valid for COM DLLs and won't have any effect in a native DLL. The only other option is to wrap parts of the DLL up as critical sections to serialize access (taking whatever performance hit that has on the chin).

I could try and rework the DLL, but there is no shared state or global vars - everything's in local variables, so in theory each call should get its own stack. I'm wondering whether the STA model is having some odd effect on this and just re-entering the already loaded DLL at the same entry point as another call. Unfortunately, I don't know how to prove or test this theory.

The questions basically are:

  1. When an STA COM+ component calls a native DLL, there's nothing in the STA model to prevent the active "thread" being suspended and control being passed over to another "thread" in the middle of a DLL call?
  2. Is CoInitializeEx() the right way to resolve this, or not?
  3. If neither (1) nor (2) is a "good" assumption, what's going on?
+1  A: 

In an apartment-threaded COM server, each instance of the COM class is guaranteed to be accessed by a single thread. This means the instance is thread safe. However, many instances can be created simultaneously, using different threads. Now, as far as the COM server is concerned, your native DLL does not have to do anything special. Just think about kernel32.dll, which is used by every executable - does it initialize COM when used by a COM server?

From the DLL's point of view, you have to make sure you're thread safe, as different instances can call you at the same time. STA will not protect you in this case. Since you say you are not using any global variables, I can only assume the problem is elsewhere, and just happens to show up in circumstances that seem to point to the COM stuff. Are you sure you don't have some plain old C++ memory issues?

eran
The issue appears to occur on the production system only, at times of load. The code for the DLL has no global variables, using only locals. The output log file, which is uniquely named on a thread and process id basis, contains interleaved output, suggesting that it's being called twice by the same thread in mid-state. In all other environments, the code executes completely correctly.
THEMike
1. What kind of problems are you actually experiencing? Crashes, hangs, misbehavior? 2. Are you creating threads or sending messages, directly or indirectly? 3. Try creating a minidump (or full dump) as soon as you identify a problem; the stack trace might help identify the problem in the comfort of your dev env.
eran
Re - "plain old C++ memory issues": Yes, it was this in the end, though not in the malloc() type arena; it was a char array defined as static. Raymond Chen blogged about it here http://blogs.msdn.com/oldnewthing/archive/2004/03/08/85901.aspx and I've found confirmation in a couple of other places as well. Refactoring the code to remove the static seems to have resolved it.
Chris J
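The static-array problem described in the comment above can be shown in isolation. The following is only a minimal sketch, not the original DLL's code: `format_id_static` and `format_id_local` are hypothetical names invented for illustration. It demonstrates that a function-local `static` array is one buffer shared by every caller, even though it looks like a local variable. The demonstration is deliberately single-threaded so that it is deterministic; if two threads entered `format_id_static` at once, the same sharing would become a data race.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: the buffer is static, so there is exactly ONE buffer
// for the whole process, shared by every call on every thread.
const char* format_id_static(int id) {
    static char buf[32];
    std::snprintf(buf, sizeof buf, "id-%d", id);
    return buf;  // every caller receives a pointer to the same storage
}

// Hypothetical fix: the buffer lives on the caller's stack, so each call
// (and therefore each thread) gets its own copy.
std::string format_id_local(int id) {
    char buf[32];
    int written = std::snprintf(buf, sizeof buf, "id-%d", id);
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    return std::string(buf);  // copied out before buf goes out of scope
}
```

With the static version, a second call silently overwrites the result of the first; with the local version, each result is independent.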
A: 

I suspect your problem was that somewhere deep within the called DLL, it made an outbound COM call to another apartment (another thread in the same process, or an object in the MTA, or another process entirely). COM permits an STA thread waiting for the result of an outbound call to receive another inbound call, processing it recursively. It's intended only for ongoing conversations between the same objects - i.e. A calls B, B calls A back, A calls B back again - but the thread can receive calls from other objects if you've handed out an interface pointer to multiple clients, or the client has shared the interface pointer with another client. Generally it's a bad idea to hand out interface pointers to a single-threaded object to multiple client threads, as they'll only have to wait for each other. Create one worker object per thread.

COM cannot suspend and resume execution at will on any thread - a new incoming call on an STA thread can only arrive through the message pump. When 'blocked' waiting for a response, the STA thread is actually pumping messages, checking with the Message Filter (see IMessageFilter) whether the message should be handled. However, message handlers must not make a new outgoing call - if they do, COM will return an RPC_E_CANTCALLOUT_INEXTERNALCALL error ("It is illegal to call out while inside message filter.").

Similar problems could occur if you have a message pump (GetMessage/DispatchMessage) anywhere within the native DLL. I've had problems with VB's DoEvents in interface procedures.
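The re-entrancy described above can be modelled without COM at all. The sketch below is a toy simulation only: `ToyApartment`, `pump_while_waiting`, `dll_function` and `run_demo` are hypothetical names, and nothing here is a COM or Win32 API. It assumes that a "wait" inside the DLL dispatches whatever inbound calls are queued, which is the behaviour attributed above to an STA thread blocked on an outbound call, to DoEvents, or to a GetMessage/DispatchMessage loop.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an STA thread's message queue (not a COM API).
struct ToyApartment {
    std::deque<std::function<void()>> pending;  // queued inbound calls
    std::vector<std::string> log;               // stands in for the per-thread log file

    // Stand-in for "blocking" on an outbound call: while waiting, queued
    // inbound calls are dispatched on this same thread.
    void pump_while_waiting() {
        while (!pending.empty()) {
            std::function<void()> call = std::move(pending.front());
            pending.pop_front();
            call();  // runs nested inside the call that is waiting
        }
    }
};

// Hypothetical DLL entry point that waits (and therefore pumps) part-way through.
void dll_function(ToyApartment& apt, const std::string& who) {
    apt.log.push_back(who + ": begin");
    apt.pump_while_waiting();
    apt.log.push_back(who + ": end");
}

// One thread, two calls: the second is already queued when the first starts.
std::vector<std::string> run_demo() {
    ToyApartment apt;
    apt.pending.push_back([&apt] { dll_function(apt, "call B"); });
    dll_function(apt, "call A");
    return apt.log;
}
```

In this model the log shows call B running entirely inside call A, on the same thread: the kind of interleaving in a per-thread log file that the question describes. Locals stay safe because each nested call has its own stack frame, but any static or otherwise shared buffer is overwritten by the inner call before the outer one resumes.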

CoInitializeEx should only be called by the creator of a thread, because only they know what their message pumping behaviour will be. It's likely that if you try to call it in DllMain it will simply fail, as your native DLL is being called in response to a COM call, so the caller must have ultimately already called CoInitializeEx on the thread in order to make the call. Doing it in the DLL_THREAD_ATTACH notification, for newly-created threads, might work superficially but cause the program to malfunction if COM blocks when it should pump and vice-versa.

Mike Dimmick