views:

213

answers:

4

I apparently have an dead lock problem in one of my applications and started investigating EurekaLog stack traces. Here's one recent:

Call Stack Information:
--------------------------------------------------------------------------------------------------------------------------------------
|Address |Module        |Unit                        |Class                               |Procedure/Method               |Line      |
--------------------------------------------------------------------------------------------------------------------------------------
|*Exception Thread: ID=14208; Priority=0; Class=; [Main]                                                                             |
|------------------------------------------------------------------------------------------------------------------------------------|
|7C82860C|ntdll.dll     |                            |                                    |KiFastSystemCall               |          |
|7C827D27|ntdll.dll     |                            |                                    |ZwWaitForSingleObject          |          |
|77E61C96|kernel32.dll  |                            |                                    |WaitForSingleObjectEx          |          |
|77E61C88|kernel32.dll  |                            |                                    |WaitForSingleObject            |          |
|77E61C7B|kernel32.dll  |                            |                                    |WaitForSingleObject            |          |
|004151C4|MyApp.exe     |sysutils.pas                |TMultiReadExclusiveWriteSynchronizer|WaitForWriteSignal             |16740[1]  |
|004151BC|MyApp.exe     |sysutils.pas                |TMultiReadExclusiveWriteSynchronizer|WaitForWriteSignal             |16740[1]  |
|0041522C|MyApp.exe     |sysutils.pas                |TMultiReadExclusiveWriteSynchronizer|BeginWrite                     |16818[57] |
|004323FB|MyApp.exe     |Classes.pas                 |TDataModule                         |Create                         |11357[1]  |
|004323C0|MyApp.exe     |Classes.pas                 |TDataModule                         |Create                         |11356[0]  |
|007D744D|MyApp.exe     |uRORemoteDataModule.pas     |TRORemoteDataModule                 |Create                         |163[1]    |
|007D7434|MyApp.exe     |uRORemoteDataModule.pas     |TRORemoteDataModule                 |Create                         |162[0]    |
|007DBFAB|MyApp.exe     |Sentrol_Impl.pas            |                                    |Create_Sentrol                 |85[1]     |
|00646952|MyApp.exe     |uROServer.pas               |TROInvoker                          |CustomHandleMessage            |726[11]   |
|00407BFA|MyApp.exe     |system.pas                  |TInterfacedObject                   |_AddRef                        |17972[1]  |
|00404934|MyApp.exe     |system.pas                  |TObject                             |GetInterface                   |9003[8]   |
|00407B1C|MyApp.exe     |system.pas                  |                                    |_IntfClear                     |17817[1]  |
|00404966|MyApp.exe     |system.pas                  |TObject                             |GetInterface                   |9009[14]  |
|004048E8|MyApp.exe     |system.pas                  |TObject                             |GetInterface                   |8995[0]   |
|00407BD7|MyApp.exe     |system.pas                  |TInterfacedObject                   |QueryInterface                 |17964[1]  |
|77E61680|kernel32.dll  |                            |                                    |InterlockedDecrement           |          |
|00407C10|MyApp.exe     |system.pas                  |TInterfacedObject                   |_Release                       |17977[1]  |
|00407B2C|MyApp.exe     |system.pas                  |                                    |_IntfClear                     |17824[8]  |
|004067DF|MyApp.exe     |system.pas                  |                                    |_FinalizeArray                 |15233[100]|
|00407B1C|MyApp.exe     |system.pas                  |                                    |_IntfClear                     |17817[1]  |
|00646577|MyApp.exe     |uROServer.pas               |TROClassFactoryList                 |FindClassFactoryByInterfaceName|619[17]   |
|77E6166C|kernel32.dll  |                            |                                    |InterlockedIncrement           |          |
|00407BFA|MyApp.exe     |system.pas                  |TInterfacedObject                   |_AddRef                        |17972[1]  |
|00646B72|MyApp.exe     |uROServer.pas               |TROInvoker                          |HandleMessage                  |758[1]    |
|006460C5|MyApp.exe     |uROServer.pas               |                                    |MainProcessMessage             |512[98]   |
|00645BAC|MyApp.exe     |uROServer.pas               |                                    |MainProcessMessage             |414[0]    |
|00647184|MyApp.exe     |uROServer.pas               |TROMessageDispatcher                |ProcessMessage                 |929[2]    |
|00647130|MyApp.exe     |uROServer.pas               |TROMessageDispatcher                |ProcessMessage                 |927[0]    |
|00647BCF|MyApp.exe     |uROServer.pas               |TROServer                           |IntDispatchMessage             |1328[27]  |
|00647ABC|MyApp.exe     |uROServer.pas               |TROServer                           |IntDispatchMessage             |1301[0]   |
|0064782F|MyApp.exe     |uROServer.pas               |TROServer                           |DispatchMessage                |1170[11]  |
|006477B4|MyApp.exe     |uROServer.pas               |TROServer                           |DispatchMessage                |1159[0]   |
|006477A9|MyApp.exe     |uROServer.pas               |TROServer                           |DispatchMessage                |1152[1]   |
|0064779C|MyApp.exe     |uROServer.pas               |TROServer                           |DispatchMessage                |1151[0]   |
|00659CB6|MyApp.exe     |uROLocalServer.pas          |TROLocalServer                      |SendRequest                    |57[1]     |
|00659CA4|MyApp.exe     |uROLocalServer.pas          |TROLocalServer                      |SendRequest                    |56[0]     |
|0065A009|MyApp.exe     |uROLocalChannel.pas         |TROLocalChannel                     |IntDispatch                    |99[10]    |
|005EE540|MyApp.exe     |uROClient.pas               |TROTransportChannel                 |Dispatch                       |1884[36]  |
|005EE3FC|MyApp.exe     |uROClient.pas               |TROTransportChannel                 |Dispatch                       |1848[0]   |
|005EEC8F|MyApp.exe     |uROClient.pas               |TROTransportChannel                 |Dispatch                       |2134[27]  |
|00616EC8|MyApp.exe     |PCCS_Intf.pas               |TSentrol_Proxy                      |GetNewValues                   |6585[7]   |
|007CBDB9|MyApp.exe     |ETAROConnectionForm.pas     |TROConnectionForm                   |SyncSentrolUpdateTimerTimer    |855[16]   |
|7C82ABE5|ntdll.dll     |                            |                                    |RtlTimeToTimeFields            |          |
|004D5D9C|MyApp.exe     |Controls.pas                |TControl                            |WndProc                        |5063[0]   |
|004DA05B|MyApp.exe     |Controls.pas                |TWinControl                         |WndProc                        |7304[111] |
|7C81A3AB|ntdll.dll     |                            |                                    |RtlLeaveCriticalSection        |          |
|0042659C|MyApp.exe     |Classes.pas                 |TThreadList                         |UnlockList                     |3359[1]   |
|00426598|MyApp.exe     |Classes.pas                 |TThreadList                         |UnlockList                     |3359[1]   |
|004935BC|MyApp.exe     |Graphics.pas                |                                    |FreeMemoryContexts             |5060[12]  |
|00493524|MyApp.exe     |Graphics.pas                |                                    |FreeMemoryContexts             |5048[0]   |
|004D9799|MyApp.exe     |Controls.pas                |TWinControl                         |MainWndProc                    |7076[6]   |
|004329F4|MyApp.exe     |Classes.pas                 |                                    |StdWndProc                     |11583[8]  |
|7739C09A|USER32.dll    |                            |                                    |CallNextHookEx                 |          |
|004B1343|MyApp.exe     |ExtCtrls.pas                |TTimer                              |Timer                          |2281[1]   |
|00404A30|MyApp.exe     |system.pas                  |                                    |_CallDynaInst                  |9159[1]   |
|004B1227|MyApp.exe     |ExtCtrls.pas                |TTimer                              |WndProc                        |2239[4]   |
|004329F4|MyApp.exe     |Classes.pas                 |                                    |StdWndProc                     |11583[8]  |
|7739C42C|USER32.dll    |                            |                                    |GetParent                      |          |
|7739C45C|USER32.dll    |                            |                                    |GetParent                      |          |
|773A16E0|USER32.dll    |                            |                                    |DispatchMessageA               |          |
|773A16D6|USER32.dll    |                            |                                    |DispatchMessageA               |          |
|004CC234|MyApp.exe     |Forms.pas                   |TApplication                        |ProcessMessage                 |8105[23]  |
|004CC138|MyApp.exe     |Forms.pas                   |TApplication                        |ProcessMessage                 |8082[0]   |
|004CC26E|MyApp.exe     |Forms.pas                   |TApplication                        |HandleMessage                  |8124[1]   |
|004CC264|MyApp.exe     |Forms.pas                   |TApplication                        |HandleMessage                  |8123[0]   |
|004CC563|MyApp.exe     |Forms.pas                   |TApplication                        |Run                            |8223[20]  |
|004CC4B0|MyApp.exe     |Forms.pas                   |TApplication                        |Run                            |8203[0]   |
|007F18B3|MyApp.exe     |MyApp.dpr                   |                                    |                               |215[65]   |

The stack trace seems to be ok until first TTimer call, after that it contains some garbage(?), however the end contains the the lock which seems be holding the main thread.

Can I trust this stack trace? If not, what can cause this and how I can avoid it?

Any ideas about the dead lock based on this stack trace? I don't quite understand how creating a datamodule can dead lock..

I'm using Delphi 2007.

Edit: I found one case where DbExpress connection was shared between main thread and thread created by RemObjects. After fixing that and some minor things the application has been running now over 12 hours without problems. I'll wait a day or two to see if the problem has really disappeared.

+1  A: 

EurekaLog as far as I know should be relaible. Stack trace could be inccorect if the information containing the function<->address mapping is not of the same version as executable.

I use JCL for stack trace logging and with it the map/jdbg files. If you change the executable and leave the same map file your stack log will be incorrect as the function addresses will be different. So maybe the only reason could be this. You also must distinguish between a true stack trace of the last function call and a "call history". I use a call history which also list all the functions that were called in the time it was recorded. Like the TTimer in you case.

A typical stack trace only contains the function calls that lead to the last function. The "call history" on the other hand contains all the function being called in the certain time frame. And if you have secondary threads besided main thread (like TTimer) it will record their actions also. I forgot how this is called in the JCL, but it is the default behaviour if you trace in stRawMode.

Lastly I can see from stack trace that WaitForSingleObject is called, preferably with INFINITE parameter so it never times out. And then the condition for WaitForSingleObject to become signaled is never fullfiled.

Check the "TRORemoteDataModule" and what it does in the constructor. Because as it seems it only inherits from TDataModule and ads some of its own logic. This should be the cause for deadlock.

EDIT:

Ok I checked the TDataModule code:

constructor TDataModule.Create(AOwner: TComponent);
begin
  GlobalNameSpace.BeginWrite;
  try
    CreateNew(AOwner);
    if (ClassType <> TDataModule) and not (csDesigning in ComponentState) then
    begin
      if not InitInheritedComponent(Self, TDataModule) then
        raise EResNotFound.CreateFmt(SResNotFound, [ClassName]);
      if OldCreateOrder then DoCreate;
    end;
  finally
    GlobalNameSpace.EndWrite;
  end;
end;

It uses "GlobalNameSpace" singleton interface which is defined and implemented like this:

var
  GlobalNameSpace: IReadWriteSync;

initialization
{$IFDEF MSWINDOWS}
  GlobalNameSpace := TMultiReadExclusiveWriteSynchronizer.Create;
{$ENDIF}
...

So here is your lock. Now funny thing is why it happens? It seems like a standard piece of VCL code. Maybe you can share some more light into what the application is doing.

And more, you have the means to look for debug output. This is defined in the class:

procedure TMultiReadExclusiveWriteSynchronizer.BeginRead;
var
  Thread: PThreadInfo;
  WasRecursive: Boolean;
  SentValue: Integer;
begin
{$IFDEF DEBUG_MREWS}
  Debug('Read enter');
{$ENDIF}
...

So if "DEBUG_MREWS" is defined you will get debug info via "OutputDebugString". It should sheed some light into the problem. To me this looks like a components initialization race condition. :)

Runner
EurekaLog uses somekind of post processor to add function <-> address mapping, so it should be up-to-date.The first line of TRORemoteDataModule is "inherited", so no custom logic is called before TDataModule.Create is called.
Harriv
Then something else calls "BeginWrite". The problem obviously starts at "TMultiReadExclusiveWriteSynchronizer.BeginWrite". Can you check all application code for that function call. Maybe it is run in parallel or background.
Runner
Ok, I updated the answer with new info.
Runner
If you enable DEBUG_MREWS - this will have no effect, until you recompile RTL and VCL, which is... quite painfully. A more simper approach is to set logging breapoint as I've described in my answer. You set 4 breakpoints and run your app. No recompilation of RTL/VCL needed. Though my method have limitations: new Delphi only (no support for logging breakpoints in D7), Use Debug DCUs option should be enabled and no run-time packages.
Alexander
Hm, forgot that you have to recompile RTL/VCL. I agree this is more trouble than it is worth.
Runner
+1  A: 

The last call is in TMultiReadExclusiveWriteSynchronizer, but it does not show in which method. There it is probably calling windows sync functions, and it gets blocked there. Are you using RemObjects libraries? Something is happening there. You can use SysInternals Process Explorer configured to download Windows symbols and inspect the stack trace while the process is running, if you convert Delphi symbols in .dbg format you will have the whole explict call stack.

ldsandon
Yes, I'm using RemObjects. If we can trust the stack trace, TMultiReadExclusiveWriteSynchronizer is called when TRODataModule is created, which is inherited from TDataModule, which the acquires the GlobalNameSpace lock during Create.
Harriv
+1  A: 

Can you reproduce this hang under debugger? If yes - then set breakpoints on Begin/EndWrite/Read methods of GlobalNameSpace. Enter advanced properties of each breakpoint and turn of "Break" option and turn on "Log call stack" option.

Now, run your application and let it hang. Check the events window. You'll see full history of GlobalNameSpace's call. Analyze it. There must be some wrong call somewhere. Probably, the last one.

Alternatively, if you're on Vista and above, you can use Wait Chain Traversal feature, but I forgot, which Delphi has it. Just let your application hang and see Threads window: there should be reason for blocking.

In rare cases, hang can be caused by memory corruption. For example, bad code may erase/damage "Free"/"Busy" flag, so the first code, who will try to acquire a lock, will wait forever as there is nobody to release a lock (since it wasn't locked, it was damaged).

Alexander
I haven't managed to reproduce this in debugger or in development environment.
Harriv
Have you reproduced it at all? On client machine?
Alexander
+1  A: 

I've had the same problem with closing my application and creating a TForm or TDatamodule in another thread: because the app is terminating, the GlobalNameSpace (see other answer) is locked(!), so the create of my TRORemoteDataModule in another thread was locked. Because I used another lock to wait for that thread I had a deadlock :-(.

In your case you must take a look at other threads, to find the one that has placed the lock (and is probably waiting otherwise you wouldn't have deadlock). I see a "SyncSentrolUpdateTimerTimer": maybe you do a TThread.Synchronize in a create of a TDatamodule in an other thread?

EDIT: if you want to look at the stack of the other threads while in the deadlock:

  1. Try Process Explorer -> process -> thread -> stack (but you need map2dbg.exe to convert a Delphi .map file to a Microsoft .dbg file to see debug symbols)
    http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
    http://code.google.com/p/map2dbg/
  2. Try my open source AsmProfiler (sampling mode): it has a "live view" to see the stack of all threads, and it uses Delphi debug symbols (.map, .jdbg, etc) so you don't need map2dbg.exe (btw: I made this Live View because I use Processes explorer a lot and was annoyed to use map2dbg every time)
    http://code.google.com/p/asmprofiler/wiki/ProcessStackViewer
André
I've tried to avoid Synchronize calls, but I'll check.
Harriv
"I've tried to avoid Synchronize calls" - actually, this may be a reason for a problem. For example, if you've tried "too hard" - you may access VCL from non-main thread, which is a good reason for hung.
Alexander