We have a large legacy VB app made up of a number of DLLs (a couple of dozen or so), all installed into a single COM+ Server Application. Every now and then, something happens that causes dllhost.exe to keel over (and automatically restart), leaving this message in the Windows Application Event log...

The system has called a custom component and that component has failed and generated an exception. This indicates a problem with the custom component. Notify the developer of this component that a failure has occurred and provide them with the information below.
Server Application ID: {8CC02F18-2733-4A17-9E5C-1A70CB6B6977}
Server Application Instance ID: {1940A147-8A5E-45FA-86FE-DAF92A822597}
Server Application Name: MyTestApp
The serious nature of this error has caused the process to terminate.
Exception: C0000005
Address: 0x758DA3DA

Source: Complus
Event ID: 4786
Level: Error

Alongside this is another entry, specifically for dllhost.exe...

Faulting application name: dllhost.exe, version: 6.0.6000.16386, time stamp: 0x4549b14e
Faulting module name: msvcrt.dll, version: 7.0.6002.18005, time stamp: 0x49e0379e
Exception code: 0xc0000005
Fault offset: 0x0000a3da
Faulting process id: 0x83c
Faulting application start time: 0x01cb50c507ee0166
Faulting application path: %11
Faulting module path: %12
Report Id: %13

I know it's flagging a failure in the C runtime (msvcrt), but ideally I need to trace this back into the DLL that's called into msvcrt (probably with bad data/parameters). So without installing a debugger, is there any way to identify the DLL that causes this? I'm trying to see if there's a memory dump anywhere I can use to analyse offline - and thus tie the Address to something specific. But without that, I'm not sure that's possible. Can the COM subsystem be told to generate a minidump when a hosted application crashes? (yes it can [probably] - there's a checkbox on the 'Dump' tab).
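As a rough illustration of what "tying the Address to something specific" involves: subtracting the Fault offset from the Address in the two log entries gives the base address that msvcrt.dll was loaded at for that run, and the same range check attributes any other address (for example a return address on the stack) to a module. This is only a sketch; the module sizes and the `MyLegacy.dll` entry are made up for the example, and a real module list would come from the dump.

```python
# Values copied from the two event-log entries above.
EXCEPTION_ADDRESS = 0x758DA3DA  # "Address" in the COM+ event
FAULT_OFFSET = 0x0000A3DA       # "Fault offset" in the dllhost.exe event


def module_base(address, offset):
    """Base address of the faulting module: absolute address minus offset."""
    return address - offset


def find_module(address, modules):
    """Return the name of the module whose [base, base + size) range holds address."""
    for name, base, size in modules:
        if base <= address < base + size:
            return name
    return None


base = module_base(EXCEPTION_ADDRESS, FAULT_OFFSET)
print(hex(base))  # 0x758d0000

# Hypothetical module list: the sizes and MyLegacy.dll are assumptions.
modules = [
    ("msvcrt.dll", base, 0xAA000),
    ("MyLegacy.dll", 0x11000000, 0x40000),
]
print(find_module(EXCEPTION_ADDRESS, modules))  # msvcrt.dll
```

This only confirms the faulting module; finding the caller still needs the stack of the faulting thread, which is why a dump is required.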

This is on Windows Server 2008 R1 32-bit (but I'd also be interested in an answer for Server 2003).

It doesn't affect availability of the app -- COM+ simply restarts dllhost and the application continues, but it is an inconvenience that would be useful to fix.

Edit: Okay, I've got a crash dump and I've got windbg, but it's not helping. Not sure if I'm being thick (a possibility) or something else :-) Output of !analyze -v is below, but it's not showing me anything in our DLLs, although it looks like it hasn't been able to resolve FAULTING_IP? I'm not sure where to turn next.

I'm wondering whether any of my PDBs are dodgy and whether it would be worth generating new ones. I'm hooked into Microsoft's symbol server, so they shouldn't be, but I'm not sure which module it's (apparently) reporting wrong symbols for (BUGCHECK_STR and PRIMARY_PROBLEM_CLASS). Or is it looking for symbols on the server the code was originally running on? Would it be better to put the PDBs on the server itself?

If not, any other ideas? I've used windbg briefly before, but I'm no regular user of it, so maybe there are some more incantations I need to type to dig deeper? Guidance welcome :-)
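One thing worth checking when reading the output is whether the exception code in the dump matches the one in the event log. A small sketch for putting names on the two codes involved here; the lookup table only covers these two values, and the severity is taken from the top two bits of the NTSTATUS value.

```python
# Only the two codes that appear in this question; not a complete table.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0x80000003: "STATUS_BREAKPOINT",
}

# NTSTATUS severity lives in the two most significant bits.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def describe(code):
    """Format an NTSTATUS value with its name (if known) and severity."""
    severity = SEVERITY[(code >> 30) & 0x3]
    name = KNOWN_CODES.get(code, "unknown")
    return f"0x{code:08X} {name} ({severity})"


print(describe(0xC0000005))  # code reported in the event log
print(describe(0x80000003))  # code reported by !analyze -v
```

If the two differ, as they do here, that may suggest the dump was not captured at the moment of the original access violation, so the thread that !analyze picks is not necessarily the one that faulted.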

*******************************************************************************
*                                                                             *
*                        Exception Analysis                                   *
*                                                                             *
*******************************************************************************

FAULTING_IP: 
+5c112faf02e0d82c
00000000 ??              ???

EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 00000000
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 0

FAULTING_THREAD:  00000f1c
DEFAULT_BUCKET_ID:  WRONG_SYMBOLS
PROCESS_NAME:  dllhost.exe
ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.
EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid
MOD_LIST: <ANALYSIS/>
NTGLOBALFLAG:  0
APPLICATION_VERIFIER_FLAGS:  0
MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0xf1c (0)
Current frame: 
ChildEBP RetAddr  Caller,Callee

LAST_CONTROL_TRANSFER:  from 77b15620 to 77b15e74
PRIMARY_PROBLEM_CLASS:  WRONG_SYMBOLS
BUGCHECK_STR:  APPLICATION_FAULT_WRONG_SYMBOLS

STACK_TEXT:  
0022fa68 77b15620 77429884 00000064 00000000 ntdll!KiFastSystemCallRet
0022fa6c 77429884 00000064 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0022fadc 774297f2 00000064 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xbe
0022faf0 778e2c44 00000064 ffffffff 00e42374 kernel32!WaitForSingleObject+0x12
0022fb0c 778e2e32 00060848 0022fb5b 00000000 ole32!CSurrogateProcessActivator::WaitForSurrogateTimeout+0x55
0022fb24 00e413a4 0022fb40 00000000 00061d98 ole32!CoRegisterSurrogateEx+0x1e9
0022fcb0 00e41570 00e40000 00000000 00061d98 dllhost!WinMain+0xf2
0022fd40 7742d0e9 7ffde000 0022fd8c 77af19bb dllhost!_initterm_e+0x1a1
0022fd4c 77af19bb 7ffde000 dc2ccd29 00000000 kernel32!BaseThreadInitThunk+0xe
0022fd8c 77af198e 00e416e6 7ffde000 ffffffff ntdll!__RtlUserThreadStart+0x23
0022fda4 00000000 00e416e6 7ffde000 00000000 ntdll!_RtlUserThreadStart+0x1b

STACK_COMMAND:  .cxr 00000000 ; kb ; dt ntdll!LdrpLastDllInitializer BaseDllName ; dt ntdll!LdrpFailureData ; ~0s; .ecxr ; kb

FOLLOWUP_IP: 
dllhost!WinMain+f2
00e413a4 ff15a410e400    call    dword ptr [dllhost!_imp__CoUninitialize (00e410a4)]

SYMBOL_STACK_INDEX:  6
SYMBOL_NAME:  dllhost!WinMain+f2
FOLLOWUP_NAME:  MachineOwner
MODULE_NAME: dllhost
IMAGE_NAME:  dllhost.exe
DEBUG_FLR_IMAGE_TIMESTAMP:  4549b14e
FAILURE_BUCKET_ID:  WRONG_SYMBOLS_80000003_dllhost.exe!WinMain
BUCKET_ID:  APPLICATION_FAULT_WRONG_SYMBOLS_dllhost!WinMain+f2
A: 

Do you have symbols for the VB DLLs? Symbols are important for getting the call stack, so I hope you have the correct ones. Within windbg you can use ld * and then lme, which should give you a list of the modules whose symbols did not match. Also set the symbol path for both the MS symbols and your custom code using _NT_SYMBOL_PATH.

One of the easiest options is to load the dump in DebugDiag, which should give you the reason for the failure along with the call stack. DebugDiag has debugger extensions for COM+.

And here is a command to get the native call stack for all the threads
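As a sketch of what such a symbol path could look like, the snippet below builds an _NT_SYMBOL_PATH value that combines the Microsoft symbol server (via a local cache) with a directory of private PDBs. The `srv*cache*url` form is the standard symbol-server syntax; the two local directories are placeholders, not paths from the question.

```python
import os

MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_cache, private_pdb_dirs):
    """Join a symbol-server entry and private PDB directories with ';'."""
    parts = [f"srv*{local_cache}*{MS_SYMBOL_SERVER}"]
    parts.extend(private_pdb_dirs)
    return ";".join(parts)


# Both directories are hypothetical; substitute your own cache and PDB folders.
symbol_path = build_symbol_path(r"C:\symcache", [r"C:\MyTestApp\pdb"])

# This affects only the current process and anything it launches afterwards.
os.environ["_NT_SYMBOL_PATH"] = symbol_path
print(symbol_path)
```

The PDBs for the VB DLLs need to come from the same build as the binaries in the dump, otherwise the debugger will report a mismatch for them.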

~*ek

and this one switches to the context of the current exception

.ecxr
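With 40-odd threads, reading every stack by eye is tedious. If the stacks are saved to text, a small script can flag threads whose stack passes through the usual exception-dispatch routines. This is only a sketch: it assumes the stacks have already been split into a list of frames per thread, and the frames shown for thread 7 (including `MyLegacy!SomeRoutine`) are invented for the example. Thread 0 uses frames from the dump in the question.

```python
# Frames that usually appear on a thread that is in the middle of
# handling an exception.
EXCEPTION_MARKERS = (
    "ntdll!KiUserExceptionDispatcher",
    "kernel32!UnhandledExceptionFilter",
)


def threads_with_exception_frames(stacks):
    """Return ids of threads whose stack contains an exception-dispatch frame."""
    hits = []
    for thread_id, frames in stacks.items():
        if any(frame.startswith(EXCEPTION_MARKERS) for frame in frames):
            hits.append(thread_id)
    return hits


stacks = {
    # Thread 0: the idle main thread shown in the question.
    0: [
        "ntdll!KiFastSystemCallRet",
        "ntdll!NtWaitForSingleObject+0xc",
        "kernel32!WaitForSingleObjectEx+0xbe",
        "ole32!CoRegisterSurrogateEx+0x1e9",
        "dllhost!WinMain+0xf2",
    ],
    # Thread 7: hypothetical frames, made up for illustration.
    7: [
        "kernel32!UnhandledExceptionFilter+0x127",
        "ntdll!KiUserExceptionDispatcher+0xf",
        "msvcrt!memcpy+0x5a",
        "MyLegacy!SomeRoutine+0x40",
    ],
}

print(threads_with_exception_frames(stacks))  # [7]
```

The frames just below the dispatcher on a flagged thread are the ones to look at for the calling DLL.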
Naveen
Ah - DebugDiag - that's given me something to work with :-) It's shown the PDBs are (probably) correct as well. I'll be able to confirm tomorrow.
Chris J
A: 

Debug Mon / WinDbg is the best way to troubleshoot this issue. You should be able to use the modules list in WinDbg, or the lm command, to list the loaded modules. The stack trace should then tell you which DLLs are involved. This should be possible even without symbols for the process or DLL.

Mike
The problem was that I couldn't see any way to easily locate the thread that threw the exception in windbg -- you can see from the dump above that !analyze didn't give me anything useful to work with. Naveen's pointed me to DebugDiag, which did locate the thread, and from there I've been able to dig in further. It'd be nice to know, however, whether there is an easy way to find the exception amongst 40-odd threads, or is it simply a case of switching to each thread and looking at the backtrace manually?
Chris J
Updated my answer with the commands to get call-stacks and display current exception in the context.
Naveen