Hi

I have an application that I have converted to Delphi 2009. I have "String format checking" on and use the standard memory manager. I downloaded the MS debugging tools from http://www.microsoft.com/whdc/devtools/debugging/install64bit.mspx and got some debug files, but I am not sure what to make of them. I would like some pointers on where to go from here. Below is the top part of the debug file (the bottom part lists all the loaded drivers):

Opened log file 'c:\debuglog.txt'
1: kd> .sympath srv*c:\symbols*http://msdl.microsoft.com/downloads/symbols
Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/downloads/symbols
Expanded Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/downloads/symbols
1: kd> .reload;!analyze -v;r;kv;lmnt;.logclose;q
Loading Kernel Symbols
...............................................................
................................................................
.........................
Loading User Symbols
Loading unloaded module list
........
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

UNEXPECTED_KERNEL_MODE_TRAP (7f)
This means a trap occurred in kernel mode, and it's a trap of a kind
that the kernel isn't allowed to have/catch (bound trap) or that
is always instant death (double fault).  The first number in the
bugcheck params is the number of the trap (8 = double fault, etc)
Consult an Intel x86 family manual to learn more about what these
traps are. Here is a *portion* of those codes:
If kv shows a taskGate
        use .tss on the part before the colon, then kv.
Else if kv shows a trapframe
        use .trap on that value
Else
        .trap on the appropriate frame will show where the trap was taken
        (on x86, this will be the ebp that goes with the procedure KiTrap)
Endif
kb will then show the corrected stack.
Arguments:
Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT
Arg2: 0000000080050033
Arg3: 00000000000006f8
Arg4: fffff80001ee1678

Debugging Details:
------------------


BUGCHECK_STR:  0x7f_8

CUSTOMER_CRASH_COUNT:  4

DEFAULT_BUCKET_ID:  COMMON_SYSTEM_FAULT

PROCESS_NAME:  SomeApplication.e

CURRENT_IRQL:  1

EXCEPTION_RECORD:  fffffa60087b43c8 -- (.exr 0xfffffa60087b43c8)
.exr 0xfffffa60087b43c8
ExceptionAddress: fffff80001ed0150 (nt!RtlVirtualUnwind+0x0000000000000250)
   ExceptionCode: 10000004
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 00000000000000d8

TRAP_FRAME:  fffffa60087b4470 -- (.trap 0xfffffa60087b4470)
.trap 0xfffffa60087b4470
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000050 rbx=0000000000000000 rcx=0000000000000004
rdx=00000000000000d8 rsi=0000000000000000 rdi=0000000000000000
rip=fffff80001ed0150 rsp=fffffa60087b4600 rbp=fffffa60087b4840
 r8=0000000000000006  r9=fffff80001e4e000 r10=ffffffffffffff88
r11=fffff8000204c000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
nt!RtlVirtualUnwind+0x250:
fffff800`01ed0150 488b02          mov     rax,qword ptr [rdx] ds:00000000`000000d8=????????????????
.trap
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff80001ea81ee to fffff80001ea8450

STACK_TEXT:  
fffffa60`005f1a68 fffff800`01ea81ee : 00000000`0000007f 00000000`00000008 00000000`80050033 00000000`000006f8 : nt!KeBugCheckEx
fffffa60`005f1a70 fffff800`01ea6a38 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiBugCheckDispatch+0x6e
fffffa60`005f1bb0 fffff800`01ee1678 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDoubleFaultAbort+0xb8
fffffa60`087b3c90 fffff800`01ea82a9 : fffffa60`087b43c8 00000000`00000001 fffffa60`087b4470 00000000`0000023b : nt!KiDispatchException+0x34
fffffa60`087b4290 fffff800`01ea70a5 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : nt!KiExceptionDispatch+0xa9
fffffa60`087b4470 fffff800`01ed0150 : fffffa60`087b5498 fffffa60`087b4e70 fffff800`01f95190 fffff800`01e4e000 : nt!KiPageFault+0x1e5
fffffa60`087b4600 fffff800`01ed3f78 : fffffa60`00000001 00000000`00000000 00000000`00000000 ffffffff`ffffff88 : nt!RtlVirtualUnwind+0x250
fffffa60`087b4670 fffff800`01ee1706 : fffffa60`087b5498 fffffa60`087b4e70 fffffa60`00000000 00000000`00000000 : nt!RtlDispatchException+0x118
fffffa60`087b4d60 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDispatchException+0xc2


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt!KiDoubleFaultAbort+b8
fffff800`01ea6a38 90              nop

SYMBOL_STACK_INDEX:  2

SYMBOL_NAME:  nt!KiDoubleFaultAbort+b8

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  49e0237f

FAILURE_BUCKET_ID:  X64_0x7f_8_nt!KiDoubleFaultAbort+b8

BUCKET_ID:  X64_0x7f_8_nt!KiDoubleFaultAbort+b8

Followup: MachineOwner
---------

rax=fffffa60005f1b70 rbx=fffffa60087b43c8 rcx=000000000000007f
rdx=0000000000000008 rsi=fffffa60087b4470 rdi=fffff80001f9bfa4
rip=fffff80001ea8450 rsp=fffffa60005f1a68 rbp=fffffa60005f1c30
 r8=0000000080050033  r9=00000000000006f8 r10=fffff80001ee1678
r11=fffffa60087b4468 r12=0000000000000000 r13=fffffa60087b4290
r14=fffff8000205149c r15=fffff80001e4e000
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000282
nt!KeBugCheckEx:
fffff800`01ea8450 48894c2408      mov     qword ptr [rsp+8],rcx ss:0018:fffffa60`005f1a70=000000000000007f
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffffa60`005f1a68 fffff800`01ea81ee : 00000000`0000007f 00000000`00000008 00000000`80050033 00000000`000006f8 : nt!KeBugCheckEx
fffffa60`005f1a70 fffff800`01ea6a38 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiBugCheckDispatch+0x6e
fffffa60`005f1bb0 fffff800`01ee1678 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDoubleFaultAbort+0xb8 (TrapFrame @ fffffa60`005f1bb0)
fffffa60`087b3c90 fffff800`01ea82a9 : fffffa60`087b43c8 00000000`00000001 fffffa60`087b4470 00000000`0000023b : nt!KiDispatchException+0x34
fffffa60`087b4290 fffff800`01ea70a5 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000001 : nt!KiExceptionDispatch+0xa9
fffffa60`087b4470 fffff800`01ed0150 : fffffa60`087b5498 fffffa60`087b4e70 fffff800`01f95190 fffff800`01e4e000 : nt!KiPageFault+0x1e5 (TrapFrame @ fffffa60`087b4470)
fffffa60`087b4600 fffff800`01ed3f78 : fffffa60`00000001 00000000`00000000 00000000`00000000 ffffffff`ffffff88 : nt!RtlVirtualUnwind+0x250
fffffa60`087b4670 fffff800`01ee1706 : fffffa60`087b5498 fffffa60`087b4e70 fffffa60`00000000 00000000`00000000 : nt!RtlDispatchException+0x118
fffffa60`087b4d60 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiDispatchException+0xc2
A:

The help file for WinDbg goes into more detail on the various kernel-mode bug checks and what to do about them. I don't know your level of expertise or what you are expecting here, but generally speaking, nothing you do in a user-mode program such as a Delphi application should be able to cause a bug check by itself. Therefore we would normally start from the assumption of a driver bug or some kind of hardware failure.

I entered UNEXPECTED_KERNEL_MODE_TRAP into the help index and got this page:

Windows Driver Kit: Debugging Tools
Bug Check 0x7F: UNEXPECTED_KERNEL_MODE_TRAP

The UNEXPECTED_KERNEL_MODE_TRAP bug check has a value of 0x0000007F. This bug check indicates that the Intel CPU generated a trap and the kernel failed to catch this trap.

This trap could be a bound trap (a trap the kernel is not permitted to catch) or a double fault (a fault that occurred while processing an earlier fault, which always results in a system failure).
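As a rough illustration of how the first bug check parameter maps to a trap, here is a minimal Python sketch. The vector numbers are the standard Intel exception vectors; the function name describe_trap and the short table are my own, and the table is deliberately incomplete.

```python
# Intel-defined exception vectors that can show up as Arg1 of bug check 0x7F.
# This table is a partial, illustrative selection, not the full list.
TRAP_NAMES = {
    0x00: "Divide Error",
    0x04: "Overflow",
    0x05: "BOUND Range Exceeded",
    0x06: "Invalid Opcode",
    0x08: "Double Fault",
}


def describe_trap(arg1_hex):
    """Turn the Arg1 text printed by '!analyze -v' into a readable trap name."""
    vector = int(arg1_hex, 16)
    name = TRAP_NAMES.get(vector, "unlisted vector")
    return f"trap 0x{vector:02X}: {name}"


# Arg1 as it appears in the log in the question.
print(describe_trap("0000000000000008"))
```

For the dump in the question this reports vector 8, which is why the rest of the quoted page concentrates on double faults.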

elided...

0x00000008, or Double Fault, indicates that an exception occurs during a call to the handler for a prior exception. Typically, the two exceptions are handled serially. However, there are several exceptions that cannot be handled serially, and in this situation the processor signals a double fault. There are two common causes of a double fault:

A kernel stack overflow. This overflow occurs when a guard page is hit, and the kernel tries to push a trap frame. Because there is no stack left, a stack overflow results, causing the double fault. If you think this overflow has occurred, use !thread to determine the stack limits, and then use kb (Display Stack Backtrace) with a large parameter (for example, kb 100) to display the full stack.

A hardware problem.
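The stack-overflow check described above boils down to comparing the stack pointer with the limits that !thread reports. A minimal sketch in Python follows; the helper stack_headroom and every address in it are hypothetical and are not taken from this dump.

```python
def stack_headroom(rsp, stack_limit, stack_base):
    """Bytes left between the stack pointer and the lowest valid stack address.

    The kernel stack grows downward from stack_base toward stack_limit,
    so a tiny or negative result points at a stack overflow.
    """
    if not stack_limit < stack_base:
        raise ValueError("stack_limit must be below stack_base")
    return rsp - stack_limit


# HYPOTHETICAL values, standing in for what !thread and the trap frame show.
limit = 0x1000   # lowest valid stack address
base = 0x7000    # highest stack address (where the stack starts)

print(stack_headroom(0x1010, limit, base))  # nearly exhausted
print(stack_headroom(0x0FF0, limit, base))  # negative: already past the limit
```

If the headroom at the faulting frame is comfortable, a stack overflow becomes less likely and the hardware explanation deserves more attention.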

Cause

Bug check 0x7F typically occurs after you install faulty or mismatched hardware (especially memory) or if installed hardware fails.

A double fault can occur when the kernel stack overflows. This overflow occurs if multiple drivers are attached to the same stack. For example, if two file system filter drivers are attached to the same stack and then the file system recurses back in, the stack overflows.

elided...

It goes on into a lot more detail about that, and the various debugging techniques and what you can do to troubleshoot the problem.
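Since the log reports a CUSTOMER_CRASH_COUNT of 4, it may help to compare the bug check arguments across dumps. The following Python sketch pulls the ArgN values out of the text that "!analyze -v" writes; the function name bugcheck_args is hypothetical, and the sample text is copied from the log in the question.

```python
import re

# Matches lines such as "Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT".
ARG_LINE = re.compile(r"^Arg(\d): ([0-9a-fA-F]+)")


def bugcheck_args(log_text):
    """Return {argument number: integer value} for every ArgN line found."""
    args = {}
    for line in log_text.splitlines():
        match = ARG_LINE.match(line.strip())
        if match:
            args[int(match.group(1))] = int(match.group(2), 16)
    return args


sample = """Arguments:
Arg1: 0000000000000008, EXCEPTION_DOUBLE_FAULT
Arg2: 0000000080050033
Arg3: 00000000000006f8
Arg4: fffff80001ee1678
"""

args = bugcheck_args(sample)
for number in sorted(args):
    print(f"Arg{number} = 0x{args[number]:x}")
```

Crashes with different arguments or different faulting modules each time tend to support the hardware theory, although that is a rule of thumb rather than a proof.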

1800 INFORMATION
Thanks for this. I was also under the impression that a user-mode application couldn't cause a BSOD, but I wasn't quite sure. It seemed too much of a coincidence that it happened only with the Delphi 2009 build, at the same point in the app, even after multiple reboots; the failing memory must have been in just the right place. I thought 64-bit Vista had address randomization. Anyway, it turns out the memory on that machine is failing (tested with MS "Windows Memory Diagnostic" and MemTest86+). Thanks again.
Bruce
The description above is not correct, and MSDN is also wrong on this. What the bug check means is that the kernel stack was not large enough. The kernel stack is guarded by an invalid page, so when you touch an address on that page the processor generates a page fault, tries to push registers onto the stack, and gets a second page fault. This is what “double fault” means.
steve