I was recently working on a Windows program that would sometimes become unresponsive when scrolling through a large list of items in a production environment. Of course, it works fine on my desktop. The production environment is:

  • Windows XP based Workstation with 2 monitors
  • nVidia Video Drivers with nView enabled

Of note is a Dr. Watson stack trace generated when the process is terminated:

State Dump for Thread Id 0xef4

eax=00e3fff8 ebx=000000a0 ecx=00e00000 edx=00000000 esi=0003fff8 edi=00e40000
eip=00b920c2 esp=0012bcac ebp=00000000 iopl=0         nv up ei ng nz na pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000283

\system32\nview.dll - 
function: nview!NVLoadDatabase
        00b920a8 c80b0600         enter   0x60b,0x0
        00b920ac 83c30f           add     ebx,0xf
        00b920af 33f6             xor     esi,esi
        00b920b1 03f9             add     edi,ecx
        00b920b3 83e3f8           and     ebx,0xfffffff8
        00b920b6 3bcf             cmp     ecx,edi
        00b920b8 89742414         mov     [esp+0x14],esi
        00b920bc 734c             jnb     nview!NVLoadDatabase+0xcaf (00b9210a)
        00b920be 8bc1             mov     eax,ecx
        00b920c0 8b10             mov     edx,[eax]
        00b920c2 8b4004           mov     eax,[eax+0x4]     ds:0023:00e3fffc=00000000
        00b920c5 89442414         mov     [esp+0x14],eax
        00b920c9 8bc2             mov     eax,edx
        00b920cb 2500000001       and     eax,0x1000000
        00b920d0 33ed             xor     ebp,ebp
        00b920d2 0bc5             or      eax,ebp
        00b920d4 7414             jz      nview!NVLoadDatabase+0xc8f (00b920ea)
        00b920d6 8bc2             mov     eax,edx
        00b920d8 c1e008           shl     eax,0x8
        00b920db 8be8             mov     ebp,eax
        00b920dd c1f81f           sar     eax,0x1f

ChildEBP RetAddr  Args to Child              
00000000 00000000 00000000 00000000 00000000 nview!NVLoadDatabase+0xc67

Why did this problem only occur in production?

A:

This is interesting because nView is a third-party DLL provided by NVIDIA. Postings on the internet about nview!NVLoadDatabase suggest that there is an unpatched defect in nView. This is supported by reports that Explorer uses 100% CPU when the problem occurs. See: http://forums.nvidia.com/lofiversion/index.php?t36879.html

A detailed investigation of this problem is available on this site: http://blogs.technet.com/marcelofartura/archive/2007/02/28/real-case-random-apps-running-100-cpu.aspx

As per this article, the hang is due to an infinite loop in nview.dll. Although the assembly instructions and register values described online do not exactly match those in our log, they were close enough for me to conclude that it is the same issue.
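For anyone comparing their own log against the article, it can help to pull the register values out of the Dr. Watson dump programmatically instead of reading them by eye. Below is a minimal sketch in Python, assuming the register lines look like the ones in the dump above. The helper name parse_registers is hypothetical and not part of any tool.

```python
import re

# Register lines copied from the Dr. Watson dump in the question.
DUMP = """\
eax=00e3fff8 ebx=000000a0 ecx=00e00000 edx=00000000 esi=0003fff8 edi=00e40000
eip=00b920c2 esp=0012bcac ebp=00000000 iopl=0         nv up ei ng nz na pe cy
"""

def parse_registers(text):
    """Hypothetical helper: map each 2-3 letter name=XXXXXXXX pair to an int."""
    pairs = re.findall(r"\b([a-z]{2,3})=([0-9a-f]{8})\b", text)
    return {name: int(value, 16) for name, value in pairs}

regs = parse_registers(DUMP)

# How far eax has advanced past the address held in ecx.
offset = regs["eax"] - regs["ecx"]
print(hex(offset))                      # 0x3fff8
print(offset == regs["esi"])            # True
print(hex(regs["edi"] - regs["ecx"]))   # 0x40000
```

With these values, eax - ecx equals esi (0x3fff8) and edi - ecx is 0x40000. That looks consistent with a loop walking a buffer between the addresses in ecx and edi, though that reading is an interpretation of the dump rather than something the log states.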

To work around the problem, I disabled nView Desktop Manager (right-click the desktop, select nView Properties, and click Disable in the nView Desktop Manager group box). Before doing this I could consistently reproduce the hang; after disabling nView I could not reproduce it at all. Thus, this appears to be a viable workaround.

Anyway, I posted this here in case it is useful to anyone. Chasing this one down caused me a LOT of grief.

Justin Ethier