Sorry, I couldn't figure out a good way to phrase my real question.

I run a high-traffic ASP.NET site on a 64-bit machine. IIS runs in 32-bit mode, however, because of some legacy components in the app. The web app runs inside an application pool that has the web garden option on (6 worker processes on an 8-core machine).

Once or twice a week one of the processes will skyrocket into 100% CPU utilization, causing a giant slowdown for the site, so my plan was to wait until that happens, memory dump the offending process, then poke around WinDbg to zero in on the thread that's spiking to see where the code is spinning its wheels.

I've debugged using WinDbg before to figure out what was causing a deadlock on the site, but that was several months ago and I can't remember how I got it working. (As a side note, this is a lesson to document everything you do.)

I'm running WinDbg on the Windows 2003 server that hosts the site, so as to prevent any DLL version problems. Here are my steps so far; please let me know where I'm going wrong to end up with the error message I'm getting.

  1. I first memory dump the spiking process using UserDump, with the following command, where 3389 is the ID of the process:

    userdump -k 3389

  2. I load the dump into the x86 edition of WinDbg.

  3. Since I'm running 32-bit on a 64-bit machine, I first load the memory dump and then:

    .load wow64exts

    .effmach x86

  4. I make sure that my symbol path includes the directory that contains my app's PDB files:

    .sympath+ c:\inetpub\myapp\bin

  5. Running just `.load SOS` fails with an error of "The system cannot find the file specified", so I go the fully qualified route with the following, which works:

    .load c:\windows\microsoft.net\framework\v2.0.50727\sos

From here, I'm lost. I try any of the SOS commands, like !threads, only to get this error:

Failed to load data access DLL, 0x80004005

That error is also accompanied by the numbered list of items that I should be verifying. I have verified that I am running the most current version of the debugger, mscordacwks.dll is in fact in the same directory as the mscorwks.dll file, and I'm debugging on the same architecture as the dump file.

I've also run the magical ".cordll -ve -u -l" command, but that doesn't solve anything. I'm always greeted with "CLR DLL status: No load attempts" when I execute that. Then I try ".reload", which yields a handful of warnings like "WARNING: wldap32 overlaps dnsapi". I wish it said something like "CLRDLL: Loaded DLL C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordacwks.dll". But it doesn't.

What in the world, people? Debugging shouldn't be a hair-pulling-out process just to even start debugging!

+2  A: 

Try executing !sw before running the sos commands. See this blog post.

Rob Walker
I think !sw just toggles between 32-bit and 64-bit mode, which is essentially the same thing as `.effmach x86`. I tried it anyway, but it didn't help. Thanks for giving me something to try, though.
Ken Randall
+2  A: 

In my experience, a spiking app pool can be caused by the pool being recycled. Have you tried the IIS Crash / Hang agent and IIS Dump?

http://www.microsoft.com/downloads/details.aspx?FamilyID=01c4f89d-cc68-42ba-98d2-0c580437efcf&DisplayLang=en

Also included with them is a dumpfile analyzer which will tell you about memory leaks and even suggest areas of your code that need fixing (complete with links to the applicable MSKB articles!)

Dave R
A: 

The symptom you describe sounds like a generation 2 garbage collection (a Gen 2 GC).

Mitch Wheat
A: 

Dude - not sure if this helps, but maybe try this.

  1. Copy c:\windows\microsoft.net\framework\v2.0.50727\sos.dll to the same directory where windbg is installed to (eg. c:\program files\Debugging Tools for Windows\ ). Why? make it easy to load the sos file
  2. run windbg
  3. load the memory dump file. for me, i use ctrl-D or File -> open crash dump
  4. .load sos <-- take note of the fullstop BEFORE the load command
  5. .symfix c:\temp\debug_symbols
  6. .reload

Ok.. take note of the commandline. this tells me the current THREAD that the dump was in. That might be useless for a HIGH CPU scenario .. because we could be in any thread.

so from here i look at the threads that were running and check out the busiest thread

  8. !threadpool <-- this is so i can see the cpu utilization to check we are in a crap (busy) state... eg 100% cpu or what not.

  9. !runaway <-- lists how much CPU time each thread has used, busiest first... eg.

0:027> !runaway
User Mode Time
Thread       Time
18:704       0 days 0:00:17.843   <-- Thread #18
19:9f4       0 days 0:00:13.328   <-- Thread #19
16:1948      0 days 0:00:10.718
26:a7c       0 days 0:00:01.375
24:114       0 days 0:00:01.093
27:d54       0 days 0:00:00.390
28:1b70      0 days 0:00:00.328
0:b7c       0 days 0:00:00.171
25:3f8       0 days 0:00:00.000
23:1968      0 days 0:00:00.000

threads 18 and 19 have used the most CPU time.. hmm.... are they stuck in a loop?
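
If the thread list is long, a small script can pick out the busiest threads for you. This is only a rough sketch, assuming you pasted the !runaway output into a string; the `parse_runaway` helper is a made-up name, not part of WinDbg or SOS. It also converts the hex OS thread id (the part after the colon) to decimal, which is handy when matching against other tools.

```python
import re
from datetime import timedelta

# Matches rows such as "18:704       0 days 0:00:17.843"
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d{2}):(\d{2})\.(\d{3})"
)

def parse_runaway(text):
    """Return (debugger_thread_no, os_thread_id, cpu_time) tuples, busiest first."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip the prompt, header lines, and blanks
        num, tid_hex, days, hours, mins, secs, millis = m.groups()
        cpu = timedelta(days=int(days), hours=int(hours), minutes=int(mins),
                        seconds=int(secs), milliseconds=int(millis))
        rows.append((int(num), int(tid_hex, 16), cpu))
    return sorted(rows, key=lambda r: r[2], reverse=True)

# A few rows copied from the output above.
sample = """\
User Mode Time
Thread       Time
26:a7c       0 days 0:00:01.375
18:704       0 days 0:00:17.843
19:9f4       0 days 0:00:13.328
16:1948      0 days 0:00:10.718
"""

for num, tid, cpu in parse_runaway(sample)[:3]:
    print(f"thread {num} (OS id {tid}): {cpu.total_seconds():.3f}s user-mode CPU")
```

The first number on each printed line is what you feed to the `~Ns` command to switch threads.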

  10. ~18s <-- goto thread 18.
  11. !clrstack <-- clr call stack .. which is just like debugging in windows.

.. and from here u can dump objects and stuff by giving the address references and stuff.

check out !help to list some commands to try and use .. i think !sos.help also works?

HTH .. if u still get stuck, ask away at what worked and what didn't.

Pure.Krome
+1  A: 

Have you had a look at Tess's blog?: it's a gold mine of info.

Mitch Wheat
A: 

Her blog is my sauce of all things awesome (boom tish) in the windbg world. i would not have known how to do any of this stuff without her tutorials and posts.

Tess - u're a saviour to us devs who are crazy (or unfortunate) enough to have to get nasty with crash dump files.

Pure.Krome
+1  A: 

I just had to deal with a similar problem. In my case, it turned out that WinDbg wasn't able to find the correct version of mscorwks.dll. In addition to the Framework version, the DLL also has a revision number, and that revision can differ between installs of the same Framework version.

In theory, the Microsoft symbol servers should be able to supply the necessary DLL, but it wasn't happening for me. To solve it, I used !sym noisy to get additional information on symbol loading. When I did !dumpstack, I got the error message:

SYMSRV: http://msdl.microsoft.com/download/symbols/mscorwks.dll/492B82C1590000/mscorwks.dll not found

To fix this, I created the appropriate folders in my local symbol cache, and copied mscorwks.dll from the machine the dump came from. After a .reload, WinDbg found the necessary DLL in the local symbol cache, and continued on happily.
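
For anyone unsure which folders to create: the local cache mirrors the last three segments of the SYMSRV URL (file name, index, file name). Here is a minimal sketch that builds the target path from the URL in the error message. The cache root `c:\temp\debug_symbols` is just an assumption for illustration (use whatever your symbol path points to), and `cache_path_for` and `copy_into_cache` are made-up helper names.

```python
import shutil
from pathlib import Path, PureWindowsPath
from urllib.parse import urlparse

def cache_path_for(symsrv_url, cache_root):
    """Map a SYMSRV URL to the matching path under a local symbol cache."""
    # The last three URL segments are <file>/<index>/<file>.
    file_name, index, leaf = urlparse(symsrv_url).path.split("/")[-3:]
    return PureWindowsPath(cache_root) / file_name / index / leaf

def copy_into_cache(source_dll, target):
    """Create the cache folders and copy the DLL in (run on the debugging box)."""
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source_dll, target)

url = ("http://msdl.microsoft.com/download/symbols/"
       "mscorwks.dll/492B82C1590000/mscorwks.dll")
# Assumed cache root; substitute your own.
target = cache_path_for(url, r"c:\temp\debug_symbols")
print(target)
```

Once the DLL from the dump's machine has been copied to that path, a .reload should let the debugger pick it up from the cache.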

Alternatively, you can find the exact version of mscorwks being used with lm v m mscorwks. You can then find the update that contains the version you need from this list. You will need to extract the necessary DLLs from the particular update to the right location.
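
To check that a candidate DLL really is the one the dump wants, it can help to decode the index from the SYMSRV URL. As far as I know, for PE files that index is the image's TimeDateStamp (8 hex digits) followed by its SizeOfImage in hex, which are the same values `lm v m mscorwks` reports as Timestamp and ImageSize. A small sketch under that assumption (`decode_pe_index` is a made-up helper name):

```python
from datetime import datetime, timezone

def decode_pe_index(index):
    """Split a symbol-server PE index into (timestamp, image_size, build_time)."""
    timestamp = int(index[:8], 16)    # first 8 hex digits: TimeDateStamp
    image_size = int(index[8:], 16)   # remainder: SizeOfImage
    built = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return timestamp, image_size, built

ts, size, built = decode_pe_index("492B82C1590000")
print(f"TimeDateStamp 0x{ts:08X} ({built:%Y-%m-%d %H:%M:%S} UTC), "
      f"SizeOfImage 0x{size:X}")
```

If the Timestamp and ImageSize shown by lm for your copy of mscorwks.dll don't match these two values, it is a different revision.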

Chris Ostler