We have an IIS-hosted web method that randomly dies on us about 10% of the time. While trying to debug this, we added Log.Debug() messages in front of every real line of code, and it appears to be dying on random lines.

Has anyone seen this or have an idea on how to debug this?

[Additional Details]

We've spent a lot of time looking at it and have discovered the following...

  1. We have a separate self-hosted WCF service that accesses the same database and lives on the same machine. When that service is under heavy load, the web method croaks every time. When it is not under load, things usually work fine (but not 100% of the time).

  2. High CPU doesn't seem to be part of the problem. We ran a small app that created a high CPU load and the web service did not die.

  3. The web service dies when we either new up an XmlSerializer (without doing the sgen precompilation) OR have NHibernate create a SessionFactory. The only things these two operations have in common are that they 1) seem like things people commonly do and 2) seem like they would be fairly intensive.

  4. We've added a Global.asax to try to capture Application_End and Application_Error, but neither event gets fired. To me this implies that we're not dealing with a normal application pool recycle?

A: 

Sounds like it might be a threading issue. You are already using informative debug messages; you should also try to reproduce the issue with the debugger attached, breaking on all exceptions. Make sure you check all the Windows event logs for information on why the app pool crashed.

Per comment: It's hard to say, but many things can cause a thread to appear to "just die." Memory issues: are you doing any interop? Improper marshaling: are you touching data on another thread? But I will play the probabilities and ask whether you're sure you're handling and logging any exception that might be happening. Are you sure you are not swallowing an exception somewhere down low without reporting it? Is this a permissions issue? Are you running in partial trust or under a low-privilege user account?

JP Alioto
Recreating the situation in our dev environment would be very difficult, but I think we're nearing the point where we'll have to look at it as an option. In terms of threading, that was my thought as well, but what would cause an IIS thread to just die? We don't see any messages in the Windows event log having to do with IIS, which seems odd. Shouldn't there at least be a startup message?
ShaneC
A: 

Figured it out. Two problems, really:

  1. We added Global.asax, but it didn't get copied over, which explains why we weren't seeing any messages. We fixed this and found out that...

  2. Our WCF log was being written to the bin directory of the IIS web service. In retrospect this is kind of silly, since the web service is an old-school web service. The WCF stuff is in the same directory for some reason unknown to us, since the person who originally set things up is gone.
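
For anyone chasing similar symptoms, here is a minimal sketch of Global.asax handlers that can surface this kind of message. ASP.NET watches an application's bin directory and restarts the application domain when files there change, so a log that is constantly rewritten inside bin can plausibly produce restarts like the ones described; the thread does not say which message was actually found. Assumptions: classic ASP.NET (System.Web), a code-behind class named Global, and System.Diagnostics.Trace standing in for whatever logger Log.Debug() belongs to.

```csharp
// Global.asax.cs -- an illustrative sketch, not the original poster's code.
using System;
using System.Diagnostics;
using System.Web;
using System.Web.Hosting;

public class Global : HttpApplication
{
    // Raised for unhandled exceptions thrown while processing a request.
    protected void Application_Error(object sender, EventArgs e)
    {
        Exception ex = Server.GetLastError();
        // Trace is a stand-in for the application's real logger.
        Trace.TraceError("Unhandled exception: {0}", ex);
    }

    // Raised when the application domain is shutting down.
    protected void Application_End(object sender, EventArgs e)
    {
        // ShutdownReason reports why ASP.NET is unloading the application,
        // e.g. ApplicationShutdownReason.BinDirChangeOrDirectoryRename.
        ApplicationShutdownReason reason = HostingEnvironment.ShutdownReason;
        Trace.TraceWarning("Application ending, reason: {0}", reason);
    }
}
```

If the logged reason is BinDirChangeOrDirectoryRename, something is modifying the bin directory. Trace output only goes somewhere useful if a listener is configured, and that listener's file should itself live outside bin. As item 1 above shows, the handlers only fire if Global.asax is actually deployed alongside the application.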

Lesson learned: somewhere there is a message that explains everything. You just have to find it.

ShaneC