Can someone give me step-by-step instructions, or point me to the right references in the right order, so I can determine the root cause of this issue?
You can take a memory dump of the process and inspect it with WinDbg. It will at least give you a list of exceptions and the current state of each thread. Doing so will recycle the process, though. It is also possible to attach to a QA-style machine in remote debug mode from Visual Studio. However, I have not done this myself, and it would hang every other request while you debug.
If w3wp is running locally, you can right-click the process in Task Manager and select Debug to look at it in WinDbg. Otherwise, you want something like DebugDiag on your production/test machine to create a full user dump. See: http://msdnrss.thecoderblogs.com/2008/05/21/debugdiag-11-or-windbg-which-one-should-i-use-and-how-do-i-gather-memory-dumps/
I did all this back in February and haven't needed to since. The full step-by-step is actually somewhat painful, because you have to get symbols for WinDbg, configure the environment variables that say where they should be stored, and so on.
For information on setting up WinDbg for ASP.NET inspection look at this article: http://support.microsoft.com/kb/892277
w3wp.exe can consume a large amount of memory for a variety of reasons.
- Large number of requests to process
- Large volumes of data throughput (for example media processing)
- Memory leak
- Any combination of the above
If you suspect that either of the first two is the problem, you will need to scale the system (adding extra servers, etc.).
A memory leak will cause a gradual increase of memory usage over time.
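One rough way to check for that gradual increase is to log the memory usage of w3wp.exe at regular intervals and fit a straight line through the readings. The sketch below is only an illustration in Python: the function name and the sample numbers are made up, and it assumes you have already collected (minutes, MB) readings yourself, for example from a performance counter log. A rising trend is a hint rather than proof of a leak.

```python
from statistics import mean


def memory_growth_rate(samples):
    """Return the least-squares slope (MB per minute) of (minute, mb) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [minute for minute, _ in samples]
    ys = [mb for _, mb in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    spread = sum((x - x_bar) ** 2 for x in xs)
    if spread == 0:
        raise ValueError("samples must cover more than one point in time")
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return covariance / spread


# Hypothetical readings: (minutes since start, memory in MB).
steady = [(0, 500), (10, 520), (20, 495), (30, 510), (40, 505)]
leaking = [(0, 500), (10, 540), (20, 585), (30, 620), (40, 660)]

print(memory_growth_rate(steady))   # fluctuates around a flat line
print(memory_growth_rate(leaking))  # climbs steadily between readings
```

Memory that fluctuates around a flat line points more towards load or data volume, while a slope that stays positive across many hours of comparable traffic is what you would expect from a leak.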
If you suspect a memory leak, the following actions can be considered:
- Code review (looking especially at static objects, and event registration/unregistration)
- Profiling: profiling for a memory leak can be difficult. Make sure you collect data over a period of time or a number of requests while profiling, look for long-lived objects, and check that their long lifetime is justified.
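To show why event registration is worth checking in a code review, here is a minimal sketch of the pattern. It is written in Python purely as an analogy for .NET event handlers, and `Publisher` and `Page` are hypothetical names: a long-lived publisher (think of a static object) keeps every subscriber reachable until the subscriber unregisters.

```python
import gc
import weakref


class Publisher:
    """Long-lived object, comparable to a static event source."""

    def __init__(self):
        self.handlers = []

    def subscribe(self, handler):
        self.handlers.append(handler)

    def unsubscribe(self, handler):
        self.handlers.remove(handler)


class Page:
    """Short-lived object that should disappear after each request."""

    def on_event(self):
        pass


publisher = Publisher()

page = Page()
publisher.subscribe(page.on_event)  # the bound method holds a reference to page
tracker = weakref.ref(page)         # lets us observe whether page is still alive

del page
gc.collect()
# The only remaining reference is inside publisher.handlers, so page survives.
leaked_while_subscribed = tracker() is not None

publisher.unsubscribe(tracker().on_event)
gc.collect()
# With the handler removed, nothing keeps page alive any more.
released_after_unsubscribe = tracker() is None

print(leaked_while_subscribed, released_after_unsubscribe)
```

In a profiler this shows up as short-lived objects that are still rooted through a long-lived publisher, which is the kind of unjustified long life the profiling step is meant to catch.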