We had an incident in which I called Microsoft Support, and they were able to use WinDBG to analyze one of my mini-dumps and identify the exact problem. I analyzed the same dump in WinDBG and could not get past the stack trace. I'm assuming I can't find the golden nugget because I'm ignorant, but Microsoft won't tell me what they did to unearth it themselves. Any chance I can get some help over the tough bits here?

The situation involved a .NET 1.1 call to a vendor-provided web service. For 1 hour per night for weeks we were unable to authenticate against the service, but the connection did not fail. During each outage, we hung dozens of threads until the service came back online.

If I run DebugDiag and generate a report, I can see that thread 49 is hung, and I can run !clrstack against that thread.

0:049> !clrstack
succeeded
Loaded Son of Strike data table version 5 from "C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorsvr.dll"
Thread 49
ESP EIP
1382ec64 7c82860c [FRAME: NDirectMethodFrameStandalone] [DEFAULT] I4 System.Net.UnsafeNclNativeMethods/OSSOCK.recv(I,I,I4,ValueClass System.Net.Sockets.SocketFlags)
1382ec78 10fb1fef [DEFAULT] [hasThis] I4 System.Net.Sockets.Socket.Receive(SZArray UI1,I4,I4,ValueClass System.Net.Sockets.SocketFlags)
1382ecb8 10fb1e65 [DEFAULT] [hasThis] I4 System.Net.Sockets.NetworkStream.Read(SZArray UI1,I4,I4)
1382ece4 10fb1dd1 [DEFAULT] [hasThis] I4 System.Net.TlsStream.ForceRead(SZArray UI1,I4,I4)
1382ed00 10fb1cc4 [DEFAULT] [hasThis] SZArray UI1 System.Net.TlsStream.ReadFullRecord(SZArray UI1,I4)
1382ed20 10a6f7df [DEFAULT] [hasThis] Class System.Exception System.Net.TlsStream.Handshake(Class System.Net.ProtocolToken)
1382ed44 10a6f59b [DEFAULT] [hasThis] Void System.Net.TlsStream..ctor(String,Class System.Net.Sockets.Socket,Boolean,Class System.Security.Cryptography.X509Certificates.X509CertificateCollection)
1382ed5c 10a6f4d0 [DEFAULT] [hasThis] ValueClass System.Net.WebExceptionStatus System.Net.Connection.ConstructTlsChannel(String,Class System.Net.HttpWebRequest,ByRef Class System.Net.Sockets.NetworkStream,Class System.Net.Sockets.Socket)
1382ed78 10a6f47b [DEFAULT] [hasThis] ValueClass System.Net.WebExceptionStatus System.Net.Connection.ConstructTransport(Class System.Net.Sockets.Socket,ByRef Class System.Net.Sockets.NetworkStream,Class System.Net.HttpWebRequest)
1382edac 10a693d7 [DEFAULT] [hasThis] Void System.Net.Connection.StartConnectionCallback(Object,Boolean)
1382f028 791b7f92 [FRAME: ContextTransitionFrame]

(!clrstack -p does not work for me. It returns exactly the same information as not asking for params. I assume this is because I don't have private symbols for the code. !do also does not work for me, though !dumpobj does. I loaded sos via ".loadby sos mscorsvr", not mscorwks, since I am running on a server. Could my sos load be wrong in some way?)

Anyway, Microsoft was kind enough to tell me parts of the things they found. They told me the stack trace they pulled, and I have pulled the same one. (That's cool.) From the stack trace, though, they extracted the following information. How?

- So the above thread is waiting on a socket. The socket details are mentioned below
SOCKADDR @ 01285dc0
sin_family = 2 (IP)
sin_port = 443
sin_addr = 206.16.40.219

And then they told me the name of the hung object so I could dump it, and I can.

0:049> !dumpobj 0x09278dbc
Name: System.String
MethodTable 0x79b946b0
EEClass 0x79b949fc
Size 140(0x8c) bytes
mdToken: 0200000f (c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll)
String: https://www.vendorname.com/services/v2006/Authentication

How did they get from that stack trace to identifying those objects without private symbols? As an admin, I can't just compile this code in debug mode nor would I deploy the debug code into production except as a last resort. Microsoft had exactly the same information I have, and they found the answer, so I assume it can be found if I can just get over the ignorance hump.

(Per one answer, I add that my WinDBG Symbol Search Path says: SRV*D:\Tools\Debuggers\Symbols*http://msdl.microsoft.com/download/symbols)

Thank you.

+1  A: 

They used Symbol Server to get symbols.

Kirill V. Lyadvinsky
Thank you for the thought. Would this still be a problem if I have the Symbol Search Path configured per the article you link? I configured that some time ago.
codepoke
+1  A: 

They probably have local copies of all of the symbol files.

You can download them here, put them on your local system, and then load them in your debugger by typing:

.symfix c:\YourLocalSymbols

.reload

AaronS
I have WinDBG connected to Microsoft's symbol server, so I don't expect this to make much difference. WinDBG has pulled in 81 MB of symbols, so I expect I've got everything Microsoft can give me. Please correct me if I'm wrong.
codepoke
You should technically have the same results then. If you didn't see the symbols loaded, then you should do the above. Also, regarding loading SOS, this is what I do: .load C:\WINDOWS\Microsoft.NET\Framework64\v2.0.50727\sos.dll to grab the exact dll I want.
AaronS
+2  A: 

My guess is they dumped the socket object to look at its internal fields. You could use !dso to dump the addresses of all stack objects, or !dumpheap -type System.Net.Sockets.Socket to get all Socket objects in memory.

Knowing the internals of the objects helps a lot here. The .NET source code, or a decompilation produced by .NET Reflector, would help you understand the internals of the socket object.

Dumping the socket object would give you the memory addresses of the fields m_RemoteEndPoint and m_RightEndPoint. One of those probably gave them the IP address, port, and family.
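If you end up looking at the raw native SOCKADDR rather than the managed endpoint fields, the family, port, and address can be read straight out of the first eight bytes. Here is a minimal Python sketch of that decoding, assuming the standard IPv4 sockaddr_in layout. The byte values are hypothetical: they are rebuilt from the numbers quoted in the question, not read from the actual dump, and `decode_sockaddr_in` is just an illustrative helper name.

```python
import socket
import struct


def decode_sockaddr_in(raw: bytes):
    """Decode the first 8 bytes of an IPv4 sockaddr_in structure."""
    # sin_family is stored in host byte order (little-endian on x86).
    (family,) = struct.unpack_from("<H", raw, 0)
    # sin_port is stored in network byte order (big-endian).
    (port,) = struct.unpack_from(">H", raw, 2)
    # sin_addr is four bytes, also in network byte order.
    address = socket.inet_ntoa(raw[4:8])
    return family, port, address


# Hypothetical bytes (assumption): what a byte dump of the structure
# would look like for the values quoted in the question.
raw = bytes([0x02, 0x00, 0x01, 0xBB, 0xCE, 0x10, 0x28, 0xDB])
family, port, address = decode_sockaddr_in(raw)
print(family, port, address)
```

The detail that usually trips people up is the port: it is big-endian in memory, so 443 shows up as the bytes 01 BB rather than BB 01.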

Paul Williams
Brilliant. If you don't mind, I'd like to wait a little while before marking this as the answer but it worked. Dumping the sockets didn't help me much, but I dumped the HttpWebRequest objects, then chased down through URI and absolute URI to find the offending link. If only I could drill down from the trace to the HttpWebRequest directly somehow. Still the "-type" command was beautiful. Thank you.
codepoke
Bingo! Here's the trail:
~49s [switch to the hung thread]
!dumpstackobjects [Gives you the HttpWebRequest obj]
!dumpobj 09278e84 [Contains the URI]
!dumpobj 09278cb0 [Contains the AbsoluteURI]
!dumpobj 09278dbc [Holds the golden nugget! Straight path!]
Golly, this stuff is fun! And I can replicate it again with a dissimilar problem! Thank you, Paul.
codepoke