We have a C++ client/server application in which the client retrieves and renders 3D content from a server.

Our client disconnects from the server very often (in more than 50% of runs, within the first minute): recv fails and WSAGetLastError returns WSAECONNABORTED. The strange thing is that this happens only when both of the following are true:

  • the client has an NVIDIA Quadro GPU (reproduced on multiple machines with various versions of Quadro)
  • our application uses OpenGL for rendering

We haven't seen this on machines with any other GPU, and switching to Direct3D for rendering also makes the disconnects go away. On one Quadro machine the problem first appeared (less often, but still more than once a day) after a driver update was pushed through Windows Update. Reverting the driver fixed it.

The server also fails in recv, with WSAECONNRESET. According to this answer we are doing things wrong on the server side, where we do indeed call shutdown and then immediately close. But since the server gets WSAECONNRESET, I would say that the client forcibly disconnects first, so the problem is not initiated by the server doing a bad close.
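For reference, the close sequence that is usually recommended instead of shutdown followed by an immediate close is roughly: shut down the send side, keep reading until recv returns 0, then close. Below is a minimal sketch of that idea, not our actual server code. The names graceful_close, socket_t, kShutdownSend and close_socket are made up for illustration, and on Windows it assumes WSAStartup has already been called and ws2_32 is linked.

```cpp
// Sketch only: graceful close of a connected, blocking stream socket.
// All helper names here are hypothetical, not taken from our code base.
#ifdef _WIN32
  #include <winsock2.h>
  using socket_t = SOCKET;
  static const int kShutdownSend = SD_SEND;
  static int close_socket(socket_t s) { return closesocket(s); }
#else
  #include <sys/socket.h>
  #include <unistd.h>
  using socket_t = int;
  static const int kShutdownSend = SHUT_WR;
  static int close_socket(socket_t s) { return close(s); }
#endif

// Returns true if the peer closed its side cleanly (recv returned 0),
// false if shutdown or recv reported an error.
// The socket is released in both cases.
bool graceful_close(socket_t s) {
    bool clean = false;
    // Tell the peer we are done sending; we can still receive.
    if (shutdown(s, kShutdownSend) == 0) {
        char buf[4096];
        for (;;) {
            // Drain and discard whatever the peer still sends.
            int n = static_cast<int>(recv(s, buf, static_cast<int>(sizeof buf), 0));
            if (n > 0) continue;   // more data, keep draining
            clean = (n == 0);      // 0 means the peer closed in an orderly way
            break;                 // n < 0 means an error such as a reset
        }
    }
    close_socket(s);
    return clean;
}
```

Note that this sketch blocks until the peer closes or an error occurs, so real code would probably want a receive timeout around the drain loop.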

We'll probably fix the server's shutdown + close problem in the long term, but we would like a short-term workaround or an explanation for the behavior on Quadros. Maybe it's a driver bug that somehow interacts with Winsock, even though that sounds very wacky.

Am I right that WSAECONNRESET on the server means that the client disconnected first?

Any ideas on how to troubleshoot weird bugs that a GPU driver might cause in sockets? I don't expect a specific solution to our weird problem, of course, but any idea would help.
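In case it helps with suggestions, here is a minimal sketch of the kind of logging helper we could add on both sides to record which of the two error codes each failed recv reports. The function name describe_recv_failure is hypothetical. The numeric values are the ones defined in winsock2.h, and the wording paraphrases Microsoft's description of the two codes; the fallback constants are only there so that the sketch also compiles outside Windows.

```cpp
// Sketch only: turn the two Winsock error codes from this question into
// log-friendly text. describe_recv_failure is a hypothetical helper name.
#include <string>

#ifdef _WIN32
  #include <winsock2.h>
#else
  // Same numeric values as in winsock2.h, for non-Windows builds.
  constexpr int WSAECONNABORTED = 10053;
  constexpr int WSAECONNRESET   = 10054;
#endif

std::string describe_recv_failure(int wsa_error) {
    switch (wsa_error) {
        case WSAECONNABORTED:
            // Documented as an abort by software on the local host.
            return "WSAECONNABORTED (10053): connection aborted by software on the local host";
        case WSAECONNRESET:
            // Documented as a forcible close by the remote host.
            return "WSAECONNRESET (10054): connection reset by the remote peer";
        default:
            return "unclassified Winsock error " + std::to_string(wsa_error);
    }
}
```

On Windows the argument would be the value returned by WSAGetLastError right after the failing recv, logged together with a timestamp so that client and server logs can be lined up.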