Is there any solution for the 10038 server error? I wrote the server in C++; it runs fine for 10 to 12 hours, but then suddenly fails with socket error 10038.

+4  A: 

Without seeing your code: the symptom you describe sounds like you are leaking memory/resources, i.e. you are forgetting to free/delete objects you are allocating. It could also be a timing issue. I suggest you post your (cut-down) code.

10038 (WSAENOTSOCK): Socket operation on nonsocket. An operation was attempted on something that is not a socket. Either the socket handle parameter did not reference a valid socket, or for select, a member of an fd_set was not valid.

Mitch Wheat
A: 

There may be two reasons for this:

  1. Your socket descriptor is uninitialized (i.e. doesn't reference a valid socket).
  2. You closed this socket (by a call to closesocket), and still try to use it.

Such an error is always a bug; it is not related to actual network activity or state. It is roughly equivalent to using a resource or memory after you have freed it, or to dereferencing an uninitialized pointer.

So to get rid of 10038 you must fix your code.

P.S. If you have a multi-threaded application, it's likely that you close the socket in one thread while another thread is still trying to use it.

In any case, it is good practice to initialize socket descriptors to INVALID_SOCKET at the beginning, and to set them back to INVALID_SOCKET immediately after you close them.

Then, before using a socket, you can check whether it is valid. That way you may find the problematic scenario.
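
A minimal sketch of that practice, not code from the question. SocketHandle, socket_t and close_native are hypothetical names introduced here; only INVALID_SOCKET and closesocket come from WinSock. The non-Windows branch is an assumption added so the sketch also compiles against POSIX sockets, where the descriptor is a plain int and -1 plays the role of INVALID_SOCKET.

```cpp
#include <cassert>
#include <utility>

#ifdef _WIN32
  #include <winsock2.h>
  using socket_t = SOCKET;
  inline int close_native(socket_t s) { return closesocket(s); }
#else
  #include <unistd.h>
  using socket_t = int;
  // Assumption: mirror the WinSock sentinel name on POSIX for this sketch.
  constexpr socket_t INVALID_SOCKET = -1;
  inline int close_native(socket_t s) { return ::close(s); }
#endif

// Hypothetical wrapper: owns one descriptor and keeps it at INVALID_SOCKET
// whenever it does not refer to an open socket.
class SocketHandle {
public:
    SocketHandle() = default;                       // starts out invalid
    explicit SocketHandle(socket_t s) : s_(s) {}    // takes ownership of s

    SocketHandle(const SocketHandle&) = delete;     // one owner per descriptor
    SocketHandle& operator=(const SocketHandle&) = delete;

    SocketHandle(SocketHandle&& other) noexcept : s_(other.release()) {}
    SocketHandle& operator=(SocketHandle&& other) noexcept {
        if (this != &other) {
            close();
            s_ = other.release();
        }
        return *this;
    }

    ~SocketHandle() { close(); }

    bool valid() const { return s_ != INVALID_SOCKET; }

    // Debug builds trap any use of a closed or never-opened socket here,
    // instead of letting the call fail later with 10038.
    socket_t get() const {
        assert(valid() && "socket used before open or after close");
        return s_;
    }

    // Gives up ownership without closing; the wrapper becomes invalid.
    socket_t release() {
        socket_t s = s_;
        s_ = INVALID_SOCKET;
        return s;
    }

    // Closes at most once, then resets to the sentinel immediately.
    void close() {
        if (valid()) {
            close_native(s_);
            s_ = INVALID_SOCKET;
        }
    }

private:
    socket_t s_ = INVALID_SOCKET;
};
```

Note that this wrapper is not thread-safe by itself. If one thread may call close() while another is using get(), you still need a mutex or a clear ownership rule around the handle.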

valdo
A: 

I bet you are accessing a socket that you have already closed. This is a very common timing bug in WinSock programming. The good news (and the bad news, because it makes the bug hard to reproduce) is that you are not hitting it very often, so your code likely does not need much work to fix. I suggest adding thread-safe diagnostics that output a string including the socket value (an integer handle, basically) on every open and close, and from anywhere you see this 10038 or other unexpected errors.
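
Something along these lines would do for the diagnostics. This is only a sketch: format_socket_event and log_socket_event are hypothetical helper names, and the output format is an assumption. The socket is passed as std::uintptr_t so the same helper accepts a WinSock SOCKET or a POSIX descriptor cast to that type.

```cpp
#include <cstdint>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: builds one diagnostic line containing the calling
// thread's id, the event name, the socket value and an optional error code.
inline std::string format_socket_event(const char* event,
                                       std::uintptr_t sock,
                                       int error_code = 0) {
    std::ostringstream line;
    line << "[thread " << std::this_thread::get_id() << "] "
         << event << " socket=" << sock;
    if (error_code != 0) {
        line << " error=" << error_code;
    }
    return line.str();
}

// Hypothetical helper: writes the line under a mutex so that output from
// different threads is never interleaved within a line.
inline void log_socket_event(const char* event,
                             std::uintptr_t sock,
                             int error_code = 0,
                             std::ostream& out = std::cerr) {
    static std::mutex log_mutex;
    const std::string line = format_socket_event(event, sock, error_code);
    std::lock_guard<std::mutex> lock(log_mutex);
    out << line << '\n';
    out.flush();  // keep the log usable even if the process dies soon after
}
```

Call it right after every successful socket() or accept(), right before every closesocket(), and wherever a call fails, passing the value of WSAGetLastError() as the error code. In the log, a 10038 line whose socket value already appears in an earlier "close" line points at a use-after-close.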

If you can add those diagnostics and then set up a stress test that focuses on the open and close areas of your program, the bug will likely manifest much more quickly than in 10-12 hours, and you may find and fix other timing windows along the way. You may need to strip the code down to a small subset for unit testing of the socket handling, perhaps running it back-to-back on localhost or between two LAN-connected machines. The goal is to compress 10-12 hours of 'normal' socket activity into as short a time as possible, to really expose any hard-to-detect concurrency problems.
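
As a rough sketch of such a stress driver (run_stress is a hypothetical name, and the thread and iteration counts are yours to choose): it runs one open/use/close cycle, supplied by you as a callable, on several threads at once and counts how many cycles reported failure.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical harness: runs `cycle` `iterations` times on each of `threads`
// worker threads and returns how many calls returned false.
// `cycle` stands for one open/use/close pass through your own socket code.
inline std::size_t run_stress(std::size_t threads,
                              std::size_t iterations,
                              const std::function<bool()>& cycle) {
    std::atomic<std::size_t> failures{0};
    std::vector<std::thread> workers;
    workers.reserve(threads);

    for (std::size_t t = 0; t < threads; ++t) {
        workers.emplace_back([&failures, &cycle, iterations] {
            for (std::size_t i = 0; i < iterations; ++i) {
                if (!cycle()) {
                    failures.fetch_add(1, std::memory_order_relaxed);
                }
            }
        });
    }

    // Wait for every worker before reading the final count.
    for (auto& worker : workers) {
        worker.join();
    }
    return failures.load();
}
```

The callable should exercise the same connect, send, receive and close paths as the real server and return false on any unexpected error, 10038 included. Raising the thread count narrows the timing windows, which is the whole point of the exercise.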

Steve Townsend