I have a problem with a piece of legacy C++/Winsock code that is part of a multi-threaded socket server. The application creates a thread that handles connections from clients, of which there are typically a couple of hundred connected at any one time. It typically runs continuously for several days without a problem, and then suddenly stops accepting connections. This only happens in production, never in test.
It uses WSAEventSelect() to detect FD_ACCEPT network events. The (simplified) code for the connection handler is:
SOCKET listener;
HANDLE hStopEvent;
// ... initialise listener and hStopEvent, and other stuff ...
HANDLE hAcceptEvent = WSACreateEvent();
WSAEventSelect(listener, hAcceptEvent, FD_ACCEPT);
HANDLE rghEvents[] = { hStopEvent, hAcceptEvent };
bool bExit = false;
while (!bExit)
{
    DWORD nEvent = WaitForMultipleObjects(2, rghEvents, FALSE, INFINITE);
    switch (nEvent)
    {
    case WAIT_OBJECT_0:         // hStopEvent signalled
        bExit = true;
        break;
    case WAIT_OBJECT_0 + 1:     // hAcceptEvent signalled
        HandleConnect();
        WSAResetEvent(hAcceptEvent);
        break;
    case WAIT_ABANDONED_0:
    case WAIT_ABANDONED_0 + 1:
    case WAIT_FAILED:
        LogError();
        break;
    }
}
From detailed logging I know that, when the problem occurs, the thread enters WaitForMultipleObjects() and never emerges, even though there are clients attempting to connect and waiting for an accept. The WAIT_FAILED and WAIT_ABANDONED_x conditions never occur.
While I haven't ruled out a config problem on the server, or even some kind of resource leak (I can't find anything), I am also wondering if the event created by WSACreateEvent() is somehow being 'disassociated' from the FD_ACCEPT network event, causing it to never fire.
So, am I doing something wrong here? Is there something I should be doing that I'm not? Or a better way? I'd appreciate any suggestions! Thanks.
EDIT
The socket is a non-blocking socket.
EDIT
Problem solved by using the approach suggested by kipkennedy (below). Changed hAcceptEvent to be an auto-reset event, and removed the call to WSAResetEvent(), which was no longer needed.
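A likely explanation for why this works, as far as I can tell: WSACreateEvent() returns a manual-reset event, and FD_ACCEPT signalling is only re-enabled by a call to accept(). If a client connects after HandleConnect() has called accept() but before WSAResetEvent() runs, the reset wipes out the signal for that client and nothing sets the event again. The sketch below is a toy, single-threaded model of that sequence in portable C++, not real Winsock code. ModelEvent, ModelListener and runScenario are names made up for illustration, and it assumes HandleConnect() accepts exactly one connection per wakeup.

```cpp
#include <cassert>

// Toy stand-in for a Win32 event object (hypothetical, not the real API).
struct ModelEvent {
    bool autoReset;
    bool signaled;
    explicit ModelEvent(bool isAutoReset) : autoReset(isAutoReset), signaled(false) {}
    void set()   { signaled = true; }
    void reset() { signaled = false; }
    // Returns false where a real wait would block forever.
    // A satisfied wait clears an auto-reset event, as on Windows.
    bool wait() {
        if (!signaled) return false;
        if (autoReset) signaled = false;
        return true;
    }
};

// Toy listener modelling FD_ACCEPT: once the event has been signalled,
// it is not signalled again until accept() re-enables it.
struct ModelListener {
    ModelEvent& ev;
    int backlog;
    bool armed;
    explicit ModelListener(ModelEvent& e) : ev(e), backlog(0), armed(true) {}
    void clientConnects() {
        ++backlog;
        if (armed) { ev.set(); armed = false; }
    }
    bool accept() {
        armed = true;                       // any accept() call re-enables FD_ACCEPT
        if (backlog == 0) return false;     // would be WSAEWOULDBLOCK
        --backlog;
        if (backlog > 0) { ev.set(); armed = false; }  // still pending: signal again
        return true;
    }
};

// Replays the suspected race and returns how many clients are left
// waiting in the backlog when the server thread would block forever.
int runScenario(bool autoReset) {
    ModelEvent ev(autoReset);
    ModelListener listener(ev);
    listener.clientConnects();              // client A connects
    bool raced = false;
    while (ev.wait()) {                     // WaitForMultipleObjects returns
        listener.accept();                  // HandleConnect(): one accept()
        if (!raced) {                       // client B connects right after accept()
            listener.clientConnects();
            raced = true;
        }
        if (!autoReset) ev.reset();         // WSAResetEvent() in the original loop
    }
    return listener.backlog;
}
```

In this model, runScenario(false) (the original manual-reset loop) ends with one client stranded in the backlog, while runScenario(true) (auto-reset, no explicit reset) ends with none. In the real code the auto-reset event would presumably come from CreateEvent(NULL, FALSE, FALSE, NULL) instead of WSACreateEvent().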