I have a TCP listener service to which clients connect. Lately I have started receiving disconnection errors. When I connect around 20 clients, the connections work fine. But when I connect another 10 clients to the service, the previous connections break with a 10053 or 10054 error.

Previously it ran with 100 clients, so I am not sure what the problem could be. Both the service and the clients are running on Windows Server 2003, because I found that Windows XP has a known problem with multiple TCP connections (related to 10053).

A: 

Well, the errors you are receiving are very different.

10053 is a WSAECONNABORTED - The connection was aborted. This is usually due to a problem in your application stack (though it just happens sometimes).

10054 is a WSAECONNRESET - The connection was reset by peer. This is usually more an issue on the other side of the connection.
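
To keep the two codes apart when reading logs, a small lookup can help. This is only an illustrative sketch in Python, not the poster's service code; the helper name `describe_winsock_error` and the hint wording are assumptions made up for this example, while the numeric values are the standard Winsock ones.

```python
# Standard Winsock error codes (values from the Windows Sockets headers).
WSAECONNABORTED = 10053  # established connection aborted by software on this host
WSAECONNRESET = 10054    # existing connection forcibly closed by the remote host

# Hypothetical lookup table; the hints paraphrase the answer above.
_DESCRIPTIONS = {
    WSAECONNABORTED: ("WSAECONNABORTED", "aborted locally, look at your own host and application"),
    WSAECONNRESET: ("WSAECONNRESET", "reset by peer, look at the other side of the connection"),
}


def describe_winsock_error(code: int) -> str:
    """Return a short, human-readable label for a Winsock error code."""
    name, hint = _DESCRIPTIONS.get(code, ("unknown", "not covered by this sketch"))
    return f"{code} ({name}): {hint}"


if __name__ == "__main__":
    for code in (10053, 10054):
        print(describe_winsock_error(code))
```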

How are you testing this? Are the "clients" connecting to this service something you wrote? If so, you should track what's happening on the client side when you get a 10054.

Also, this could potentially be due to network issues, unrelated to your software (directly). Has there been a change in the network infrastructure on which you are running?

Reed Copsey
I am testing with a custom client that creates a number of connections (threads) to the service, and also with GNSS Surfer, i.e. an NTRIP client for testing NTRIP servers. 10054 seems reasonable, as it might occur when the client closes the connection, but the real problem is 10053. Previously I found that 10053 is related to Windows XP, but now the problem can also be seen on Windows Server 2003.
A9S6
10053 is not specific to Windows XP - it's usually related to networking issues. Are you running your clients locally (from localhost), or across your network to this system? Is it running in a local network, on a single machine, or across the internet?
Reed Copsey
This link says the problem is with XP: http://support.microsoft.com/kb/938566
A9S6
Also, the connections are made locally and not from a different network.
A9S6
The error code can happen anywhere. There was a bug in XP (fixed by that hotfix) that caused it to happen unnecessarily.
Reed Copsey
10053 can happen if your service is blocking too much, too. If it can't respond to the network with a valid TCP received message because it's too blocked, you might get this problem. Are you threading your service?
Reed Copsey
Yes, we are using multi-threading in the service. For each connection a new thread is created. From inside this primary thread, data is sent from server to client, and an async client.BeginReceive is called, through which data sent from client to server is received. How can I find out whether the service is blocking too much? Many of my methods are "locked" to prevent multiple threads from writing simultaneously.
A9S6
Difficult to know without seeing your code, but you could add some form of timing to see how long it takes to respond to clients. I'd focus on your locks - make sure you aren't locking more than necessary, and in particular, make sure you aren't doing anything that could cause a deadlock. A good profiler could help here, potentially.
Reed Copsey
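
One way to add the kind of timing suggested in the last comment is to wrap each lock so that it records how long a thread waited for it and how long it then held it. The following is a rough sketch in Python rather than .NET, purely to show the idea; `timed_lock` and the `stats` list are hypothetical names invented for this example, not part of any library.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def timed_lock(lock, name, stats):
    """Acquire `lock`, recording (name, seconds waited, seconds held) in `stats`."""
    start = time.perf_counter()
    lock.acquire()
    acquired = time.perf_counter()
    try:
        yield
    finally:
        lock.release()
        released = time.perf_counter()
        # list.append is atomic in CPython, so several threads can share `stats`.
        stats.append((name, acquired - start, released - acquired))


if __name__ == "__main__":
    stats = []
    send_lock = threading.Lock()

    def worker():
        with timed_lock(send_lock, "send", stats):
            time.sleep(0.05)  # stands in for work done while holding the lock

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    for name, waited, held in stats:
        print(f"{name}: waited {waited:.3f}s, held {held:.3f}s")
```

Long wait times point at contention on that lock; long hold times point at work (such as socket I/O) that might be moved outside the locked region.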
A: 

I doubt it is a network problem, or you probably would see it happening when the first 20 clients connect. Just a shot in the dark, but how are you handling these connections? Are you using some sort of Array or Collection? Could you inadvertently be setting existing connections to new connections, causing the system to freak out?
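
To make the collection question concrete, here is a minimal sketch (in Python, not the poster's .NET code) of a registry that gives every accepted connection its own key, so a new connection can never overwrite an existing entry. `ConnectionRegistry` and its methods are hypothetical names for this illustration; the `conn` objects stand in for whatever socket or client object the service actually stores.

```python
import itertools
import threading


class ConnectionRegistry:
    """Thread-safe map from a unique id to a connection object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = itertools.count(1)
        self._conns = {}

    def add(self, conn):
        """Store `conn` under a fresh id and return that id."""
        with self._lock:
            conn_id = next(self._next_id)
            self._conns[conn_id] = conn
            return conn_id

    def remove(self, conn_id):
        """Drop and return the connection for `conn_id`, or None if absent."""
        with self._lock:
            return self._conns.pop(conn_id, None)

    def __len__(self):
        with self._lock:
            return len(self._conns)


if __name__ == "__main__":
    registry = ConnectionRegistry()
    first = registry.add("client-a")
    second = registry.add("client-b")
    print(first != second, len(registry))
```

If the real service instead stores connections in a fixed-size array or reuses an index or key per client, that is the place to check whether an accept for client 21 can replace the slot of client 1.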