What's the best practice for scalable servers that need to maintain a list of active users?

  • Should I open a persistent TCP connection for each client, over which the server sends update messages? This could lead to many open connections, and probably no traffic on them for many seconds at a time. Is that a problem for TCP?
  • Or would it be better to let the client poll for updates periodically (with a new TCP connection each time)?

How do chat servers or large online games handle this?

+3  A: 

Personally I'd go for a single persistent TCP connection per client. That avoids a) the extra work of creating and destroying connections, along with the latency of the additional TCP setup and teardown packets, and b) piling up lots of sockets in TIME_WAIT on either the clients or the server. There's simply no good reason to keep creating and destroying the connections.
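
To make the idea concrete, here is a minimal sketch of a server that treats "has an open connection" as "is active". It uses Python's asyncio purely for illustration; the `PresenceServer` class and the one-line "client sends its name first" protocol are my own assumptions, not something from the question.

```python
import asyncio


class PresenceServer:
    """Tracks active users by holding one persistent TCP connection per client."""

    def __init__(self):
        self.active = {}  # user name -> StreamWriter for that user's connection

    async def handle(self, reader, writer):
        # Assumed protocol: the first line the client sends is its user name.
        name = (await reader.readline()).decode().strip()
        if not name:
            writer.close()  # client went away before identifying itself
            return
        self.active[name] = writer
        try:
            # An idle connection just waits here; readline() returns b""
            # once the client disconnects.
            while await reader.readline():
                pass
        finally:
            # Only remove the entry if it still belongs to this connection.
            if self.active.get(name) is writer:
                del self.active[name]
            writer.close()

    async def start(self, host="127.0.0.1", port=0):
        # port=0 asks the OS for any free port (handy for experiments).
        return await asyncio.start_server(self.handle, host, port)
```

The list of active users is then simply the keys of `active`, and the stored writers are what the server would use to push updates later.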

Depending on your platform there may be various tricks for dealing with the platform-specific problems that you might hit when you have lots of connections open, and by lots I mean tens of thousands. For example, on Windows, overlapped I/O and I/O completion ports would be a good design for lots of connections. If your connections are idle most of the time, you might find that the 'zero byte read' trick (posting a zero-length read so that no buffer is tied up until data actually arrives) lets you handle more connections on more modest hardware. That's something you can add later, once you know you have a problem caused by the amount of buffer space tied up in pending reads that only complete infrequently.

I wouldn't have the clients polling the server. It's inefficient. Have the server publish data to the clients as and when data is available. That lets the server control its workload somewhat, because it decides how often to send to the clients: it could send every time new data becomes available for a client, or batch up some data and send after waiting a short while, and so on. If the server is pushing the data then the server (the weak point, the place that might get overwhelmed by client demand) has more control over the work that it needs to do.
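
As a rough sketch of the "batch up some data and wait a short while" option: the publisher below collects updates and hands them to a send function as one batch after a short delay. `BatchingPublisher`, the `send` callable and the 50 ms default are all hypothetical names and values chosen for illustration, again in Python's asyncio.

```python
import asyncio


class BatchingPublisher:
    """Collects updates and pushes them as one batch after a short delay."""

    def __init__(self, send, delay=0.05):
        self._send = send      # assumed: a callable that takes a list of updates
        self._delay = delay    # how long to wait for more updates, in seconds
        self._pending = []
        self._task = None

    def publish(self, update):
        # Must be called from code running inside the event loop.
        self._pending.append(update)
        if self._task is None:
            # The first update of a batch starts the flush timer.
            loop = asyncio.get_running_loop()
            self._task = loop.create_task(self._flush_later())

    async def _flush_later(self):
        await asyncio.sleep(self._delay)
        batch, self._pending = self._pending, []
        self._task = None
        self._send(batch)
```

Setting the delay to zero approximates "send every time new data becomes available", so the same structure covers both ends of the trade-off, and the server alone decides where on that range it sits.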

If you have each client polling then a) you're generating more network noise, as each client sends a message to ask the server whether it has anything to send, and b) you're generating more work for the server, as it needs to respond to every poll. The server knows when there's data for a client, so let it be responsible for telling the clients.
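
A back-of-the-envelope comparison shows the difference. The numbers and function names below are made up for illustration, and the model is deliberately crude: every poll costs one request plus one response, and every update is pushed to every client.

```python
def polling_messages_per_second(clients, poll_interval_s):
    # Each poll is a request plus a response, whether or not anything changed.
    return 2 * clients / poll_interval_s


def push_messages_per_second(clients, seconds_between_updates):
    # One message per client per update; nothing is sent while nothing changes.
    return clients / seconds_between_updates


# Illustrative figures: 10,000 clients, polling every 5 seconds,
# versus one broadcast update every 10 seconds.
print(polling_messages_per_second(10_000, 5))   # 4000.0
print(push_messages_per_second(10_000, 10))     # 1000.0
```

The polling figure stays the same even when nothing is happening, whereas the push figure drops towards zero as updates become rarer.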

Len Holgate
Agreed. IRC servers maintain a persistent TCP connection to each client, and those have been known to handle well over 100,000 simultaneous clients.
caf