views: 218

answers: 5

It's hard for me to go into exact detail on what the server needs to do (due to NDAs and whatnot), but it should be sufficient to say that it needs to handle a lightweight binary protocol with many concurrently connected users - ~20,000 is our best estimate.

Note that clients won't be sending/receiving constantly, but I need to keep the sockets open because when a client needs a response we need it as fast as possible and don't have time for the overhead of opening a new connection every time.

The protocol is very lightweight, so there won't be a lot of data going over the wire - the main problem is keeping ~20,000 sockets open at the same time.

(I'm aware that the specifications are a little fuzzy, but I really can't go into more detail)

I have a pretty decent idea of what I need to do and what type of hardware we need for the server(s), but I figured I'd ask here for existing projects, technologies, languages (Erlang, for example), etc. that could assist me in building this.

How can this be achieved?

+1  A: 

Erlang, with its lightweight threads and excellent binary handling, would be a great fit. As far as the hardware goes, I can't see that you will need an extremely expensive server if the protocol is very lightweight, but that would depend on whatever other processing needs to be done after the packets have been received.

Edit

If you need to do data lookups by index or something, Mnesia is also great: it supports both in-memory and disk-based storage, and it's fully distributed if you end up needing to move to more servers.

Some real-world info on Erlang's connection-handling capabilities: http://www.sics.se/~joe/apachevsyaws.html

monkey_p
Yes, my first thought was to go with Erlang, but I've only dabbled in it briefly (basically read/worked through the Pragmatic Programmers book, but haven't done anything else with it). I already know a handful of programming languages pretty well and do a lot of work in F#, so the leap to writing a real server in Erlang might not be that far?
thr
Erlang and F# are very similar. I'm not too familiar with the binary handling in F#, but Erlang's is quick to pick up and probably the best binary handling of any language I have used (as far as my opinion goes :) )
monkey_p
+3  A: 

If you don't have to go through a firewall, consider using a protocol based on UDP. NFS is a good example of a UDP-based protocol. UDP doesn't have the setup overhead of TCP and can scale to more than 65k concurrent connections. However, if you need guaranteed delivery you will have to build this functionality into the application.
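Building guaranteed delivery on top of UDP usually comes down to sequence numbers, acknowledgements, and a retransmission timer. A minimal sketch of the framing side in Python (the frame layout and names here are illustrative, not part of the question's protocol):

```python
import struct

# Illustrative frame layout: 4-byte big-endian sequence number + payload.
HEADER = struct.Struct(">I")

def encode_frame(seq, payload):
    """Prefix the payload with its sequence number."""
    return HEADER.pack(seq) + payload

def decode_frame(frame):
    """Split a frame back into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(frame)
    return seq, frame[HEADER.size:]

def encode_ack(seq):
    """An ack is just the sequence number with an empty payload."""
    return HEADER.pack(seq)

class SendWindow:
    """Tracks unacked frames so a retransmission timer can resend them."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = {}  # seq -> frame bytes still awaiting an ack

    def send(self, payload):
        """Frame a payload and remember it until it is acknowledged."""
        frame = encode_frame(self.next_seq, payload)
        self.unacked[self.next_seq] = frame
        self.next_seq += 1
        return frame

    def on_ack(self, seq):
        """Drop an acknowledged frame from the retransmit set."""
        self.unacked.pop(seq, None)

    def pending(self):
        """Frames to resend when the retransmission timer fires."""
        return list(self.unacked.values())
```

The actual sockets, timers, and ack policy (cumulative vs. per-frame) would sit around this; the point is just that the bookkeeping is small enough to live in the protocol design itself.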

For performance with large user bases, consider using a server architecture based on non-blocking I/O.
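The usual shape of that architecture is one thread (or a few) multiplexing all sockets through the OS readiness API (epoll/kqueue/IOCP under the hood). A minimal sketch using Python's `selectors` module - the echo logic is just a stand-in for the real protocol:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server):
    """Accept a new client and register it for read events."""
    conn, _addr = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    """Echo back whatever the client sent; close on EOF."""
    data = conn.recv(4096)
    if data:
        conn.sendall(data)  # echo stands in for the real protocol reply
    else:
        sel.unregister(conn)  # peer closed the connection
        conn.close()

def run(host="127.0.0.1", port=0):
    """Start a non-blocking listener; one thread can serve every socket."""
    server = socket.create_server((host, port))
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, accept)
    return server

def poll():
    """One iteration of the event loop: dispatch every ready socket."""
    for key, _mask in sel.select(timeout=1.0):
        key.data(key.fileobj)  # calls accept() or handle()
```

A production server would buffer partial writes instead of calling `sendall` on a non-blocking socket, but the structure - register, select, dispatch - is the whole idea.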

Another item that might be worth looking at is Douglas Schmidt's Adaptive Communications Environment (ACE). It's a mature C++ framework for building high performance servers, mainly aimed at telecommunications applications. It supports a variety of threading models and handles most of the tricky stuff for you. You might find that the time spent up front learning how to drive it would be saved down the track in reduced debugging effort on messy synchronisation issues.

ConcernedOfTunbridgeWells
That non-blocking I/O is the way to go I had already realized; I hadn't even thought about UDP though, thanks for that idea! And yes, I need guaranteed delivery, but I suppose that is "rather" easy to build into the app itself. Very good idea about UDP, thanks a bunch!
thr
on an intranet with low traffic, UDP almost never drops packets. Packet loss only occurs when there's too much TCP/IP traffic going on; routers and the OS then choose to drop UDP packets first
Toad
Building the retransmission logic on top of UDP is pretty simple and can be handled in the design of your protocol. You're still going to need to handle N concurrent operations (where N is the number of clients sending/receiving at the same time), so you'll need a good lightweight threading library, e.g. pthreads (which would be unmanaged, though).
badbod99
+1  A: 

Maintaining 20,000 connected sockets is not a problem. You can do it using C on Windows (Server) rather easily as long as you use I/O completion ports and/or the threadpool APIs.

The real problem, I guess, is generating the data for those 20,000 connections. That might require some exotic solutions - Erlang or whatever. But the socket side of things, while non-trivial, is well within the bounds of traditional service design.
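The underlying point - that thousands of mostly idle sockets are cheap on any readiness-based API (IOCP on Windows, epoll on Linux) - is easy to demonstrate at a smaller scale. A sketch using Python's asyncio, holding a few hundred connections open simultaneously (counts and the echo handler are illustrative only):

```python
import asyncio

async def handle(reader, writer):
    """Echo one message back; a stand-in for the real protocol."""
    data = await reader.read(64)
    writer.write(data)
    await writer.drain()
    writer.close()

async def demo(n=200):
    """Hold n client connections open concurrently against one server."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # All n connections are open at the same time before any traffic flows;
    # the idle ones cost almost nothing while they wait.
    conns = [await asyncio.open_connection("127.0.0.1", port)
             for _ in range(n)]

    async def ping(reader, writer):
        writer.write(b"ping")
        await writer.drain()
        reply = await reader.read(64)
        writer.close()
        return reply

    replies = await asyncio.gather(*(ping(r, w) for r, w in conns))
    server.close()
    await server.wait_closed()
    return replies
```

Scaling the same pattern to 20,000 sockets is mostly a matter of OS limits (file-descriptor caps, socket buffer memory), not application architecture.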

Chris Becke
The data is *very very very* slim and will pretty much already be calculated; the clients basically need to pull different parts of it (indexes into a huge array). Not a lot of computation going on at all. The protocol is *extremely* lightweight, including the data. The issue is keeping 20,000 clients connected at the same time.
thr
+2  A: 

Take a look at the CCR from the robotics guys at Microsoft. It enables you to do Erlang-style programming (message passing, queues, etc.), but using C# rather than a totally new functional language.

Furthermore, it lets you use the asynchronous programming model, where you don't need dozens of threads in thread pools to do your work. It's much faster and gives really elegant code.

I'm using it myself for an SMS server which needs to spit out SMSes at ridiculous speeds, and it does so without stressing the CPU at all.
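The CCR's ports-and-receivers model is essentially asynchronous message passing over queues. A rough analogy of the pattern in Python's asyncio - not the CCR API itself, just the shape of it (the worker's "work" here is a placeholder):

```python
import asyncio

async def worker(inbox, outbox):
    """Receive messages from a port-like queue and post replies to another."""
    while True:
        msg = await inbox.get()
        if msg is None:  # sentinel: shut the worker down
            break
        outbox.put_nowait(msg.upper())  # stand-in for real processing

async def demo():
    """Post two messages through a worker and collect its replies."""
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    task = asyncio.create_task(worker(inbox, outbox))
    for text in ("sms one", "sms two"):
        inbox.put_nowait(text)
    inbox.put_nowait(None)  # tell the worker to stop
    await task
    return [outbox.get_nowait() for _ in range(outbox.qsize())]
```

The appeal in both systems is the same: one logical worker per queue instead of one OS thread per connection, so concurrency scales with messages rather than threads.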

Toad
Cool! Thanks for that tip, we do mostly .NET programming in C# and F# here so if we could stick with a platform that everyone already knows it would save a great deal of headache for a lot of us!
thr
best thing is that MS realized the CCR (and DSS) part is really good, so they cut it out of Robotics Studio. It is now a separate and free download
Toad
ah... let me rephrase that a bit: it is free for non-commercial use, but you do need to buy a license if you want to redistribute your application with it
Toad
OK, well, we don't need to redistribute the CCR with the application, but the use isn't non-commercial either. Will have to look into it; thanks a lot anyway! :)
thr
A: 

You don't need to support 20K concurrent users on a single server. Load balance between three or four, and have them connect to the back end database if you're doing any database work; perhaps throw in memcache for good measure, depending on what app you're building.
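Spreading the clients over a few front ends can be as simple as a stable hash from client id to server, so each client reconnects to the same box. A sketch (server names and the hashing choice are illustrative; a real deployment might use a hardware balancer or consistent hashing instead):

```python
import hashlib

def pick_server(client_id, servers):
    """Deterministically map a client id to one of the front-end servers."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

With four servers, ~20,000 clients lands at roughly 5,000 long-lived sockets per box, which is comfortable territory for any of the event-driven designs discussed above.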

Blekk
Why not buy 20000 rigs that only deal with one connection each? Silly answer. -1
spender
I don't see it as a silly answer. It's a legitimate question: the OP has asked for a solution that supports 20k concurrent connections, but does it really need to be a single server?
hbunny