...most efficient when checking for incoming data (asynchronously). Let's say I have 500 connections. I have 3 scenarios (that I can think of):
Using select() to check FD_SETSIZE sockets at a time, then iterating over all of them to receive the data. (Wouldn't this require two calls to recv for each socket returned? MSG_PEEK to allocate a buffer, then recv() it again, which would be the same as #3.)
I trust you're carefully constructing your fd set with only the descriptors that are currently connected...? You then iterate over the set and only issue recv() for those that have read or exception/error conditions (how the latter are reported differs between BSD and Windows implementations). While peeking first is OK functionally (and arguably elegant conceptually), in most real-world applications you don't need to peek before recv-ing: even if you're unsure of the message size and know you could peek it from a buffer, you should consider whether you can:
- process the message in chunks (e.g. read whatever's a good unit of work - maybe 8k, process it, then read the next <=8k into the same buffer...)
- read into a buffer that's big enough for most/all messages, and only dynamically allocate more if you find the message is incomplete
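As a rough illustration of the select() approach without MSG_PEEK, here is a minimal sketch assuming POSIX sockets (not Winsock). The names poll_once, chunk_handler, count_bytes and CHUNK_SIZE are made up for this example; the idea is simply one select() call over the connected descriptors, then a single recv() of at most one chunk per ready socket into a fixed buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 8192  /* assumed "good unit of work"; tune for your protocol */

/* Hypothetical callback: called with each chunk as it arrives. */
typedef void (*chunk_handler)(int fd, const char *data, size_t len, void *ctx);

/* Example handler: just adds the chunk length to a size_t counter in ctx. */
void count_bytes(int fd, const char *data, size_t len, void *ctx)
{
    (void)fd;
    (void)data;
    *(size_t *)ctx += len;
}

/*
 * Wait once for any descriptor in fds[0..nfds-1] to become readable, then
 * read at most one chunk from each ready one. No MSG_PEEK is needed.
 * Entries < 0 are skipped; a closed or failed connection is closed here and
 * its entry set to -1. Returns the number of descriptors that delivered
 * data, or -1 if select() itself fails.
 */
int poll_once(int *fds, int nfds, struct timeval *timeout,
              chunk_handler on_chunk, void *ctx)
{
    char buf[CHUNK_SIZE];
    fd_set readable;
    int maxfd = -1, serviced = 0;

    assert(fds != NULL && on_chunk != NULL);

    /* Build the set from currently connected descriptors only. */
    FD_ZERO(&readable);
    for (int i = 0; i < nfds; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            continue;
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (maxfd < 0)
        return 0;

    if (select(maxfd + 1, &readable, NULL, NULL, timeout) < 0)
        return -1;

    for (int i = 0; i < nfds; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE || !FD_ISSET(fds[i], &readable))
            continue;
        ssize_t n = recv(fds[i], buf, sizeof buf, 0);
        if (n > 0) {
            on_chunk(fds[i], buf, (size_t)n, ctx);
            serviced++;
        } else if (n == 0 || (errno != EINTR && errno != EAGAIN &&
                              errno != EWOULDBLOCK)) {
            close(fds[i]);      /* peer closed, or a real error */
            fds[i] = -1;
        }
    }
    return serviced;
}
```

A caller would keep its connected descriptors in an array and call poll_once(fds, nfds, NULL, handler, ctx) in a loop; a socket with more than CHUNK_SIZE bytes pending simply shows up as readable again on the next call.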
Using select() to check one socket at a time. (Wouldn't this also be like #3? It requires the two calls to recv.)
Not good at all. If you stay single-threaded, you'd need to put a zero timeout on select() and spin like crazy through the listening and client descriptors. That's very wasteful of CPU time, and will vastly degrade latency.
Use recv() with MSG_PEEK one socket at a time, allocate a buffer then call recv() again. Wouldn't this be better because we can skip all the calls to select()? Or is the overhead of one recv() call too much?
(Ignoring that it's better to try to avoid MSG_PEEK) - how would you know which socket to MSG_PEEK or recv() on? Again, if you're single-threaded, then either you'd block on the first peek/recv attempt, or you'd use non-blocking mode and spin like crazy through all the descriptors hoping a peek/recv will return something. Wasteful.
So, stick with #1 or move to a multithreaded model. For the latter, the simplest approach to begin with is to have the listening thread loop calling accept(), and each time accept() yields a new client descriptor, spawn a new thread to handle that connection. These client-connection handling threads can simply block in recv(). That way, the operating system itself does the monitoring and wake-up of threads in response to events, and you can trust that it will be reasonably efficient. While this model sounds easy, you should be aware that multi-threaded programming has lots of other complications - if you're not already familiar with it, you may not want to try to learn that at the same time as socket I/O.
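To make the thread-per-connection idea concrete, here is a minimal sketch assuming POSIX threads and sockets. The function names (handle_client, spawn_client_thread, accept_loop) are invented for this example, and echoing the data back is only a stand-in for whatever processing your application really does.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Per-connection thread: blocks in recv() until data arrives or the peer
 * closes. Echoing is a placeholder for real message handling.
 */
static void *handle_client(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[8192];
    ssize_t n;

    while ((n = recv(fd, buf, sizeof buf, 0)) > 0) {
        ssize_t sent = 0;
        while (sent < n) {               /* send() may write only part */
            ssize_t w = send(fd, buf + sent, (size_t)(n - sent), 0);
            if (w <= 0)
                goto done;
            sent += w;
        }
    }
done:
    close(fd);                           /* the thread owns the descriptor */
    return NULL;
}

/* Start a detached thread for one client. Returns 0 or a pthread error code. */
int spawn_client_thread(int client_fd)
{
    pthread_t tid;
    int err;

    assert(client_fd >= 0);
    err = pthread_create(&tid, NULL, handle_client,
                         (void *)(intptr_t)client_fd);
    if (err == 0)
        pthread_detach(tid);             /* nobody will join it */
    return err;
}

/* Listening thread: accept forever, one new thread per connection. */
void accept_loop(int listen_fd)
{
    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd < 0) {
            if (errno == EINTR)
                continue;
            break;                       /* give up on other errors */
        }
        if (spawn_client_thread(client_fd) != 0)
            close(client_fd);            /* could not start a thread */
    }
}
```

You would call accept_loop() on a socket that has already been through bind() and listen(). With 500 connections this means 500 mostly idle threads, which is usually workable but is one of the trade-offs to weigh against the select() approach.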