I've got an application, written in C++, that uses boost::asio. It listens for requests on a socket, and for each request does some CPU-bound work (e.g. no disk or network i/o), and then responds with a response.

This application will run on a multi-core system, so I plan to have (at least) 1 thread per core, to process requests in parallel.

What's the best approach here? Things to think about:

  • I'll need a fixed size thread pool (e.g. 1 thread per CPU)
  • If more requests arrive than I have threads, they'll need to be queued (maybe in the OS socket layer?)

Currently the server is single threaded:

  • It waits for a client request
  • Once it receives a request, it performs the work, writes the response back, and then waits for the next request

Update:

More specifically: what mechanism should I use to ensure that incoming requests get queued up while the server is busy? And what mechanism should I use to distribute incoming requests among the N threads (1 per core)?

+1  A: 

ACE http://www.cs.wustl.edu/~schmidt/ACE/book1/

It has everything you need: thread management, queues, and, as an added bonus, a portable way of writing socket servers.

anio
Thanks for the suggestion, but I'm not going to switch from Boost to ACE at the moment.
Alex Black
@Alex - not sure if mamin was suggesting using ACE or just reading the book. The book is heavily geared toward ACE but covers some important patterns that can be implemented independently of the library.
Duck
@Duck: Ah, thx, I'll take a look at the book.
Alex Black
+1  A: 

I don't see that there is much to consider that you haven't already covered.

If it is truly CPU-bound, then adding threads beyond the number of cores won't buy you much, unless you are going to have a lot of requests. In that case the listen queue may or may not meet your needs, and it might be better to have some threads accept the connections and queue them up yourself. Check out the listen backlog value for your system and experiment a bit with the number of threads.

UPDATE:

listen() has a second parameter that is the requested OS/TCP queue depth. You can set it up to the OS limit; beyond that you need to play with the system knobs. On my current system it is 128, so it is not huge but not trivial either. Check your system and consider whether you realistically need something larger than the default.

Beyond that there are several directions you can go. Consider KISS: no complexity before it is actually needed. Start off with something simple, like a single thread that accepts connections (up to some limit) and puts them on a queue. Worker threads pick them up, process the request, write the result, and close the socket.

At the current pace of my distro's Boost updates (and my lack of will to compile it myself) it will be 2012 before I play with ASIO - so I can't help with that.

Newton Falls
It is CPU bound. I'll update my question with more specifics.
Alex Black
It's reasonably easy to build on both Windows and Linux. I am new to Linux, and even I can do it :)
Alex Black
@Alex - you inspired me. The process was not w/o incident but not nearly as bad as the last time around.
Newton Falls
Glad you tried it out Newton, hope it worked out well.
Alex Black
+1  A: 

If you are using the basic_socket_acceptor's overloaded constructor to bind and listen on a given endpoint, it uses SOMAXCONN as the backlog of pending connections in the call to listen(). I think (though I'm not sure) that this maps to 250 on Windows. So the network service provider will silently accept client connections up to this limit and queue them for your application to process; your next accept call will pop a connection from this queue.

Modicom
Thanks. I'm on Linux, is the default max similar there?
Alex Black
In current kernels "cat /proc/sys/net/core/somaxconn" to get the current value. "echo 2000 > /proc/sys/net/core/somaxconn" to increase.
Newton Falls
Note those values are for the system so changing them will have effects beyond your program.
Newton Falls
You can also separate the I/O and the request processing if your request processing is complex. Have one boost::asio::io_service and associated thread pool for overlapped socket I/O; the size of this pool can be the number of available processor cores + 1. Have another boost::asio::io_service and associated thread pool for request processing. When a read of a request over an accepted socket completes, post the request to the request-processing io_service. When the processing completes, post the response to the I/O io_service, which will then send the response over the socket.
Modicom