Hi,

I am troubled by the following concept: most books/docs describe how robust servers are multithreaded, and that the most common approach is to start a new thread to serve each new client, i.e. a thread is dedicated to each new connection. But how is this actually implemented in big systems? If a server accepts requests from 100,000 clients, has it started 100,000 threads? Is that realistic? Aren't there limits on how many threads can run on a server? And doesn't the overhead of context switching and synchronization degrade performance? Is it implemented as a mix of queues and threads? If so, is the number of queues fixed? Can anybody enlighten me on this, and perhaps point me to a good reference that describes it?

Thanks!

+3  A: 

The common method is to use thread pools. A thread pool is a collection of already created threads. When a new request reaches the server, it is assigned a spare thread from the pool. When the request has been handled, the thread is returned to the pool.

The number of threads in the pool is configured according to the characteristics of the application. For example, if the application is CPU bound, you do not want too many threads, since the extra context switches will decrease performance. On the other hand, if the application is DB or IO bound, threads spend much of their time waiting, so more threads will utilize the CPU better.

Google "thread pools" and you will for sure find much to read about the concept.

Kristoffer E
@Kristoffer: Looking at thread pools, my understanding is that the number of threads in the pool should always be relative to the number of available processors. If so, the number of threads in the pool is always low relative to the number of requests. If my understanding is correct, then in my example, on a machine with, say, 48 cores, only 48 threads should be pooled and the remaining requests should be queued? Am I mixing up the concepts?
You should always take the number of available processors into account when deciding how many threads you need. However, you also need to account for the kind of work your threads do. Let's take two examples. 1) Your threads only do computations that require the CPU, no IO or other waits. Then it makes sense to have the same number of threads as CPUs, since that avoids wasting CPU cycles on idling while minimizing context switches. 2) Your threads spend 50% of their time waiting for external systems. Then it makes sense to have 2x <num CPUs> threads to utilize the CPUs fully. (See the sizing sketch after this exchange.)
Kristoffer E
@Kristoffer: Ok, this is what I understood about thread pools. But with a high number of requests (e.g. 10000), if the number of threads in the pool is correlated with the number of available processors, doesn't that mean that a very high percentage of the requests will be queued at all times?
Yes, then they will be queued (or maybe even denied, if such a policy is set up). However, if you expect 10000 simultaneous requests and have, say, 50 threads to handle them, you probably need a better system if you need to handle the requests in a timely manner.
Kristoffer E
@Kristoffer: So how is a "better system", as you say, defined? Is it defined in terms of the hardware used? Because in software, besides using threads to achieve concurrency, what more can one do?
Yes, better hardware is exactly what I mean.
Kristoffer E
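
Putting the sizing rule from the exchange above into code: a rough Java sketch, where the wait and compute times are assumed measurements, not figures from this thread:

    public class PoolSizing {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();

            // Assumed measurements: time per request spent blocked on IO/DB
            // versus time spent computing on the CPU.
            double waitTime = 50;    // ms
            double computeTime = 50; // ms

            // CPU-bound work (waitTime = 0) gives one thread per core;
            // 50% waiting (wait == compute) gives 2 x cores, as in example 2 above.
            int poolSize = (int) (cores * (1 + waitTime / computeTime));
            System.out.println("suggested pool size: " + poolSize);
        }
    }
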
+1  A: 

In most systems a thread pool is used. This is a pool of available threads that wait for incoming requests. The number of threads can grow to a configured maximum, depending on the number of simultaneous requests and the characteristics of the application.

If a request arrives, an unoccupied thread is requested from the thread pool. This thread is then dedicated to handling the request until it finishes. When that happens, the thread is returned to the pool to handle another request.

Since there is only a limited number of threads, in most server systems you should attempt to keep the lifetime of requests as short as possible. The less time a request needs to execute, the sooner its thread can be reused for a new request.

If requests come in while all threads are occupied, most servers implement a queueing mechanism. Of course the size of the queue is also limited, so when more requests arrive than can be queued, new requests are denied.

One other reason for having a thread pool instead of starting a thread for each request is that starting a new thread is an expensive operation. It's better to have a number of threads started beforehand and to reuse them than to start new threads all the time.
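
In Java, for example, all three points above (a bounded pool, a bounded queue, denial when both are full) map directly onto java.util.concurrent.ThreadPoolExecutor; the numbers below are purely illustrative:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BoundedPool {
        public static void main(String[] args) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    10,                                 // core threads, kept alive
                    50,                                 // hard maximum pool size
                    60, TimeUnit.SECONDS,               // idle time before extra threads die
                    new ArrayBlockingQueue<>(1000),     // bounded queue for waiting requests
                    new ThreadPoolExecutor.AbortPolicy()); // deny when pool and queue are full

            // With this setup: 10 threads run immediately, the next 1000 requests
            // queue, the pool then grows toward 50 threads, and anything beyond
            // that is rejected with a RejectedExecutionException.
            pool.execute(() -> System.out.println("handled by " + Thread.currentThread().getName()));
            pool.shutdown();
        }
    }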

Ronald Wildenberg
Looking at thread pools, my understanding is that the number of threads in the pool should always be relative to the number of available processors. If so, the number of threads in the pool is always low relative to the number of requests. If my understanding is correct, then in my example, on a machine with, say, 48 cores, only 48 threads should be pooled and the remaining requests should be queued? Am I mixing up the concepts?
One core can support multiple threads. Usually a thread is not doing anything because it is waiting for something (IO, another thread, ...). If a core only ran one thread, it would have nothing to do most of the time. For performance, you want to maximize the amount of time a core is doing work, so each core supports multiple threads.
Ronald Wildenberg
@Ronald Wildenberg: "One core can support multiple threads" this is what troubles me. If we have only 1 core and multiple threads, then these multiple threads would content for this 1 core. Wouldn't this alone degrade performance? If my understanding from thread pool is correct, then the thread pool should have 1 (core) +1 threads in the pool (i.e. 2 threads) and the rest queued?
But CPUs are very good at that. One machine usually runs multiple processes (even if you only run the OS) that use multiple threads, and you don't even notice. Of course contention can be a problem if you have a lot of CPU-intensive tasks, but that is often not the case, since much of the time threads are waiting for something.
Ronald Wildenberg
+2  A: 

Also read up on the SEDA (Staged Event-Driven Architecture) pattern: link , link
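
In short, SEDA decomposes a server into stages connected by event queues, each stage driven by its own small thread pool. A minimal two-stage sketch in Java; the stage names and queue sizes are hypothetical:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class SedaSketch {
        public static void main(String[] args) throws InterruptedException {
            // Each stage has its own bounded event queue.
            BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>(1000);
            BlockingQueue<String> respondQueue = new LinkedBlockingQueue<>(1000);

            // Stage 1: parse raw requests and pass them downstream.
            Thread parser = new Thread(() -> {
                try {
                    while (true) {
                        respondQueue.put("parsed:" + parseQueue.take());
                    }
                } catch (InterruptedException e) { /* stage shut down */ }
            });

            // Stage 2: turn parsed requests into responses.
            Thread responder = new Thread(() -> {
                try {
                    while (true) {
                        System.out.println("response for " + respondQueue.take());
                    }
                } catch (InterruptedException e) { /* stage shut down */ }
            });

            parser.start();
            responder.start();
            parseQueue.put("GET /"); // a request enters the first stage
        }
    }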

oluies
+3  A: 

In addition to the answers above, I should note that really high-performance servers with many incoming connections try not to spawn a thread per connection, but instead use IO Completion Ports, select(), and other asynchronous techniques to work with multiple sockets in one thread. And of course, special attention must be paid to ensure that a problem with one request or one socket does not block the other sockets handled by the same thread.

Also, thread management consumes CPU time, which is another reason not to spawn a thread for each connection or each client request.
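
As one concrete illustration of the several-sockets-per-thread approach, here is a minimal Java NIO sketch built around a Selector (Java's analogue of select()); the port and buffer size are assumptions:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class SelectServer {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(8080));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buffer = ByteBuffer.allocate(4096);
            while (true) {
                selector.select(); // block until some socket is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        // New connection: register it for reads, still on this one thread.
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) == -1) {
                            client.close(); // a dead client must not stall the others
                        }
                    }
                }
            }
        }
    }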

Eugene Mayevski 'EldoS Corp'
+1  A: 

To get network servers to handle lots of concurrent connections, there are several approaches, mostly divided into "one thread per connection" and "several connections per thread" categories. Take a look at the C10K page, which is a great resource on this topic; it discusses and compares many approaches and links to further resources on them.

Luca Longinotti
A: 

Creating 10k threads is unlikely to be efficient in most cases, but it can be done and would work.

If you needed to serve 10k clients at once, doing so on a single machine would be unusual, but it is possible.

Depending on the client-side implementation, it may be that the 10,000 clients do not need to maintain open TCP connections - depending on the purpose, the protocol design can greatly improve the efficiency of the implementation.

I think the appropriate solution for high-scale systems is probably extremely domain-specific; if you want a suggestion, you would have to explain more about your problem domain.

MarkR