I am aware that event-driven I/O mechanisms like select, poll, and epoll allow someone to build, say, a highly scalable web server, but I am confused by the details. If there is only one thread of execution and one process running for the server, then when the server runs its "processing" routine for the ready clients, isn't the list of ready clients processed serially, since the work can't be scheduled across multiple cores or CPUs? Moreover, while this processing is happening, wouldn't the server be unresponsive?

I used to think this was the reason people used thread pools to handle the event I/O on the backend, but I was confused when I heard recently that not everybody uses thread pools for their applications.

A: 

The key thing to remember here is that only one thread can execute on the CPU at one time, but I/O doesn't always need the CPU. When a thread blocks for I/O, the CPU is released so that another thread can execute. Also, even on a single CPU box, multiple threads can often do I/O at the same time (depending on the disk system in use).
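As a minimal sketch of that point (assuming POSIX threads and a listening socket set up elsewhere), the classic thread-per-client model looks something like this; a thread blocked in recv() costs no CPU, so other threads keep running:

    /* Thread-per-client sketch: recv() blocks, releasing the CPU so the
     * kernel can schedule other threads (and other I/O) in the meantime. */
    #include <pthread.h>
    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void *handle_client(void *arg)
    {
        int fd = (int)(intptr_t)arg;
        char buf[4096];
        ssize_t n;
        while ((n = recv(fd, buf, sizeof buf, 0)) > 0)  /* blocks here */
            send(fd, buf, (size_t)n, 0);                /* echo it back */
        close(fd);
        return NULL;
    }

    void accept_loop(int listen_fd)  /* listen_fd assumed already listening */
    {
        for (;;) {
            int fd = accept(listen_fd, NULL, NULL);
            if (fd < 0)
                continue;
            pthread_t t;
            pthread_create(&t, NULL, handle_client, (void *)(intptr_t)fd);
            pthread_detach(t);
        }
    }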

Eric Petroelje
A: 

When an event "triggers", a signal is raised that interrupts the current execution and runs the signal handler's code.

Quite often this signal-handling code will spawn a new thread or process and then return (you'll sometimes see implementations using process forks instead of threads).

Bottom line is that without multiple threads you can have the illusion of parallel execution, but it is really just the main code stopping and starting, with the signal handlers dealt with in between.

Visual Basic, for instance, has things like DoEvents that allow other event handlers to perform their actions. This is commonly used as a form of pre-emption before major work (or on each iteration of a loop) to let the GUI update (or, in your web server case, to let handling of a client request start) in between any other work.

Another method that may help is asynchronous I/O, which raises a signal when the transfer is done (or has processed a given amount), all in a single thread of execution. Although you'll have to hope that the asynchronous I/O libraries you are using (or the underlying operating system) support multi-core processing in order to get the benefit of multiple cores in this kind of scenario.
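As an illustrative sketch of that idea: POSIX AIO can be asked to deliver a signal when a transfer completes. The file name and signal choice here are arbitrary, and on Linux this links with -lrt:

    /* POSIX AIO sketch: aio_read() returns immediately; SIGUSR1 is
     * delivered once the transfer has completed. */
    #include <aio.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static char buf[4096];

    static void on_io_done(int sig, siginfo_t *si, void *ctx)
    {
        struct aiocb *cb = si->si_value.sival_ptr;
        if (aio_error(cb) == 0) {
            ssize_t n = aio_return(cb);  /* bytes actually transferred */
            (void)n;                     /* keep handler work minimal */
        }
        (void)sig; (void)ctx;
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = on_io_done;
        sigaction(SIGUSR1, &sa, NULL);

        int fd = open("data.bin", O_RDONLY);  /* hypothetical input file */
        struct aiocb cb;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf = buf;
        cb.aio_nbytes = sizeof buf;
        cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
        cb.aio_sigevent.sigev_signo = SIGUSR1;
        cb.aio_sigevent.sigev_value.sival_ptr = &cb;

        aio_read(&cb);  /* returns at once; the thread is free to do other work */
        pause();        /* here we just wait for the completion signal */
        close(fd);
        return 0;
    }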

Metalshark
+2  A: 

Some of that wisdom predates the general availability of multi-core systems. In a pure multitasking environment it's still true, but except for your portable electronics, most of the machines you touch are multiprocessing these days. And even that may not hold for long.

In a pure multi-tasking system, all the OS does is hop from one job to another as they become runnable (unblocked). Event driven and non-blocking IO just do the same thing in userspace.

For certain tasks, it can still aid multiprocessing: by reducing thread yields and mutually exclusive code, it lets more processors run the application for more of their clock cycles.

For instance, you don't want an IDE constantly scanning the filesystem for external changes. If you've been around a while, you've probably run into that before, and it's irritating and unproductive: it wastes resources and causes global data models to become locked and unresponsive during updates. Setting an I/O event listener (a 'watch' on the directory) frees the application to do other things, like helping you write code.
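On Linux, for example, such a watch can be set with inotify; a rough sketch (the directory path is a placeholder):

    /* Watch a directory with inotify instead of polling it. The read()
     * below blocks (or could be fed into select/epoll), so the application
     * spends no cycles scanning the filesystem. */
    #include <stdio.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    int main(void)
    {
        int in_fd = inotify_init();
        inotify_add_watch(in_fd, "/path/to/project",  /* hypothetical path */
                          IN_CREATE | IN_MODIFY | IN_DELETE);

        char buf[4096];
        for (;;) {
            ssize_t len = read(in_fd, buf, sizeof buf);  /* blocks until an event */
            if (len <= 0)
                break;
            for (char *p = buf; p < buf + len; ) {
                struct inotify_event *ev = (struct inotify_event *)p;
                printf("change: %s\n", ev->len ? ev->name : "(watched dir)");
                p += sizeof *ev + ev->len;
            }
        }
        close(in_fd);
        return 0;
    }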

Jason
I'd say "Event driven and non-blocking IO just do the same thing in userspace, typically with much less overhead, so much more efficiently."
ninjalj
The adjective varies with the operating system, in my experience. On Linux, for instance, I trust the threading and disk IO to not stab me in the neck. Windows, not so much. Especially once you throw some idiotic virus scanner into the loop.
Jason
Well, in my experience, on Linux, event-driven beats thread-per-client, but it is probably not noticeable in most scenarios. I was thinking of protothreads and similar things when I added "typically".
ninjalj
+2  A: 

The idea is that the processing thread doesn't have to wait for an entire client conversation to complete before it can service another. For many server applications, most of the server's time is spent waiting on IO. Even though there is only a single thread handling all the requests, the latency added is small, because the server was spending most of its time waiting on IO anyway, and in this arrangement waiting on IO doesn't prevent the server from responding to another request. This arrangement doesn't really help when the server has to do large amounts of CPU-limited processing.

A more scalable setup would combine both async IO and multiple threads, ideally having one worker thread per available execution unit and not spending any time sleeping on IO unless there is no work to do.
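A rough sketch of the glue in that hybrid setup, assuming an epoll (or similar) loop elsewhere hands ready client sockets to a pool of workers through a shared list (LIFO here, for brevity):

    #include <pthread.h>
    #include <stdlib.h>

    struct job { int client_fd; struct job *next; };

    static struct job *head;
    static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    /* Called by the I/O thread when the multiplexer reports a ready client. */
    void submit(int client_fd)
    {
        struct job *j = malloc(sizeof *j);
        j->client_fd = client_fd;
        pthread_mutex_lock(&mu);
        j->next = head;
        head = j;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&mu);
    }

    /* Each worker sleeps unless there is work to do. */
    void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&mu);
            while (head == NULL)
                pthread_cond_wait(&nonempty, &mu);
            struct job *j = head;
            head = j->next;
            pthread_mutex_unlock(&mu);
            /* ... parse request, do the CPU-bound work, queue the response ... */
            free(j);
        }
    }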

Chris Smith
+3  A: 

Hmmm. You (the original poster) and the other answers are, I think, coming at this backwards.

You seem to grasp the event-driven part, but are getting hung up on what happens after an event fires.

The key thing to understand is that a web server generally spends very little time "processing" a request, and a whole lot of time waiting for disk and network I/O.

When a request comes in, the server generally needs to do one of two things: either load a file and send it to the client, or pass the request to something else (classically a CGI script; these days FastCGI is more common, for obvious reasons).

In either case, the server's job is computationally minimal; it's just a middle-man between the client and the disk or "something else".

That's why these servers use what is called non-blocking I/O.

The exact mechanisms vary from one operating system to another, but the key point is that a read or write request always returns instantly (or near enough). When you try to write to a socket, for example, the system either immediately accepts what it can into a buffer, or returns something like an EWOULDBLOCK error letting you know it can't take more data right now.

Once the write has been "accepted", the program can make a note of the state of the connection (e.g. "5000 of 10000 bytes sent" or something) and move on to the next connection that is ready for action, coming back to the first once the system is ready to take more data.
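A sketch of what that bookkeeping might look like in C (the conn struct is hypothetical, and the socket is assumed to have been put into non-blocking mode with fcntl/O_NONBLOCK):

    #include <errno.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    struct conn {
        int fd;
        const char *buf;   /* response being sent */
        size_t len, sent;  /* e.g. "5000 of 10000 bytes sent" */
    };

    /* Returns 1 when the response is fully sent, 0 to try again later,
     * -1 on a real error. Never blocks. */
    int try_send(struct conn *c)
    {
        while (c->sent < c->len) {
            ssize_t n = send(c->fd, c->buf + c->sent, c->len - c->sent, 0);
            if (n > 0) {
                c->sent += (size_t)n;  /* record progress and keep going */
            } else if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
                return 0;  /* kernel buffer full: move on to another connection */
            } else {
                return -1; /* real error or peer closed */
            }
        }
        return 1;
    }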

This is unlike a normal blocking socket where a big write request could block for quite a while as the OS tries to send data over the network to the client.

In a sense, this isn't really different from what you might do with threaded I/O, but it has much reduced overhead in the form of memory, context switching, and general "housekeeping", and takes maximum advantage of what operating systems do best (or are supposed to, anyway): handle I/O quickly.

As for multi-processor/multi-core systems, the same principles apply. This style of server is still very efficient on each individual CPU. You just need a server that forks multiple instances of itself to take advantage of the additional processors.
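A minimal sketch of that pattern (event_loop is a hypothetical single-threaded loop like the one described above; the children inherit and share the listening socket):

    #include <sys/wait.h>
    #include <unistd.h>

    extern void event_loop(int listen_fd);  /* hypothetical per-process loop */

    void prefork(int listen_fd, int nworkers)
    {
        for (int i = 0; i < nworkers; i++) {
            if (fork() == 0) {          /* child: run its own event loop */
                event_loop(listen_fd);
                _exit(0);
            }
        }
        while (wait(NULL) > 0)          /* parent just reaps children */
            ;
    }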

Nicholas Knight
A: 

The key to most illusions in life is speed; think about it. The illusion of multiprocessing has been around since before multi-core processors: the idea being that if the one processor switches between processes fast enough, you won't notice (until the physical hardware runs into problems). Starting from that, you'll see that by coupling it with a trick like asynchronous I/O you can simulate parallel/multi-processing.

Dark Star1
+2  A: 

You usually have a few choices, given how common operating systems, their APIs and the typical programming languages work:

  • 1 thread/process per client. The programming model is easy, but it doesn't scale; on most OSs, switching between thousands of threads is inefficient.

  • Use some multiplexing I/O facility - that's select/poll/epoll/etc. on unixes, some more efficient than others. The programming model is harder, in some cases very hard if you need to deal with blocking operations as part of the work you do (e.g. call a database, or even read a file from the filesystem), but it can scale much better than having 1 thread serve 1 client (see the epoll sketch after this list).

  • A hybrid approach: use multiplexed IO and have worker threads. A handful of threads dealing with I/O, a handful of threads doing the actual work, and you tune the number of threads in each based on what you're doing. This is the most scalable, and usually the hardest to program.
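To illustrate the multiplexing bullet above, here is a minimal single-threaded epoll loop (Linux-specific; error handling omitted, listening socket set up elsewhere):

    /* One thread services every ready descriptor in turn, sleeping in
     * epoll_wait() whenever nothing is ready. */
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    void event_loop(int listen_fd)  /* listen_fd assumed already listening */
    {
        int ep = epoll_create1(0);
        struct epoll_event ev, events[64];
        ev.events = EPOLLIN;
        ev.data.fd = listen_fd;
        epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

        for (;;) {
            int n = epoll_wait(ep, events, 64, -1);  /* block until something is ready */
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == listen_fd) {               /* new client connecting */
                    int client = accept(listen_fd, NULL, NULL);
                    ev.events = EPOLLIN;
                    ev.data.fd = client;
                    epoll_ctl(ep, EPOLL_CTL_ADD, client, &ev);
                } else {
                    /* read what's available, write what we can, move on;
                       a handler like try_send(fd) from above would live here */
                }
            }
        }
    }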

What you choose is basically a tradeoff. It doesn't matter if you're doing stuff in a serial fashion if it's done fast enough already. If you don't need to scale, and you'll only ever need to handle a few dozen or perhaps a few hundred non-busy clients, using the easiest approach makes sense. If your application can easily handle 10 times the current load in one thread with multiplexed IO, you don't need to go through the trouble of implementing worker threads, etc.

If your server really is busy, then yes - it will appear unresponsive. But CPUs are fast; you can literally do millions of things within a second. So if you're doing multiplexed IO, you don't spend time waiting for stuff, you spend all your time doing actual work, and if that work can be done in a few milliseconds, you can serve a lot of clients with a single thread. The OS services your app uses, e.g. those taking care of network IO, can freely take advantage of other cores.

nos