views:

59

answers:

2

I ask here since googling leads you on a merry trip around archives with no hint as to what the current state is. If you go by Google, it seems that async IO was all the rage in 2001 to 2003, and by 2006 some stuff like epoll and libaio was turning up; kevent appeared but seems to have disappeared, and as far as I can tell, there is still no good way to mix completion-based and ready-based signaling, async sendfile - is that even possible? - and everything else in a single-threaded event loop.

So please tell me I'm wrong and it's all rosy! - and, importantly, what APIs to use.

How does Linux compare to FreeBSD and other operating systems in this regard?

+1  A: 

Asynchronous disc IO is alive and kicking ... it is actually supported and works reasonably well now, but has significant limitations (but with enough functionality that some of the major users can usefully use it - for example MySQL's Innodb does in the latest version).

Asynchronous disc IO is the ability to invoke disc IO operations in a non-blocking manner (in a single thread) and wait for them to complete. This works fine, http://lse.sourceforge.net/io/aio.html has more info.

AIO does enough for a typical application (database server) to be able to use it. AIO is a good alternative to either creating lots of threads doing synchronous IO, or using scatter/gather in the preadv family of system calls which now exist.

It's possible to do a "shopping list" synchronous IO job using the newish preadv call where the kernel will go and get a bunch of pages from different offsets in a file. This is ok as long as you have only one file to read. (NB: Equivalent write function exists).

poll, epoll etc, are just fancy ways of doing select() that suffer from fewer limitations and scalability problems - they may not be able to be mixed with disc aio easily, but in a real-world application, you can probably get around this fairly trivially by using threads (some database servers tend to do these kinds of operations in separate threads anyway). Poll() is good, epoll is better, for large numbers of file descriptors. select() is ok too for small numbers of file descriptors (or specifically, low file descriptor numbers).

MarkR
so where does this stand? http://stackoverflow.com/questions/1825621/how-do-you-use-aio-and-epoll-together-in-a-single-event-loop
Will
(I thought poll and select were symmetric and O(n), whereas epoll is O(1)
Will
poll is better than select because it works a lot better with a sparse set of file descriptors - say you want to poll 10 FDs out of 10000 opened ones, you don't need an array of 10000 entries to be initialised with zeroes. epoll is better because you only need to register the new FDs you're interested in, not pass in all the ones you were already watching.
MarkR
+1  A: 

Most of what I've learned about asynchronous I/O in Linux was by working on the Lighttpd source. It is a single-threaded web server that handles many simultaneous connections, using the what it believes is the best of whatever asynchronous I/O mechanisms are available on the running system. Take a look at the source, it supports Linux, BSD, and (I think) a few other operating systems.

Jonathan