In Linux, when you make a blocking i/o call like read or accept, what actually happens?

My thoughts: the process gets taken out of the run queue and put into a waiting or blocked state on some wait queue. Then, when a TCP connection is made (for accept) or the hard drive is ready (for a file read), a hardware interrupt is raised, which lets the waiting processes wake up and run. (In the case of a file read, how does Linux know which processes to awaken, given that there could be lots of processes waiting on different files?) Or perhaps, instead of hardware interrupts, the individual process itself polls to check availability. Not sure; any help?

A: 

Read this: http://www.minix3.org/doc/

It's a very clear, very easy to understand explanation. It generally applies to Linux, too.

S.Lott
A: 

Effectively, the call only returns when the file is ready to read, when data is available on a socket, when a connection has arrived...

To make sure it can return immediately, you probably want to use the select system call first to find a ready file descriptor.
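
As a rough illustration of that idea, here is a minimal userspace sketch. The helper name `wait_readable` and the millisecond timeout parameter are made up for this example; only `select` and the `FD_*` macros are the real API. It assumes `fd` is below `FD_SETSIZE`.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel whether fd has data to read.
 * Returns 1 if fd became readable within timeout_ms, 0 on timeout,
 * and -1 if select() itself failed. Assumes fd < FD_SETSIZE.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* A zero timeout makes select() a pure poll: it never sleeps. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;

    return (ready > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

With a pipe whose read end is `p[0]`, `wait_readable(p[0], 0)` returns 0 while the pipe is empty and 1 once something has been written to it, at which point a `read` on `p[0]` will not block.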

mikek3332002
+2  A: 

Each Linux device seems to be implemented slightly differently, and the preferred way seems to vary every few Linux releases as safer/faster kernel features are added, but generally:

  1. The device driver creates read and write wait queues for a device.
  2. Any process thread wanting to wait for i/o is put on the appropriate wait queue. When an interrupt occurs the handler wakes up one or more waiting threads. (Obviously the threads don't run immediately as we are in interrupt context, but are added to the kernel's scheduling queue).
  3. When scheduled by the kernel the thread checks to see if conditions are right for it to proceed - if not it goes back on the wait queue.

A typical example (slightly simplified):

In the driver at initialisation:

    init_waitqueue_head(&readers_wait_q);

In the read function of a driver:

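    if (read_avail == 0 && (filp->f_flags & O_NONBLOCK))
    {
        /* non-blocking read with nothing available: don't sleep */
        return -EAGAIN;
    }
    /* the macro takes the wait queue head itself, not a pointer to it */
    if (wait_event_interruptible(readers_wait_q, read_avail != 0))
    {
        /* signal interrupted the wait, return */
        return -ERESTARTSYS;
    }
    to_copy = min(user_max_read, read_avail);
    if (copy_to_user(user_buf, read_ptr, to_copy))
    {
        /* some bytes could not be copied to user space */
        return -EFAULT;
    }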

Then the interrupt handler just issues:

    wake_up_interruptible(&readers_wait_q);

Note that wait_event_interruptible() is a macro that hides a loop: it checks the condition (read_avail != 0 in this case) and, if the thread is woken while the condition is still false, puts it back to sleep on the wait queue.
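
The same "sleep, get woken, re-check the condition" loop can be sketched in userspace with a pthread condition variable. This is only an analogy, not kernel code: `struct reader_queue`, `reader_wait` and `reader_post` are names invented for this example, with `reader_post` standing in for the interrupt handler.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a driver's wait queue and data count. */
struct reader_queue {
    pthread_mutex_t lock;
    pthread_cond_t  wait_q;      /* plays the role of readers_wait_q */
    size_t          read_avail;  /* units of data ready to be read */
};

void reader_queue_init(struct reader_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->wait_q, NULL);
    q->read_avail = 0;
}

/* Sleeps until data is available, then consumes at most max units. */
size_t reader_wait(struct reader_queue *q, size_t max)
{
    pthread_mutex_lock(&q->lock);
    /* Being woken is only a hint: re-check the condition every time,
       and go back to sleep if it is still false. */
    while (q->read_avail == 0)
        pthread_cond_wait(&q->wait_q, &q->lock);

    size_t n = q->read_avail < max ? q->read_avail : max;
    q->read_avail -= n;
    pthread_mutex_unlock(&q->lock);
    return n;
}

/* Stand-in for the interrupt handler: publish data, wake all sleepers. */
void reader_post(struct reader_queue *q, size_t n)
{
    pthread_mutex_lock(&q->lock);
    q->read_avail += n;
    pthread_cond_broadcast(&q->wait_q);
    pthread_mutex_unlock(&q->lock);
}
```

If several threads are blocked in `reader_wait` and one `reader_post` wakes them all, only those that still find `read_avail != 0` proceed; the rest loop round and sleep again, which is the behaviour the macro hides.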

As mentioned, there are a number of variations. The main one is that if there is potentially a lot of work for the interrupt handler to do, it does the bare minimum itself and defers the rest to a work queue or tasklet (generally known as the "bottom half"), and it is this deferred code that wakes the waiting threads.

See the Linux Device Drivers book (LDD3) for more details. A PDF is available here: http://lwn.net/Kernel/LDD3

Dipstick