I read that it's supposedly possible to change the I/O scheduler for a particular device on a running kernel by writing to /sys/block/[disk]/queue/scheduler. For example I can see on my system:

anon@anon:~$ cat /sys/block/sda/queue/scheduler 
noop anticipatory deadline [cfq]

that the default is the completely fair queuing scheduler. What I'm wondering is whether there is any use in including all four schedulers in my custom kernel. There doesn't seem to be much point in compiling in more than one scheduler unless the kernel is smart enough to select the right scheduler for the hardware, specifically the 'noop' scheduler for flash-based drives and one of the others for a traditional hard drive.

Is this the case?

+3  A: 

The aim of having the kernel support different ones is that you can try them out without a reboot; you can then run test workloads through the system, measure performance, and make the best-performing one the standard for your app.

On modern server-grade hardware, only the noop one appears to be at all useful. The others seem slower in my tests.

MarkR
How do you actually change it at runtime?
Robert S. Barnes
noop's performance relative to the other schedulers very much depends on the hardware and the particular load. Out of curiosity, what disks, controllers, and tests were you running?
ephemient
Yeah, noop is good when you have smart RAID controllers and other stuff where it knows more than the kernel about the best access patterns. Deadline isn't bad either.
Zan Lynx
This is purely a learning exercise for me in which I'm trying to configure the smallest and fastest-booting kernel possible that provides all the functionality I need on my laptop. I've looked in both "Linux Kernel Development" and "Essential Linux Device Drivers" and haven't found a satisfactory answer to this question: how smart is the kernel at picking a scheduler at runtime? Or does it just always use the default unless you manually set it to something else?
Robert S. Barnes
ephemient > that was on DELL PERC controllers, also on DELL Powervault MD3000. It seemed better than the default (CFQ) on both.
MarkR
Ah, so *real* server-class hardware. Yeah, I can imagine that `noop` can perform better than `cfq`, but `deadline` ought to be pretty good as well...
ephemient
+2  A: 

As documented in /usr/src/linux/Documentation/block/switching-sched.txt, the I/O scheduler on any particular block device can be changed at runtime (there may be some latency as the previous scheduler's requests are all flushed before bringing the new scheduler into use, but it can be changed without problems even while the device is under heavy use).

# cat /sys/block/hda/queue/scheduler
noop anticipatory deadline [cfq]
# echo anticipatory > /sys/block/hda/queue/scheduler
# cat /sys/block/hda/queue/scheduler
noop [anticipatory] deadline cfq
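
The same steps can be wrapped in a small script. This is only a sketch under a few assumptions: the function names are made up for illustration, `SYSFS_ROOT` is a hypothetical override that exists purely so the script can be tried against a scratch directory, and writing to the real file under /sys requires root.

```shell
#!/bin/sh
# Print the active scheduler from a line such as
# "noop anticipatory deadline [cfq]" (the active one is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeed if scheduler $2 appears in the scheduler line $1.
has_scheduler() {
    for name in $(printf '%s\n' "$1" | tr -d '[]'); do
        [ "$name" = "$2" ] && return 0
    done
    return 1
}

# Switch device $1 (e.g. sda) to scheduler $2.
# SYSFS_ROOT is a hypothetical override, not a kernel or distro convention.
set_scheduler() {
    file="${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }
    line=$(cat "$file")
    has_scheduler "$line" "$2" || { echo "$2 not available: $line" >&2; return 1; }
    echo "$2" > "$file"
}
```

For example, `set_scheduler sda deadline` run as root would refuse to write anything if deadline was not compiled into the kernel, instead of relying on the kernel to reject the write.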

Ideally, there would be a single scheduler to satisfy all needs. It doesn't seem to exist yet, so instead, Linux has several I/O schedulers. The kernel often doesn't have enough knowledge to choose the best scheduler for your workload:

  • noop is often the best choice for memory-backed block devices (e.g. ramdisks) and other non-rotational media (flash) where trying to reschedule I/O is a waste of resources
  • as (anticipatory) is conceptually similar to deadline, but with more heuristics that often improve performance (but sometimes can decrease it)
  • deadline is a lightweight scheduler which tries to put a hard limit on latency
  • cfq tries to maintain system-wide fairness of I/O bandwidth
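
Since the kernel does not make this choice for you, a user-space script can approximate it. The sketch below assumes a kernel new enough to expose /sys/block/[disk]/queue/rotational (0 for non-rotational media, 1 otherwise). The mapping of 0 to noop and 1 to cfq is an illustrative assumption rather than kernel policy, the function names are invented, and `SYSFS_ROOT` is a hypothetical override for trying it against a scratch directory. Some devices misreport this flag, so treat the output as a suggestion only.

```shell
#!/bin/sh
# Suggest a scheduler from the contents of queue/rotational ("0" or "1").
# The mapping is an assumption for illustration, not something the kernel does.
suggest_scheduler() {
    case "$1" in
        0) echo noop ;;
        1) echo cfq ;;
        *) echo "unknown rotational value: $1" >&2; return 1 ;;
    esac
}

# Print "<device> <suggestion>" for every block device that exposes the flag.
# SYSFS_ROOT is a hypothetical override, not a kernel or distro convention.
suggest_all() {
    for flag in "${SYSFS_ROOT:-/sys}"/block/*/queue/rotational; do
        [ -r "$flag" ] || continue
        dev=${flag%/queue/rotational}
        dev=${dev##*/}
        echo "$dev $(suggest_scheduler "$(cat "$flag")")"
    done
}
```

Running `suggest_all` prints one line per device; the suggestion still has to be written to the device's queue/scheduler file by hand, and only after checking it against a real workload.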

The default was anticipatory for a long time, and it received a lot of tuning. cfq became the default some while ago, as its performance is reasonable and fairness is a good goal for multi-user systems (and even single-user desktops). For some scenarios -- databases are often used as examples, as they tend to already have their own peculiar scheduling and access patterns, and are often the most important service (so who cares about fairness?) -- anticipatory has a long history of being tunable for best performance on these workloads, and deadline very quickly passes all requests through to the underlying device.

ephemient
Great info, thanks! But my basic question is still unanswered: if I plug in a flash drive, or my netbook runs off a flash disk as its main drive, is the kernel smart enough to pick noop instead of the default cfq? Or is it completely up to me to do it manually?
Robert S. Barnes
You can configure the kernel to use a different scheduler by default. It would be clever to automatically use `noop` on non-rotational media, but the kernel doesn't have that functionality. It kind of does have detection of non-rotational media, but it's not reliable as some disks misreport themselves, and it's not yet wired up to the I/O scheduler code anyhow.
ephemient