Friday, January 2, 2009

Linux I/O Schedulers

The 2.6 Linux kernel includes selectable I/O schedulers. The kernel, the core of the operating system, controls disk access through its I/O scheduling layer. Different I/O schedulers exist to accommodate different I/O usage patterns.

In a naive system, without an I/O scheduler, the kernel would issue each request in the order in which it receives it. That would be a performance nightmare: the heads would have to seek back and forth across the disk for every operation.

An I/O scheduler can use (at least) the following techniques to improve performance:

  • Prioritization - The scheduler serves some requests ahead of others instead of strictly in arrival order.
  • Request merging - The scheduler merges adjacent requests together to reduce disk seeking.
  • Elevator - The scheduler orders requests by their physical location on the block device and tries to keep seeking in one direction as much as possible.
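
To make request merging and the elevator concrete, here is a minimal Python sketch. It is not kernel code: a request is modelled as a hypothetical (start_sector, sector_count) tuple, and the function names are made up for illustration. The head_travel helper only counts how far the head moves, which is a crude stand-in for seek cost.

```python
from typing import List, Tuple

# Hypothetical request model: (start_sector, sector_count).
Request = Tuple[int, int]


def merge_adjacent(requests: List[Request]) -> List[Request]:
    """Sort by start sector and merge requests whose ranges touch."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)
        else:
            merged.append((start, count))
    return merged


def one_way_elevator(requests: List[Request], head: int) -> List[Request]:
    """Serve requests at or beyond the head in ascending order, then wrap."""
    ordered = sorted(requests)
    ahead = [r for r in ordered if r[0] >= head]
    behind = [r for r in ordered if r[0] < head]
    return ahead + behind


def head_travel(requests: List[Request], head: int) -> int:
    """Total distance the head moves when serving requests in the given order."""
    total = 0
    for start, count in requests:
        total += abs(start - head)
        head = start + count  # the head ends up just past the data it read
    return total


arrival_order = [(100, 8), (900, 8), (108, 8), (910, 8), (120, 8), (890, 8)]
scheduled = one_way_elevator(merge_adjacent(arrival_order), head=0)
print(head_travel(arrival_order, 0), head_travel(scheduled, 0))
```

With this particular arrival order, serving the requests as they arrive makes the head jump between the low and the high sectors on every request, while the merged and sorted order sweeps across the disk once.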

There are currently four schedulers available:
  • Completely Fair Queuing
  • Deadline
  • NOOP
  • Anticipatory
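
On a 2.6 kernel with sysfs mounted, the schedulers available for a disk are listed in /sys/block/<device>/queue/scheduler, with the active one in square brackets. The sketch below parses that line. The device name "sda" is only an assumption, and the helper names are made up for this post.

```python
from typing import List, Tuple


def parse_scheduler_line(line: str) -> Tuple[List[str], str]:
    """Split a line such as 'noop anticipatory deadline [cfq]' into
    (all scheduler names, active scheduler name)."""
    names: List[str] = []
    active = ""
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets that mark the active one
            active = token
        names.append(token)
    return names, active


def read_schedulers(device: str = "sda") -> Tuple[List[str], str]:
    """Read the scheduler list for one block device (device name is an assumption)."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler_line(f.read())


print(parse_scheduler_line("noop anticipatory deadline [cfq]"))
```

Writing one of the listed names to the same file as root switches the scheduler for that device at runtime.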

Completely Fair Queuing

Implements request merging and the elevator. It maintains a scalable per-process I/O queue and attempts to distribute the available I/O bandwidth equally among all I/O requests. As of the 2.6.18 kernel, this is the default scheduler in kernel.org releases.
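
The per-process queue idea can be sketched as a round robin over one queue per process. This is a deliberate simplification and not how the kernel implements it: the class name, the quantum parameter, and the use of plain strings as requests are all assumptions made for illustration.

```python
from collections import OrderedDict, deque
from typing import Deque, List, Tuple


class FairQueues:
    """Toy model: one FIFO per process, served round robin."""

    def __init__(self, quantum: int = 2) -> None:
        # pid -> pending requests; quantum = requests served per turn (assumed)
        self.queues: "OrderedDict[int, Deque[str]]" = OrderedDict()
        self.quantum = quantum

    def submit(self, pid: int, request: str) -> None:
        self.queues.setdefault(pid, deque()).append(request)

    def dispatch_order(self) -> List[Tuple[int, str]]:
        """Drain all queues, giving each process at most `quantum` requests per turn."""
        order: List[Tuple[int, str]] = []
        while self.queues:
            for pid in list(self.queues):
                queue = self.queues[pid]
                for _ in range(min(self.quantum, len(queue))):
                    order.append((pid, queue.popleft()))
                if not queue:
                    del self.queues[pid]
        return order


fq = FairQueues(quantum=2)
for name in ["a", "b", "c", "d", "e"]:
    fq.submit(1, name)   # process 1 floods the disk
fq.submit(2, "x")        # process 2 has a single request
order = fq.dispatch_order()
print(order)
```

The point of the sketch is that the single request from process 2 does not have to wait behind everything process 1 queued before it.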


Deadline

Implements request merging and a one-way elevator. It uses a deadline algorithm to minimize I/O latency for a given I/O request. The scheduler provides near real-time behavior and uses a round robin policy to attempt to be fair among multiple I/O requests and to avoid process starvation. It will aggressively re-order requests to improve I/O performance.
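
A rough way to picture the deadline idea: requests are normally served in sector order, but a request whose deadline has passed jumps the queue. The sketch below assumes a hypothetical Request type with a sector and a deadline, and a made-up pick_next function. It leaves out merging and everything else the real scheduler does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    sector: int
    deadline: float  # time by which the request should be dispatched (assumed unit)


def pick_next(pending: List[Request], head: int, now: float) -> Request:
    """Remove and return the next request to dispatch."""
    if not pending:
        raise ValueError("no pending requests")
    oldest = min(pending, key=lambda r: r.deadline)
    if oldest.deadline <= now:
        # An expired request is served first, wherever it is on the disk.
        choice = oldest
    else:
        # Otherwise continue the one-way sweep, wrapping to the lowest sector.
        ahead = [r for r in pending if r.sector >= head]
        choice = min(ahead or pending, key=lambda r: r.sector)
    pending.remove(choice)
    return choice


pending = [Request(500, 10.0), Request(100, 50.0), Request(120, 60.0)]
first = pick_next(pending, head=90, now=0.0)     # nothing expired: sector order
second = pick_next(pending, head=100, now=20.0)  # sector 500 is past its deadline
third = pick_next(pending, head=500, now=20.0)   # nothing ahead: wrap around
print(first.sector, second.sector, third.sector)
```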


NOOP

Implements (only) request merging. It is a simple FIFO queue. It assumes that I/O performance has been or will be optimized at the block device (for example a memory disk), by an intelligent Host Bus Adapter, or by an externally attached controller.
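
A FIFO queue with merging can be sketched in a few lines. As a simplification, this toy version only merges a new request into the most recently queued one. The class name and the (start_sector, sector_count) request model are assumptions for illustration.

```python
from collections import deque
from typing import Deque, Tuple


class NoopQueue:
    """Toy FIFO that merges a new request into the last queued one when they touch."""

    def __init__(self) -> None:
        self.fifo: Deque[Tuple[int, int]] = deque()  # (start_sector, sector_count)

    def submit(self, start: int, count: int) -> None:
        if self.fifo:
            last_start, last_count = self.fifo[-1]
            if last_start + last_count == start:
                # The new request begins where the last one ends: grow it instead.
                self.fifo[-1] = (last_start, last_count + count)
                return
        self.fifo.append((start, count))

    def dispatch(self) -> Tuple[int, int]:
        """Return the oldest request; no reordering is ever done."""
        return self.fifo.popleft()


q = NoopQueue()
q.submit(100, 8)
q.submit(108, 8)  # merged into the previous request
q.submit(10, 8)   # stays behind it, even though its sector is lower
served = [q.dispatch(), q.dispatch()]
print(served)
```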


Anticipatory

Implements request merging and a one-way elevator, and attempts some anticipatory reads by holding off briefly after a read batch if it thinks a user is going to ask for more data. It is intended to optimize systems with small or slow disk subsystems. One consequence of using this scheduler can be higher I/O latency.
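
The "holding off" decision can be pictured like this. It is only a sketch under stated assumptions: the function name, the 5 ms wait, the "near" distance of 64 sectors, and the fallback of serving the closest request are all invented for illustration and are not taken from the kernel.

```python
from typing import List, Optional, Tuple

# Hypothetical request model: (pid, sector).
Request = Tuple[int, int]


def next_action(last_pid: int, head: int, pending: List[Request],
                waited_ms: float, max_wait_ms: float = 5.0,
                near: int = 64) -> Optional[Request]:
    """Return the request to dispatch, or None to keep the disk idle for now."""
    for request in pending:
        pid, sector = request
        if pid == last_pid and abs(sector - head) <= near:
            return request  # the anticipated nearby read has arrived
    if waited_ms < max_wait_ms:
        return None  # keep waiting, even though other requests may be queued
    if not pending:
        return None  # nothing to dispatch at all
    # Gave up waiting: serve whatever is closest to the head.
    return min(pending, key=lambda r: abs(r[1] - head))


print(next_action(1, 100, [(2, 9000)], waited_ms=1.0))
print(next_action(1, 100, [(2, 9000), (1, 108)], waited_ms=1.0))
print(next_action(1, 100, [(2, 9000)], waited_ms=5.0))
```

The first call shows where the extra latency comes from: the request from process 2 is ready, but it sits in the queue while the scheduler waits to see whether process 1 asks for more.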



