An Introduction to Block Driver
When Unix was written 25 years ago, its design was eclectic. One unusual design feature was that every physical device connected to the computer was represented as a file. This was a bold decision, because many devices are very different from one another, especially at first glance. Why use the same interface to talk to a printer as to talk to a disk drive?
The short answer is that while the devices are very much different, they can be thought of as having most of the same characteristics as files. The entire system is then kept smaller and simpler by only using one interface with a few extensions.
This is fine, except that it hides important differences between devices. For example, it is possible to read any byte on a disk at any time, but it is only possible to read the next byte from a terminal.
There are other differences, but this is the most fundamental one: Some devices (like disks) are random-access, and others (like terminals) are sequential-access. Of course, it is possible to pretend that a random-access device is a sequential-access device, but it doesn’t work the other way around.
A practical effect of the difference is that file systems can only be mounted on block devices, not on character ones. For example, most tapes are character devices. It is possible to copy the contents of a raw, quiescent (unmounted and not being modified) file system to a tape, but you will not be able to mount the tape, even though it contains the same information as the disk.
Most textbooks and tutorials start by explaining character devices, the sequential-access ones, because a minimal character device driver is easier to write than a minimal block device driver. My own Linux Kernel Hackers’ Guide (the KHG) is written the same way.
My reason for starting this column with block devices, the random-access devices, is that the KHG explains simple character devices better than it does block devices, and I think that there is a greater need for information on block devices right now. Furthermore, real character device drivers can be quite complex, just as complex as block device drivers, and fewer people know how to write block device drivers.
I am not going to give a complete example of a device driver here. I am going to explain the important parts, and let you discover the rest by examining the Linux source code. Reading this article and the ramdisk driver (drivers/block/ramdisk.c), and possibly some parts of the KHG, should make it possible for you to write a simple, non-interrupt-driven block device driver, good enough to mount a filesystem on. To write an interrupt-driven driver, read drivers/block/hd.c, the AT hard disk driver, and follow along. I’ve included a few hints in this article, as well.
Whereas character device drivers provide procedures for directly reading and writing data from and to the device they drive, block devices do not. Instead, they provide a single request() procedure which is used for both reading and writing. There are generic block_read() and block_write() procedures which know how to call the request() procedure, but all you need to know about those functions is to place a reference to them in the right place, and that will be covered later.
The request() procedure (perhaps surprisingly for a function designed to do I/O) takes no arguments and returns void. Instead of explicit input and return values, it looks at a queue of requests for I/O, and processes the requests one at a time, in order. (The requests have already been sorted by the time the request() function reads the queue.) When it is called, if it is not interrupt-driven, it processes requests for blocks to be read from the device, until it has exhausted all pending requests. (Normally, there will be only one request in the queue, but the request() procedure should check until it is empty. Note that other requests may be added to the queue by other processes while the current request is being processed.)
On the other hand, if the device is interrupt-driven, the request() procedure will usually schedule an interrupt to take place, and then let the interrupt handling procedure call end_request() (more on end_request() later) and then call the request() procedure again to schedule the next request (if any) to be processed.