Reprinted with permission of Linux Magazine
by Alessandro Rubini
Even though Unix traditionally considers a device as either a "Char Device" or a "Block Device" (as indicated by the ``c'' or ``b'' in their /dev entry points), new classes of devices are being introduced as technology advances. One such class is that of ``USB devices''.
A USB device is still, at its lowest levels, a ``Char Device'' or a ``Block Device'' (the latter only if it is a mass-storage device), but the programming interface offered to device driver writers has been simplified and ``factorized'' in order to take advantage of the common hardware features offered by those devices and to offer arbitration of the bus and coordination in its use by the various drivers.
This discussion is based on version 2.3.99-pre3 of the Linux kernel, the one current at the time of writing. I expect little or no difference between this version and the 2.4 kernel that sits in front of you as you read. I chose to compile the USB subsystem as modules and to discuss the role of the individual modules, because the modularized form makes the overall design easier to understand than a kernel with the whole USB subsystem linked in.
While the ``USB'' acronym means ``Universal Serial Bus'', its physical structure is not a bus. The physical layout is rather a tree, where each transmission link can only connect two nodes: one of them is called the "upstream" node and the other is called the "downstream" node. To make the distinction clearer and to avoid endless cabling problems like those we experience daily with serial interfaces, the USB specification requires different connectors at the upstream and the downstream ends of the cables. What you see lurking at the back of your computer is not ``the USB connector'' but rather ``the upstream USB connector'', also called type ``A''.
Each non-leaf node in the tree is a USB hub. The host controller (the hardware installed in the computer) implements a USB hub, as this is the only way to provide several communication channels towards USB devices. To preserve the tree structure, it is not possible for a device to have two upstream connections -- while there exist USB devices that can interconnect two computers by connecting to two different USB busses, those devices are conceptually two separate USB peripherals, each connected to its own bus. Figure 1 and Figure 2 depict two typical USB environments and their physical structure represented as a tree.
Figure 1: a PC with USB kbd and mouse, and associated USB tree.
Figure 2: overly-complex USB tree with hubs etc
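The tree layout described above can be modelled in a few lines of user-space C. This is only a sketch to make the topology concrete: the toy_node structure, the toy_attach() and toy_depth() helpers and the limit of four ports are all hypothetical names and values chosen for illustration, and none of them belongs to the kernel.

```c
#include <stddef.h>

#define TOY_MAX_PORTS 4   /* arbitrary limit chosen for this sketch */

/* Hypothetical model of one node in the USB tree (not a kernel structure). */
struct toy_node {
    const char *name;
    struct toy_node *upstream;                   /* exactly one; NULL for the root hub */
    struct toy_node *downstream[TOY_MAX_PORTS];  /* only hubs use these slots */
    int nports_used;
};

/*
 * Attach a child to one of the hub's downstream ports.
 * Returns 0 on success, -1 if the child already has an upstream
 * link (which would break the tree) or the hub has no free port.
 */
int toy_attach(struct toy_node *hub, struct toy_node *child)
{
    if (child->upstream != NULL || hub->nports_used >= TOY_MAX_PORTS)
        return -1;
    hub->downstream[hub->nports_used++] = child;
    child->upstream = hub;
    return 0;
}

/* Count the links between a node and the root hub. */
int toy_depth(const struct toy_node *node)
{
    int depth = 0;

    while (node->upstream != NULL) {
        node = node->upstream;
        depth++;
    }
    return depth;
}
```

Refusing a second upstream link in toy_attach() is what keeps the structure a tree: a peripheral that bridges two computers would have to appear as two separate nodes, one in each tree.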
As far as the host controller is concerned, all the implementations currently fall under one of two classes: ``Open Host Controller Interface'' (OHCI) and ``Universal Host Controller Interface'' (UHCI). In Linux you can thus choose between two device drivers for your USB subsystem: usb-ohci.o and usb-uhci.o; whatever hardware you have, the module usbcore.o is required in order to load the hardware driver.
When a device is plugged into the bus, it identifies itself as one of several classes of peripherals; in order to use the device you'll need to load a driver that claims ownership of that device and handles the associated communication protocol.
The USB protocol allows for a variety of device types and bandwidth usage. To keep the discussion simple, I'll stick to simple input devices, like mice and keyboards. Those devices don't need a sustained data rate or strict timing, so they are the easiest to discuss as well as the best known to the general public.
When you compile the USB Linux subsystem as modules, you'll end up with several kernel modules. While the internal communication among them is somewhat complex, the dependencies of the modules are quite straightforward, as depicted in table 1 and figure 3.
|module||symbols exported by the module||symbols used at load and run time|
As shown, the usbcore module offers all the software infrastructure needed for hardware handling while input offers a generic input management framework; all other modules stack on those two, registering as either producers or consumers of information.
As far as hardware is concerned, the host controller device driver (ohci or uhci) registers its functionality within usbcore's data structures, and so do the device-specific drivers (for example, usbmouse and usbkbd); their roles differ, however, as the host controller produces information (collected from the wire) while the other drivers consume information (by decoding the data and pushing it to its final destination).
Input management operates in a similar way: hardware drivers (usbmouse and usbkbd) produce data and the generic input handlers (mousedev and keybdev) consume data. Both kinds of modules stack on input.o, registering their own entry points.
The question still unanswered is how a module can push information to another module by only registering a callback. If you are familiar with generic kernel drivers, you'll remember that things are usually set up the other way round: the consumer module registers its callback to be invoked whenever there is new data to consume. The USB driver is laid out differently because it is designed as a message-passing environment, instead of being a data-flow system like most of the Linux kernel. The key data structure, the ``message'', is called ``URB'', short for ``USB Request Block''.
When a USB hardware driver is loaded, it calls
usb_alloc_bus() to register its own
usb_operations data structure with usbcore.o (the
data structure is defined in
every other header material relevant to this article). The operations
thus registered include
unlink_urb, so the software layer can commit data
transfer to the hardware driver.
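The registration of a table of operations can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel's code: toy_hc_operations, toy_bus, toy_register_bus() and the field names submit and unlink are hypothetical stand-ins, and the ``fake'' host controller merely counts pending requests instead of touching hardware.

```c
#include <stddef.h>

/* Hypothetical request object; the real URB carries much more. */
struct toy_urb {
    int id;
};

/* Hypothetical table of entry points offered by a host controller driver. */
struct toy_hc_operations {
    int (*submit)(struct toy_urb *urb);
    int (*unlink)(struct toy_urb *urb);
};

/* The core keeps one pointer to the operations of each bus. */
struct toy_bus {
    const struct toy_hc_operations *op;
};

/* Called by the hardware driver at load time. */
void toy_register_bus(struct toy_bus *bus, const struct toy_hc_operations *op)
{
    bus->op = op;
}

/* Core-side helpers: forward the request to whatever driver registered. */
int toy_core_submit(struct toy_bus *bus, struct toy_urb *urb)
{
    if (bus->op == NULL || bus->op->submit == NULL)
        return -1;   /* no hardware driver loaded */
    return bus->op->submit(urb);
}

int toy_core_unlink(struct toy_bus *bus, struct toy_urb *urb)
{
    if (bus->op == NULL || bus->op->unlink == NULL)
        return -1;
    return bus->op->unlink(urb);
}

/* A fake host controller that only counts pending requests. */
int fake_hc_pending;

static int fake_hc_submit(struct toy_urb *urb)
{
    (void)urb;
    fake_hc_pending++;
    return 0;
}

static int fake_hc_unlink(struct toy_urb *urb)
{
    (void)urb;
    if (fake_hc_pending == 0)
        return -1;   /* nothing to cancel */
    fake_hc_pending--;
    return 0;
}

const struct toy_hc_operations fake_hc_ops = { fake_hc_submit, fake_hc_unlink };
```

The point of the indirection is that the software layer never needs to know which kind of host controller is installed: it only follows the pointers that the hardware driver handed over when it registered.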
With the software layer and the hardware drivers in place, a
specific peripheral driver (like usbmouse.o) can register its
usb_driver data structure by calling
usb_register(). The data structure includes pointers to a
probe and a
disconnect function. The former
is called by the USB framework whenever a new device is detected and
the latter whenever the device is unplugged from the bus.
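The probe/disconnect mechanism can be sketched in user space as follows. Everything here is a simplified model: toy_driver, toy_device, toy_register() and the TOY_CLASS_MOUSE value are hypothetical names invented for the example, and the real usb_driver structure and usb_register() call differ in their details.

```c
#include <stddef.h>

#define TOY_MAX_DRIVERS 8
#define TOY_CLASS_MOUSE 1   /* hypothetical class code, not a real USB value */

struct toy_driver;

struct toy_device {
    int device_class;           /* what the device says it is */
    struct toy_driver *owner;   /* driver that claimed it, or NULL */
};

struct toy_driver {
    const char *name;
    int  (*probe)(struct toy_device *dev);       /* nonzero: "I handle it" */
    void (*disconnect)(struct toy_device *dev);
};

static struct toy_driver *toy_drivers[TOY_MAX_DRIVERS];
static int toy_ndrivers;

/* A peripheral driver calls this once, at load time. */
int toy_register(struct toy_driver *drv)
{
    if (toy_ndrivers >= TOY_MAX_DRIVERS)
        return -1;
    toy_drivers[toy_ndrivers++] = drv;
    return 0;
}

/* Framework side: offer a new device to each driver until one accepts. */
int toy_device_plugged(struct toy_device *dev)
{
    int i;

    for (i = 0; i < toy_ndrivers; i++) {
        if (toy_drivers[i]->probe(dev)) {
            dev->owner = toy_drivers[i];
            return 1;
        }
    }
    return 0;   /* no driver wants this device */
}

/* Framework side: tell the owner that the device is gone. */
void toy_device_unplugged(struct toy_device *dev)
{
    if (dev->owner != NULL) {
        dev->owner->disconnect(dev);
        dev->owner = NULL;
    }
}

/* A toy mouse driver that only counts the mice it currently owns. */
int toy_mice_connected;

static int toy_mouse_probe(struct toy_device *dev)
{
    if (dev->device_class != TOY_CLASS_MOUSE)
        return 0;   /* not ours, let other drivers try */
    toy_mice_connected++;
    return 1;
}

static void toy_mouse_disconnect(struct toy_device *dev)
{
    (void)dev;
    toy_mice_connected--;
}

struct toy_driver toy_mouse_driver = {
    "toy mouse", toy_mouse_probe, toy_mouse_disconnect
};
```

A device that no registered driver accepts simply stays unowned, which mirrors what happens when you plug in a peripheral without having loaded its driver.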
When the probe succeeds (i.e., the new USB device can be handled by
this device driver), the function submits its URB structure to the USB
engine (by calling
usb_submit_urb()), including in the
URB a pointer to its ``completion handler'', a callback that will be
invoked at the end of each hardware transaction.
After usb_submit_urb() is called, the URB is passed back to the
host controller interface, so that the hardware driver can fill the
URB buffers with relevant information and call the proper completion
handler when a data transfer happens.
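The following user-space sketch shows the idea of a request block that carries its own completion handler. It is a model only: toy_urb, toy_submit_urb() and toy_hc_data_arrived() are hypothetical, the single ``pending'' slot is a simplification, and the three-byte report layout (buttons, X displacement, Y displacement) is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much reduced request block. */
struct toy_urb {
    unsigned char buffer[8];
    size_t actual_length;
    void (*complete)(struct toy_urb *urb);   /* completion handler */
    void *context;                           /* private data of the driver */
};

static struct toy_urb *toy_pending;   /* this sketch tracks a single URB */

/* Driver side: hand the URB over to the USB engine. */
int toy_submit_urb(struct toy_urb *urb)
{
    if (urb == NULL || urb->complete == NULL || toy_pending != NULL)
        return -1;
    toy_pending = urb;
    return 0;
}

/*
 * Host controller side: data arrived from the wire. Fill the buffer
 * and invoke the completion handler. The URB stays pending, so the
 * handler runs again at the next transaction.
 */
void toy_hc_data_arrived(const unsigned char *data, size_t len)
{
    struct toy_urb *urb = toy_pending;

    if (urb == NULL)
        return;
    if (len > sizeof(urb->buffer))
        len = sizeof(urb->buffer);
    memcpy(urb->buffer, data, len);
    urb->actual_length = len;
    urb->complete(urb);
}

/* Example consumer: a mouse driver accumulating pointer motion. */
struct toy_mouse_state {
    int buttons;
    int dx;
    int dy;
};

void toy_mouse_complete(struct toy_urb *urb)
{
    struct toy_mouse_state *m = urb->context;

    if (urb->actual_length < 3)
        return;   /* short report, ignore it */
    m->buttons = urb->buffer[0];
    m->dx += (signed char)urb->buffer[1];
    m->dy += (signed char)urb->buffer[2];
}
```

Note how the driver never polls for data: it submits the request once and is called back every time the host controller has something to deliver.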
As of 2.3.99-pre3, the module usage count behaves in a peculiar way. Even though the USB device driver (such as usbmouse.o) is notified when the peripheral device is connected to the bus, its usage count is never incremented, so you can remove the device driver module whenever you want.
While this description referred to mice and keyboards, the same design rules apply to other USB peripheral devices. The difference is mainly in how hardware handles the individual transactions. Not every device works like keyboards and mice: a digital camera, for example, needs to push a sustained data rate onto the bus. The USB specification describes a few different transaction types, and all of them are handled by the USB Linux subsystem using URBs.
Although the issue is not directly related to USB hardware, I think
it's interesting to see how USB input devices, like keyboards and
mice, are designed as part of a more generic input mechanism,
implemented in the source file
As already outlined (and shown in table 1), the USB modules for keyboards and mice stack on the input module as ``producers'' of data, while more generic modules (mousedev and keybdev) stack on input as consumers of data.
Figure 3: Dependencies among USB modules
Whenever a USB input driver's completion handler is invoked, it
pushes the data just received to the input management engine, by
A ``consumer'' module registers its own callbacks within
input.o, by calling input_register_handler() at load
time. The input_handler data structure includes pointers
to three callbacks:
event, all that's needed for proper device management.
What actually consumes input data is the
The role of
input_event(), as called by the completion
function of the peripheral USB driver, is that of distributing the
input data to all the registered handlers. Each handler will just
ignore data it is not interested in. Thus, if you didn't load
keybdev.o, your USB keyboard will be ignored by the system,
even though you loaded the usbkbd module and key-presses are
correctly decoded and passed to the input engine.
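The distribution of events to registered handlers can be sketched like this. It is a user-space model and not the code of the input module: toy_handler, toy_register_handler(), toy_input_event() and the two event types are hypothetical names, and the example handler cares only about relative motion.

```c
#include <stddef.h>

#define TOY_MAX_HANDLERS 4

/* Hypothetical event types for this sketch. */
enum toy_event_type { TOY_EV_KEY, TOY_EV_REL };

struct toy_handler {
    const char *name;
    void (*event)(struct toy_handler *h, enum toy_event_type type,
                  int code, int value);
    int events_seen;   /* bookkeeping for the example */
};

static struct toy_handler *toy_handlers[TOY_MAX_HANDLERS];
static int toy_nhandlers;

/* A consumer module calls this at load time. */
int toy_register_handler(struct toy_handler *h)
{
    if (toy_nhandlers >= TOY_MAX_HANDLERS)
        return -1;
    toy_handlers[toy_nhandlers++] = h;
    return 0;
}

/*
 * Called by producers (for example from a completion handler).
 * Every registered handler sees every event; with no handler
 * registered, the event is silently dropped.
 */
void toy_input_event(enum toy_event_type type, int code, int value)
{
    int i;

    for (i = 0; i < toy_nhandlers; i++)
        toy_handlers[i]->event(toy_handlers[i], type, code, value);
}

/* Example consumer: counts relative-motion events, ignores the rest. */
void toy_motion_event(struct toy_handler *h, enum toy_event_type type,
                      int code, int value)
{
    (void)code;
    (void)value;
    if (type != TOY_EV_REL)
        return;   /* not interested in key events */
    h->events_seen++;
}
```

With only the motion handler registered, key events are decoded and delivered to the engine but nobody acts on them, which is the situation of a USB keyboard on a system where no keyboard handler has been loaded.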
The input handling machinery is very interesting, in my opinion, as it allows insertion of custom kernel modules in the event chain, both as producers and consumers of input events. A new event producer could push keyboard or mouse events down the system's chain even though it is not physically a keyboard, while a new event consumer can associate special actions (any action) with input events. That's a flexible tool for people who play with non-standard devices or need to implement non-standard behaviors in their kernels.
http://www.usb.org carries the USB specification and other interesting information about USB, including general software issues.
On a Linux system with kernel sources installed, the directory /usr/src/linux/Documentation/usb documents device-specific questions. The file "URB.txt" is particularly interesting as it thoroughly describes the URB data structure.
http://LinuxUSBGuide.sourceforge.net hosts the "linux-USB howto", with a lot of information on setting up and using USB under Linux.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved