Driving One's Own Audio Device
(September 1998)

In this article Alessandro shows the design and implementation of a custom audio device, paying particular attention to the software driver. The driver, as usual, is developed as a kernel module. Even though Linux-2.2 will be out by the time you read this, the software described here works only with Linux-2.0 and the first few dozen 2.1 versions.

by Alessandro Rubini

Figure 1: the actual device

I'm a strange guy, and I want my computers to keep silent (that's why I wrote the ``Visible-bell mini-howto'', where I suggest performing speakerectomy surgery). On the other hand, I just enjoy playing with the soldering iron to build irrelevant stuff. One of the most irrelevant things I ever conceived is recycling the computer's loudspeaker into a very-low-volume audio device. As you might imagine, the device plugs into the parallel port.

This article describes the driver for such a beastie, as it shows interesting details of the kernel's workings while being small enough to be an easy read for almost any audience. A quick description of the hardware is mandatory, but you can safely skip the first section and jump directly to the section called ``Writing Data''.

The software described here, as well as the electrical drawing, is released according to the GPL and is available as sad-1.1.tar.gz (Standalone Audio Device).

Part of this work has been sponsored by ``SAD Trasporto Locale'', the bus company of Bolzano (Bozen), Italy. They plan to bring my hardware on their buses and renamed the company to match my package :-)

Figure 2: the hardware schematics

The image is available as PostScript here

The Underlying Hardware

My device plugs into the parallel port; its schematics are depicted in figure 2, while figure 1 is a photograph of the only specimen ever built (Italian buses will run a different flavour of such stuff, the ``bus for bus'' -- now in ftp://ar.linux.it/pub/people/rubini/b4b-0.64.tar.gz).

I owe the basic idea to Michael Beck, author of the ``pcsndrv'' package; the idea sounds like ``use the parallel data bits to output audio samples''. My own addition spells ``use the interrupt signal to strobe samples at the right pace''. Audio samples must flow at 8kHz and any not-so-ancient computer can sustain such an interrupt rate: my almost-ancient development box runs a 33 BogoMips processor and is perfectly happy playing parallel audio. The interrupt-based approach buys higher quality at the price of more hardware complexity than Michael's package needs.

As shown in the schematics, the device is made up of a simple D/A converter built with a few resistors; the signal is then reduced to 1.5V peak-to-peak amplitude and fed to a low-pass filter. The filter I chose is a switched-capacitor device driven by a square wave at ten times the cutoff frequency. The 6142 chip is a dual op-amp with rail-to-rail output, one of several possible choices for low-power single-supply equipment.

The output signal can be brought to a small loudspeaker, and can be listened to only in complete silence; other environments ask for some form of amplification. My preferred alternative to the amplifier is the oscilloscope, the typical hear-by-seeing approach.

Writing Data

The main role of an audio driver is, usually, pushing data to the audio device. Several kinds of audio devices exist, and the sad driver only implements the /dev/audio flavour: 8-bit samples flowing at a rate of 8kHz. Each data byte that gets written to /dev/audio should be fed to an 8-bit D/A converter; every 125 microseconds a new data sample must replace the current one.

Timing issues should be managed by the driver, without intervention from the program that writes out the audio data. The tool that helps detach timing issues from user programs is the output buffer.

In sad, the output buffer is allocated at load time using get_free_pages. The function allocates a power-of-two number of consecutive pages; its order argument is the exponent, so an order of 1 asks for 2 pages and an order of 3 asks for 8 pages. The allocation order of the output buffer is defined by the macro OBUFFER_ORDER, which is 0 in the distributed source file. This accounts for 1 page, which on the x86 processor corresponds to 4kB: about half a second worth of data.
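The arithmetic behind these figures can be sketched in a few lines of user-space C. The helper names and the PAGE_BYTES constant are illustrative assumptions (4096-byte pages, as on the x86) and are not taken from sad.c:

```c
#define PAGE_BYTES   4096UL  /* assumed page size (x86) */
#define SAMPLE_RATE  8000UL  /* /dev/audio: one byte per sample, 8kHz */

/* Bytes in a buffer made of 2^order consecutive pages. */
unsigned long buffer_bytes(unsigned int order)
{
    return PAGE_BYTES << order;
}

/* Milliseconds of audio that fit in such a buffer. */
unsigned long buffer_millis(unsigned int order)
{
    return buffer_bytes(order) * 1000UL / SAMPLE_RATE;
}
```

With an order of 0 the buffer holds 4096 samples, that is 512 milliseconds of sound; each increment of the order doubles both figures.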

The output buffer of sad is a circular buffer; the pointers ohead and otail represent its starting and ending points. The whole kernel uses unsigned long values to represent physical addresses, and the same convention is used in sad:

static unsigned long obuffer = 0; 
static unsigned long volatile ohead, otail;

Note that the head and tail variables are declared as volatile to prevent the compiler from caching their values in processor registers. This is an important precaution, as the variables will be modified at interrupt time, asynchronously with the rest of the code.

We'll see later on that sad has an input buffer as well; the overall buffer allocation consists of these lines, executed from within init_module:

obuffer = __get_free_pages(GFP_KERNEL,
                OBUFFER_ORDER, 0 /* no dma */);
ohead = otail = obuffer;
ibuffer = __get_free_pages(GFP_KERNEL,
                IBUFFER_ORDER, 0 /* no dma */);
ihead = itail = ibuffer;

if (!ibuffer || !obuffer) { /* allocation failed */
    cleanup_module(); /* use your own function */
    return -ENOMEM;
}

Any data that a process writes to the device is put in the circular buffer as long as it fits. When the buffer is full, the writing process is put to sleep, waiting for some space to be freed.

Since the data samples flow out smoothly, the process will eventually be awoken to complete its write system call. Still, a good driver is prepared for users hitting the almighty control-C, and must deal with SIGINT and other signals.

The following lines are all that's needed to put the current process to sleep and wake it up again; all the magic is hidden in interruptible_sleep_on:

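while (OBUFFER_FREE < OBUFFER_THRESHOLD) {
    interruptible_sleep_on(&outq); /* wait-queue name assumed */
    if (current->signal & ~current->blocked) /* a signal arrived */
      return -ERESTARTSYS; /* tell the fs layer to handle it */
    /* else, loop */
}
/* the following code writes in the circular buffer */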

That's easy. But what are OBUFFER_FREE and OBUFFER_THRESHOLD, then? They are two macros: the former accesses ohead and otail to tell how much free space there is in the buffer; the latter is a simple constant, predefined to 1024, a pseudo-random number. The role of such a threshold is to preserve system resources by avoiding too frequent sleep-wake transitions.
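To make the free-space bookkeeping concrete, here is a minimal user-space sketch of a circular buffer. It is not the code of sad.c: it uses array indices instead of unsigned long addresses, all names are invented for the example, and it assumes the common convention of leaving one slot unused so that a full buffer can be told apart from an empty one.

```c
#include <stddef.h>

#define RING_SIZE 4096  /* one x86 page, like an order-0 buffer */

struct ring {
    unsigned char data[RING_SIZE];
    size_t head;  /* next slot to fill (the write method) */
    size_t tail;  /* next slot to play (the interrupt handler) */
};

/* Bytes still waiting to be played. */
size_t ring_used(const struct ring *r)
{
    return (r->head + RING_SIZE - r->tail) % RING_SIZE;
}

/* Free space; one slot stays unused so that full and empty differ. */
size_t ring_free(const struct ring *r)
{
    return RING_SIZE - 1 - ring_used(r);
}

/* Store up to len bytes and return how many were accepted. */
size_t ring_put(struct ring *r, const unsigned char *src, size_t len)
{
    size_t n = 0;

    while (n < len && ring_free(r) > 0) {
        r->data[r->head] = src[n++];
        r->head = (r->head + 1) % RING_SIZE;  /* wrap at the end */
    }
    return n;
}

/* Take one byte out; returns 0 when the buffer is empty. */
int ring_get(struct ring *r, unsigned char *out)
{
    if (ring_used(r) == 0)
        return 0;
    *out = r->data[r->tail];
    r->tail = (r->tail + 1) % RING_SIZE;
    return 1;
}
```

In this sketch ring_free plays the role that the article assigns to OBUFFER_FREE, and a writer would go to sleep whenever it returns less than the threshold.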

If the threshold were 1, the process would be awoken as soon as one byte of the buffer is freed, but it would soon be put to sleep again. The net result is that the process is always running, consuming processor power and raising the machine load. A threshold of 1k ensures that when the process goes to sleep it will sleep for at least one tenth of a second, because it won't be awoken before 1k bytes flow through the audio device. You can try to recompile sad.c with a different threshold value: you'll see how a small value keeps the processor busy, while too big a value can result in jumpy audio. The audio stream becomes jumpy because data continues to flow out, and the buffer can run dry, before the processor schedules execution of the process writing audio data. The more the computer is loaded, the more likely jumpy audio becomes; if several processes are contending for the processor, the one playing audio might be awoken too late. In addition to lowering the threshold, you can cure the problem by increasing the buffer size.
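The trade-off can be put into numbers with a small sketch. The two helpers below are hypothetical, not part of sad.c, and assume the 8kHz, one-byte-per-sample stream described above; the threshold must be nonzero.

```c
#define SAMPLE_RATE 8000UL  /* bytes per second on /dev/audio */

/* Shortest time, in milliseconds, that a sleeping writer waits
 * before `threshold` bytes have drained from the buffer. */
unsigned long min_sleep_ms(unsigned long threshold)
{
    return threshold * 1000UL / SAMPLE_RATE;
}

/* Rough upper bound on sleep/wake cycles per second of audio. */
unsigned long max_wakeups_per_sec(unsigned long threshold)
{
    return SAMPLE_RATE / threshold;
}
```

A threshold of 1024 gives a minimum sleep of 128 milliseconds and at most 7 wake-ups per second, while a threshold of 1 allows up to 8000 wake-ups per second.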

Naturally, the write device method is only half of the story; the other half is performed by the interrupt handler.

The Interrupt Handler

In sad, audio samples are strobed out by a hardware interrupt, which is reported to the processor every 125 microseconds. Each interrupt is serviced by an ISR (Interrupt Service Routine, also called ``interrupt handler''), written in C. I won't go into the details of registering interrupt handlers here, as they have already been described in other Kernel Korner issues.

Managing several thousand interrupts per second is a non-negligible load for the processor (at least for slow processors like mine), so the driver only enables interrupt reporting when the device is opened, and disables it on the last close.

What I'd like to show here is how data flows to the D/A converter. The code is quite easy, and the OBUFFER_THRESHOLD constant appears again, as expected:

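if (!OBUFFER_EMPTY) { /* send out one byte */
    OUTBYTE(*((u8 *)otail++));
    if (otail == obuffer + OBUFFER_SIZE) /* wrap */
        otail = obuffer;
    if (OBUFFER_FREE > OBUFFER_THRESHOLD)
        wake_up_interruptible(&outq); /* wait-queue name assumed */
    return;
}
wake_up_interruptible(&closeq); /* buffer empty */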

As usual, every code snippet introduces new questions; this time you might wonder what OUTBYTE and closeq are. The queue is the main topic of the next section, while OUTBYTE hides the line of code that pushes a data sample to the D/A converter. The macro is defined earlier in sad.c as follows:

#define OUTBYTE(b)  outb(convert(b),sad_base)

Here, sad_base is the processor port used to send data to the parallel interface (usually 0x378), and convert is a simple mathematical conversion that turns the data byte as stored in the audio file format into a linear 0-255 value, better suited to the D/A converter.
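As an illustration of what such a conversion can look like, the sketch below assumes that the bytes written to /dev/audio are mu-law encoded (the usual Sun convention for that device, which the text above does not state explicitly) and decodes them with the standard G.711 expansion. The function names are invented for the example; the actual convert in sad.c may be implemented differently, for instance as a lookup table.

```c
#include <stdint.h>

/* Decode one G.711 mu-law byte to a signed 16-bit linear sample. */
int16_t ulaw_to_linear(uint8_t u)
{
    int sign, exponent, mantissa, sample;

    u = (uint8_t)~u;              /* mu-law bytes are stored inverted */
    sign     = u & 0x80;
    exponent = (u >> 4) & 0x07;
    mantissa = u & 0x0F;
    sample   = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    return (int16_t)(sign ? -sample : sample);
}

/* Map the signed sample onto the 0..255 range of an 8-bit D/A. */
uint8_t ulaw_to_dac(uint8_t u)
{
    return (uint8_t)((ulaw_to_linear(u) + 32768) >> 8);
}
```

Silence (mu-law 0xFF or 0x7F) lands on the mid-scale value 128, while the two extreme codes 0x80 and 0x00 map to 253 and 2 respectively.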

Blocking Close

The close system call, like read and write, is one of those calls that can block. For example, when you are done with the floppy drive, close blocks waiting for any data to be flushed to the actual device. This behaviour can be verified by running ``strace cp /boot/vmlinux /dev/fd0''.

Audio devices are somewhat similar to the floppy drive: a program writing audio data closes the file just after the last write system call. However, this only means that data has been transferred to the output buffer, not that everything has already flowed to the loudspeaker. Blocking on close can be helpful in this context; this way you can ``cat file.au > /dev/sad && echo done''. On the other hand, sometimes you'll prefer to just stop playing sounds when the process closes the device; for example, if you play the piano on your keyboard, the sound should stop as soon as you raise the key, even if the program has already pushed extra data to the output buffer.

For this reason, the sad module implements two device entry points, one that blocks on close and one that doesn't block. Minor number 0 is the blocking device and minor number 1 is the non-blocking one. The entry points in /dev are created by the script that loads the module, included in the sad distribution.

While real device drivers often offer configuration options (like the behaviour on close) through the ioctl system call, I chose to offer different entry points in /dev because this way I can use normal shell redirection to perform my tasks, without the need to write C code that performs the relevant ioctl calls. The close method in sad.c, therefore, looks like the following:

if (MINOR(inode->i_rdev)==0) /* wait */
    interruptible_sleep_on(&closeq);
else {
    unsigned long flags; /* drop data */
    save_flags(flags);
    cli(); ohead=otail;
    restore_flags(flags);
}

MOD_DEC_USE_COUNT;
if (!MOD_IN_USE)
    SAD_IRQOFF(); /* disable irq */

Actually, there is a third possibility as far as close is concerned: go on playing in the background as long as some data is there, even after the program closes the audio device. This approach is left as an exercise to the reader, because I prefer having a chance to actively stop anything that makes noise.

Reading Data

A device, usually, can be read from as well as written to. Reading a /dev/audio usually returns digitized data from a microphone, but I haven't been asked to provide this feature, and I've no real interest in hearing my voice.

When I built my first alpha of the physical device, I found the need to time the interrupt rate, to make sure it was close enough to the expected 8kHz (in the alpha version, actually, I used a variable resistor to fine-tune the frequency and I needed some way to check how it went). The easiest solution that came to my mind was using the clock of the host computer to measure the time lapses.

To this aim, I modified the interrupt handler so that it would write timestamps to an input buffer whenever the device was being read from. The input buffer is a circular buffer just like the output buffer described above.

The previous excerpt from sad_interrupt showed that after writing an audio sample the function returns to the caller. Any additional lines, therefore, are only executed if no audio is there; the rest of the ISR has thus been devoted to collecting timing information. This shows how I implemented ``if there is no pending output deal with input'' rather than the more correct ``if someone is reading give it some data''. This is acceptable as long as the device is not meant to be read from and written to at the same time in a production environment.

static struct timeval tv, tv_last;
unsigned long diff;

do_gettimeofday(&tv); /* current time, in microseconds */
diff = (tv.tv_sec  - tv_last.tv_sec) * 1000000 +
       (tv.tv_usec - tv_last.tv_usec);
tv_last = tv;

/* Write 16 bytes, assume bufsize is a multiple of 16 */
ihead += sprintf((char *)ihead,"%15u\n", (unsigned int)diff);
if (ihead == ibuffer + IBUFFER_SIZE)
    ihead = ibuffer; /* wrap */
wake_up_interruptible(&inq); /* anyone reading? */

Printing the time difference between two samples has two advantages over printing the absolute time: data is directly meaningful to humans without resorting to external filters, and any overflow of the input buffer has no effect on the perceived results other than the loss of a few samples.
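The same record format can be reproduced in user space. The following sketch uses invented helper names and the standard struct timeval; it computes the interval in microseconds and formats it as a fixed 16-byte text record, which is why a buffer whose size is a multiple of 16 never has a record straddling the wrap point.

```c
#include <stdio.h>
#include <sys/time.h>

#define RECORD_LEN 16  /* 15 characters of padded digits plus newline */

/* Microseconds elapsed from `last` to `now` (now must not be earlier). */
unsigned long usec_between(const struct timeval *last,
                           const struct timeval *now)
{
    long sec  = (long)(now->tv_sec - last->tv_sec);
    long usec = (long)(now->tv_usec - last->tv_usec);

    return (unsigned long)(sec * 1000000L + usec);
}

/* Format one interval as a fixed-size, newline-terminated record.
 * `out` must hold RECORD_LEN + 1 bytes; returns characters written. */
int format_record(char *out, unsigned long diff_usec)
{
    return snprintf(out, RECORD_LEN + 1, "%15lu\n", diff_usec);
}
```

Two timestamps taken at 1.999900 and 2.000025 seconds yield 125 microseconds, the nominal interval at 8kHz, printed right-aligned in a 15-character field.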

Real tests show that the reported interrupt rate is not as steady as one would hope. Some system activities require interrupt reporting to be disabled, and this introduces some delay in the execution of the ISR. An oscillation of a few microseconds is however perfectly acceptable and is not perceived in the resulting audio, which is not Hi-Fi anyway.

It's interesting to note that disk activity can introduce some real distortion in the audio stream, as servicing an IDE interrupt can take as long as 2 milliseconds (on my system). The IDE driver disables interrupt reporting while its own ISR is active, and the huge delay results in 8 lost interrupts from the parallel port, which in turn causes a distortion of the audio data stream.

If you read from sad during disk activity you'll see the long time intervals, while writing to the device produces very bad audio. The easy solution to this problem is invoking ``/sbin/hdparm -u 1 /dev/hda'' before playing any audio. The command instructs the disk driver not to disable the reporting of other interrupts while it is servicing its own. Refer to the hdparm documentation to probe further.

Other Device Methods

The device driver interface offers other device methods in addition to the open/close and read/write pairs. While none of them is critical to device operation, I usually spend a few lines of code to implement select and lseek. The former is needed by those programs that multiplex several input/output channels or use non-blocking operations to read and write data; its role is rather important if you run real programs and the implementation is straightforward enough that I won't show it here. The implementation of lseek, on the other hand, consists of the one line: ``return -ESPIPE;'', and is meant to tell any program that tries to lseek the device that this ``is a pipe'' (reported to user space as ``Illegal seek'').
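What a program sees when a driver refuses to seek can be checked from user space with an ordinary pipe, which behaves the same way. The helper below is a small sketch with an invented name; it only relies on the standard lseek call and errno.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to seek on fd; return 0 on success or the errno value on failure. */
int seek_error(int fd)
{
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return errno;
    return 0;
}
```

Called on either end of a pipe, the function returns ESPIPE, the same error code that a device whose lseek method returns -ESPIPE delivers to user space.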

Related Stuff

My aversion towards computer sound makes me a real illiterate in the field, and I really don't know anything about programs that play audio, or sites where audio files can be retrieved from. Although Linus Torvalds offered an interesting ``I pronounce Linux as Linux'', the file was not enough to test my device, and I needed to generate some audio data. The net result of my ignorance is that the sad distribution includes a program that plays sinusoidal waves, one that plays square waves and a not-so-good piano implementation. These tools work with any /dev/audio that you happen to run, and can be fun to play with, especially if you have a scope near your Linux box.
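For readers who want to generate a test tone of their own, here is a minimal square-wave sketch. It is not the program shipped with sad: the function name is invented, and it assumes that the target device expects mu-law bytes at 8kHz, so the wave simply alternates between two mu-law codes of opposite sign.

```c
#include <stddef.h>

#define SAMPLE_RATE 8000UL
#define ULAW_HIGH   0x9F  /* mu-law code for a moderate positive level */
#define ULAW_LOW    0x1F  /* mu-law code for the same level, negative */

/* Fill buf with n samples of a square wave at freq_hz. */
void square_wave(unsigned char *buf, size_t n, unsigned long freq_hz)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* index of the half-period that sample i falls in */
        unsigned long half = (unsigned long)i * 2UL * freq_hz / SAMPLE_RATE;

        buf[i] = (half % 2 == 0) ? ULAW_HIGH : ULAW_LOW;
    }
}
```

At 1000Hz each half-period lasts four samples, so the buffer starts with four high bytes followed by four low ones; writing the buffer to the audio device should produce a steady tone.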

Alessandro tries to develop open-source software/hardware for a living, and that's why he and other hackers founded ``Prosa Srl''. He can be reached as rubini@prosa.it, in addition to the usual addresses @linux.it, and @gnu.org.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved

Reprinted with permission of Linux Journal