NetBSD Documentation: Kernel Programming FAQ



What is KNF?

KNF stands for "Kernel Normal Form". It is the C coding style documented in /usr/share/misc/style, which is included in the source tree as src/share/misc/style.

Using the `packed' attribute

Always use the `packed' attribute in structures which describe wire protocol data formats.
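
As an illustration, here is a minimal sketch of a packed wire header next to an unpacked copy of the same members. The structure names and fields are hypothetical, and the GCC/Clang spelling __attribute__((__packed__)) is used so the fragment compiles outside the kernel tree; kernel sources normally use the __packed macro from <sys/cdefs.h> instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical on-the-wire header: 1-byte type, 4-byte length,
 * 2-byte checksum, 7 bytes in total with no gaps.
 */
struct foo_wire_hdr {
	uint8_t		fh_type;
	uint32_t	fh_len;
	uint16_t	fh_cksum;
} __attribute__((__packed__));

/* The same members without the attribute; the compiler may add padding. */
struct foo_loose_hdr {
	uint8_t		fh_type;
	uint32_t	fh_len;
	uint16_t	fh_cksum;
};

/* Fail the build if the layout ever stops matching the wire format. */
static_assert(sizeof(struct foo_wire_hdr) == 7,
    "foo_wire_hdr must be 7 bytes");
static_assert(offsetof(struct foo_wire_hdr, fh_len) == 1,
    "fh_len must directly follow fh_type");
```

Packing only removes padding. Multi-byte fields still need explicit byte order conversion before they go on the wire.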

Using printf() for debugging

Probably the simplest way of generating debugging information from a kernel driver is to use printf(). The kernel printf will send output to the console, so beware of generating too much output and making the system unusable.
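
One common way to keep the output under control is to gate messages behind a verbosity level. The following is a userland sketch of that idea, not code from the tree: foo_debug and foo_dprintf() are hypothetical names, and the stdio vprintf() stands in for the kernel's own printf family.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical per-driver verbosity knob; 0 keeps the driver silent. */
static int foo_debug = 0;

/*
 * Print only when the message level is enabled.
 * Returns 1 if the message was printed and 0 if it was suppressed.
 */
static int
foo_dprintf(int level, const char *fmt, ...)
{
	va_list ap;

	if (level > foo_debug)
		return 0;
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
	return 1;
}
```

With foo_debug left at 0 nothing reaches the console; raising it while chasing a bug enables progressively noisier messages.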

Forcing code to enter DDB

Ensure your kernel config file contains 'options DDB' and your source file has '#include "opt_ddb.h"', then call 'Debugger()' at the point where the code should enter the debugger.

Adding a new driver to the kernel

Every driver needs at least:

  • xxxprobe(), during which NetBSD attempts to determine whether the device is present.
  • xxxattach(), which configures and attaches the device.

Once probe and attach routines have been written, add an entry to /usr/src/sys/arch/<your-arch>/<your-arch>/conf.c.

There are two tables:

  • cdevsw for character devices.
  • bdevsw for block devices (for those that also perform "block" I/O and use a strategy routine).

Most entries will be of the form cdev_xxx_init(), which is a macro handling prototyping of the standard Unix device switch routines.

The probe/attach routines are called at boot time. The open(), close(), read(), and write() routines are called when you open the device special file whose major number corresponds to the index into that table. For example, if you open a device whose major number is 18, the "open" routine for device number 18 in cdevsw[]/bdevsw[] will be called.
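
The dispatch by major number can be modelled in a few lines of userland C. This is a simplified sketch: model_cdevsw, model_open() and foo_open() are hypothetical stand-ins, and the real switch structures carry many more entry points.

```c
#include <stddef.h>

/* Simplified stand-in for a device switch entry. */
struct model_cdevsw {
	int (*d_open)(int minor);
};

static int
null_open(int minor)
{
	(void)minor;
	return -1;		/* no driver configured at this major */
}

static int
foo_open(int minor)
{
	return 100 + minor;	/* arbitrary value identifying the driver */
}

#define MODEL_NMAJOR	32

/* Table indexed by major number; slot 18 belongs to the foo driver. */
static struct model_cdevsw model_cdevsw[MODEL_NMAJOR] = {
	[18] = { foo_open },
};

/* Dispatch an open to whichever driver owns the major number. */
static int
model_open(int major, int minor)
{
	if (major < 0 || major >= MODEL_NMAJOR ||
	    model_cdevsw[major].d_open == NULL)
		return null_open(minor);
	return model_cdevsw[major].d_open(minor);
}
```

Opening a node with major 18 reaches foo_open(); any other major falls through to the placeholder routine.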

Most drivers are split between bus specific attach code, and a machine independent core. As an example, the driver for the PCI lance ethernet chip has entries in the following files:

See also the autoconf explanation.

How does all this autoconf stuff work?

The autoconf machinery is quite simple once you figure out the way it works. If you want to ignore the exact details of how the device probe tree is built and walked at runtime, the bits needed for each individual leaf driver are as follows:

  1. each driver specifies a structure holding three things: the size of its private structure, the probe function, and the attach function. This is compiled in and used at runtime. Example:
    struct cfattach foo_baz_ca = {
        sizeof(struct foo_baz_softc), foo_baz_match, foo_baz_attach
    };
  2. on kernel startup, once the time comes to attach the device, the autoconf code calls the device's probe routine and passes it a pointer to the parent (struct device *parent), a pointer to the attach tag structure (void *aux), and the appropriate autoconf node (struct cfdata *cf). The driver is expected to find out whether the device is where it is supposed to be (commonly, the location and configuration information is passed in the attach tag). If the device is there, the probe routine should return 1; if it is not, the probe routine has to return 0. NO STATE SHOULD BE KEPT in either case.
  3. if the probe returned success, autoconf allocates a chunk of memory of the size specified in the device's *_ca and calls its attach routine, passing it a pointer to the parent (struct device *parent), a pointer to the freshly allocated memory (struct device *self), and the attach tag (void *aux). The driver is expected to find out the exact ports and memory, allocate resources, and initialize its internal structure accordingly. Preferably, all information specific to a driver instance should be kept in the allocated memory.

Example: Let's have a PCI ethernet device 'baz', kernel config chunk looks like this:

pci*    at mainbus?
baz*    at pci? dev ? function ?

At runtime, autoconf iterates over all physical devices present on the machine's PCI bus. For each physical device, it iterates over all drivers registered in the kernel as attaching to the pci bus and calls each driver's probe routine. If any probe routine claims the device by returning 1, autoconf stops iterating and does the job described under 3). Once the attach function returns, autoconf continues with the next physical device.
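
The probe-then-attach sequence can be modelled in userland as follows. This is a rough sketch under stated assumptions, not kernel code: model_cfattach and model_pci_args are simplified stand-ins, the vendor ID 0x1234 is made up, and the real match and attach routines also receive the parent device and the cfdata node.

```c
#include <stdlib.h>

/* Simplified stand-in for struct cfattach. */
struct model_cfattach {
	size_t	ca_devsize;
	int	(*ca_match)(const void *aux);
	void	(*ca_attach)(void *self, const void *aux);
};

/* Hypothetical attach tag: what the bus reports about one slot. */
struct model_pci_args {
	unsigned vendor, product;
};

struct baz_softc {
	unsigned sc_product;
	int	 sc_attached;
};

static int
baz_match(const void *aux)
{
	const struct model_pci_args *pa = aux;

	/* Claim only the made-up vendor ID; keep no state here. */
	return pa->vendor == 0x1234;
}

static void
baz_attach(void *self, const void *aux)
{
	struct baz_softc *sc = self;
	const struct model_pci_args *pa = aux;

	/* All per-instance state lives in the allocated softc. */
	sc->sc_product = pa->product;
	sc->sc_attached = 1;
}

static const struct model_cfattach baz_ca = {
	sizeof(struct baz_softc), baz_match, baz_attach
};

static const struct model_cfattach *model_drivers[] = { &baz_ca };

/*
 * Try every registered driver on one physical device. The first driver
 * whose match returns nonzero gets a zeroed softc and its attach called.
 * Returns the softc, or NULL if no driver claimed the device.
 */
static void *
model_probe_one(const struct model_pci_args *pa)
{
	size_t i;

	for (i = 0; i < sizeof(model_drivers) / sizeof(model_drivers[0]); i++) {
		const struct model_cfattach *ca = model_drivers[i];
		void *self;

		if (!ca->ca_match(pa))
			continue;
		self = calloc(1, ca->ca_devsize);
		if (self != NULL)
			ca->ca_attach(self, pa);
		return self;
	}
	return NULL;
}
```

Calling model_probe_one() once per physical device mirrors the outer loop described above; a device that no driver claims simply stays unconfigured.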

See also Adding a new driver.

Adding a system call

Add an entry in syscalls.master, and add the syscall stub to the appropriate place in src/lib/libc/sys/.

See the HOWTO and related documentation in the NetBSD Internals Guide for more information.

Adding a sysctl

See a posting answering this question on tech-kern.

Note that NetBSD 1.6 and later have a special vendor sysctl category that is reserved for vendor-specific entries. See sysctl(8) for more information.

How to implement mmap(2) in a pseudo-device

Your device is most likely a character device, so you will be using the device pager (the VM system hides all of this from you, don't worry).

The first thing you need to do is pick some arbitrary offsets for your mmap interface. Something like "mmap offset 0-M gives object A, N-O gives object B", etc.

After that, your mmap routine would look something like this:

int
foommap(dev_t dev, int off, int prot)
{

        if (off & PAGE_MASK)
                panic("foommap: offset not page aligned");

        if ((u_int)off >= FOO_REGION1_MMAP_OFFSET &&
            (u_int)off < (FOO_REGION1_MMAP_OFFSET + FOO_REGION1_SIZE))
                return (atop(FOO_REGION1_ADDR + ((u_int)off -
                    FOO_REGION1_MMAP_OFFSET)));

        if ((u_int)off >= FOO_REGION2_MMAP_OFFSET &&
            (u_int)off < (FOO_REGION2_MMAP_OFFSET + FOO_REGION2_SIZE))
                return (atop(FOO_REGION2_ADDR + ((u_int)off -
                    FOO_REGION2_MMAP_OFFSET)));

        /* Page not found. */
        return (-1);
}

Now, this is slightly more complicated by the fact that you are going to be mmap'ing what are simply kernel memory objects (it is a pseudo-device after all).

In order to make this work, you're going to want to make sure you allocate the memory objects to be mmap'd on page-aligned boundaries. If you are allocating something >= PAGE_SIZE in size, this is guaranteed. Otherwise, you are going to have to use uvm_km_alloc(), and round your allocation size up to page size.
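
The rounding step can be sketched as below. The 4096-byte page size and the foo_ names are assumptions made so the fragment stands alone; in the kernel, use PAGE_SIZE and the existing round_page() macro rather than a private copy.

```c
#include <stddef.h>

/* Assumed page size for this sketch; the kernel provides PAGE_SIZE. */
#define FOO_PAGE_SIZE	((size_t)4096)
#define FOO_PAGE_MASK	(FOO_PAGE_SIZE - 1)

/*
 * Round a byte count up to the next multiple of the page size.
 * The mask trick relies on the page size being a power of two.
 */
static size_t
foo_round_page(size_t size)
{
	return (size + FOO_PAGE_MASK) & ~FOO_PAGE_MASK;
}
```

A 100-byte object therefore occupies one full page, and a size that is already a multiple of the page size is left unchanged.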

With the objects allocated on page boundaries, the mmap routine would look a bit more like this:

int
foommap(dev_t dev, int off, int prot)
{
        paddr_t pa;

        if (off & PAGE_MASK)
                panic("foommap: offset not page aligned");

        if ((u_int)off >= FOO_REGION1_MMAP_OFFSET &&
            (u_int)off < (FOO_REGION1_MMAP_OFFSET + FOO_REGION1_SIZE)) {
                if ((vaddr_t)foo_object1 & PAGE_MASK)
                        panic("foommap: foo_object1 not page aligned");
                /* Translate the kernel virtual address to a physical one. */
                if (pmap_extract(pmap_kernel(), (vaddr_t)foo_object1 +
                    (u_int)off - FOO_REGION1_MMAP_OFFSET, &pa) == FALSE)
                        panic("foommap: foo_object1 page not mapped");
                return (atop(pa));
        }

        if ((u_int)off >= FOO_REGION2_MMAP_OFFSET &&
            (u_int)off < (FOO_REGION2_MMAP_OFFSET + FOO_REGION2_SIZE)) {
                if ((vaddr_t)foo_object2 & PAGE_MASK)
                        panic("foommap: foo_object2 not page aligned");
                if (pmap_extract(pmap_kernel(), (vaddr_t)foo_object2 +
                    (u_int)off - FOO_REGION2_MMAP_OFFSET, &pa) == FALSE)
                        panic("foommap: foo_object2 page not mapped");
                return (atop(pa));
        }

        /* Page not found. */
        return (-1);
}
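
For completeness, here is a sketch of the userland side, which passes one of the chosen offsets to mmap(2). The offset and size values are placeholders that would have to match whatever the driver defines, and foo_map_region1() is a hypothetical helper.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Placeholder values; they must match the offsets the driver picked. */
#define FOO_REGION1_MMAP_OFFSET	((off_t)0x0000)
#define FOO_REGION1_SIZE	((size_t)0x1000)

/*
 * Map region 1 of an already opened foo device read-only.
 * Returns the mapping, or NULL if mmap(2) fails.
 */
static void *
foo_map_region1(int fd)
{
	void *p;

	p = mmap(NULL, FOO_REGION1_SIZE, PROT_READ, MAP_SHARED, fd,
	    FOO_REGION1_MMAP_OFFSET);
	return p == MAP_FAILED ? NULL : p;
}
```

The caller would open the device special file first and pass the resulting descriptor in; an invalid descriptor makes the helper return NULL.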

Accessing a kernel structure from userland

The canonical example for this is src/usr.bin/vmstat/dkstats.c, which reads disk statistics.

Is there a simple PCI driver I can use as an example?

You can look at sys/dev/pci/puc.c, which is one of the simplest drivers. PUCs are devices with one or more serial or parallel ports on them, usually using standard chips (e.g. 16550 UART for serial). This driver just locates the I/O addresses of the registers of the serial or parallel controller and passes them to the serial or parallel driver.
