Device Drivers Concluded

This is the last of five articles about character device drivers. In this final section, Georg deals with memory mapping devices, beginning with an overall description of Linux memory management concepts.
VMA and other Cyberspaces

The person who has to take care of this beautiful stuff is your poor device driver writer. While support for mmap() on files is handled by the kernel (by each file system type, in fact), the mapping method for devices has to be supported directly by the drivers, by providing a suitable entry in the fops structure, which we first introduced in the March issue of LJ.
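As a quick reminder, the driver announces its mapping method through that structure. Here is a minimal sketch, assuming the positional field layout of the 2.0-era kernels this series targets and a hypothetical handler named skel_mmap; later kernels change both the field list and the mmap prototype, so treat this as an illustration rather than a reference:

static int skel_mmap(struct inode *inode, struct file *file,
                     struct vm_area_struct *vma);

static struct file_operations skel_fops = {
    NULL,       /* lseek   */
    NULL,       /* read    */
    NULL,       /* write   */
    NULL,       /* readdir */
    NULL,       /* select  */
    NULL,       /* ioctl   */
    skel_mmap,  /* mmap -- our mapping method */
    NULL,       /* open    */
    NULL        /* release */
};

The kernel invokes this method on the driver's behalf when a user mmaps the device, along the path traced below.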

First, we take a look at one of the few “real” implementations of such support, basing the discussion on the /dev/mem driver. Then we move on to a particular implementation useful for frame grabbers, lab devices with DMA support and probably other peripherals.

To begin with, whenever the user calls mmap(), the call reaches do_mmap(), defined in the file mm/mmap.c. do_mmap() does two important things (a user-space sketch of such an mmap() call follows the list):

  • It checks the read and write permissions on the file handle against what was requested of mmap(). Moreover, tests for crossing the 4GB limit on Intel machines and other knock-out criteria are performed.

  • If those tests pass, a struct vm_area_struct variable is generated for the new piece of virtual memory. Each task can own several of these structures, called “virtual memory areas” (VMAs).
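To make this path concrete, here is a minimal user-space sketch of the call that sets the machinery in motion. It maps one page of /dev/mem at physical offset 0xb8000, the PC's VGA text buffer; the size and offset are merely illustrative:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    int fd = open("/dev/mem", O_RDWR);
    if (fd < 0) {
        perror("open /dev/mem");
        exit(1);
    }

    /* One shared, readable and writable page, taken from
     * file offset 0xb8000: the VGA text buffer on a PC.   */
    char *screen = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0xb8000);
    if (screen == MAP_FAILED) {
        perror("mmap");
        exit(1);
    }

    screen[0] = 'X';    /* poke a character into video memory */

    munmap(screen, 4096);
    close(fd);
    return 0;
}

A mapping made this way is exactly the rw-s /dev/mem entry that shows up in the /proc listing further down.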

VMAs require some explanation: they represent the addresses, methods, permissions and flags of portions of the user address space. Your mmapped region keeps its own vm_area_struct entry in the task's list of areas. VMA structures are maintained by the kernel and ordered in balanced (AVL) trees to achieve fast access.
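For orientation, here is an abridged sketch of that structure; the field selection follows the 2.0-era kernels this series describes, so consult your own linux/mm.h for the authoritative layout:

struct vm_area_struct {
    struct mm_struct *vm_mm;     /* the address space we belong to   */
    unsigned long vm_start;      /* first address of the area        */
    unsigned long vm_end;        /* first address beyond the area    */
    pgprot_t vm_page_prot;       /* page protection within the area  */
    unsigned short vm_flags;     /* VM_READ, VM_WRITE, VM_SHARED ... */
    /* ... AVL-tree linkage omitted ... */
    struct vm_operations_struct *vm_ops;  /* methods for this area   */
    unsigned long vm_offset;     /* offset into the mapped object    */
    struct inode *vm_inode;      /* mapped file; NULL if anonymous   */
};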

The fields of a VMA are defined in linux/mm.h. Their number and contents can be explored by looking at /proc/pid/maps for any running process, where pid is the process ID of the process in question. Let's do so for our small nasty program, compiled with gcc-ELF. While the program runs, your /proc/pid/maps table will look somewhat like this (without the comments):

# /dev/sdb2: nasty css
08000000-08001000 rwxp 00000000 08:12 36890
# /dev/sdb2: nasty dss
08001000-08002000 rw-p 00000000 08:12 36890
# bss for nasty
08002000-08008000 rwxp 00000000 00:00 0
# /dev/sda2: /lib/ld-linux.so.1.7.3 css
40000000-40005000 r-xp 00000000 08:02 38908
# /dev/sda2: /lib/ld-linux.so.1.7.3 dss
40005000-40006000 rw-p 00004000 08:02 38908
# bss for ld-linux.so
40006000-40007000 rw-p 00000000 00:00 0
# /dev/sda2: /lib/libc.so.5.2.18 css
40009000-4007f000 rwxp 00000000 08:02 38778
# /dev/sda2: /lib/libc.so.5.2.18 dss
4007f000-40084000 rw-p 00075000 08:02 38778
# bss for libc.so
40084000-400b6000 rw-p 00000000 00:00 0
# /dev/sda2: /dev/mem (our mmap)
400b6000-400c6000 rw-s 000b8000 08:02 32767
# the user stack
bfffe000-c0000000 rwxp fffff000 00:00 0

The first two fields on each line, separated by a dash, give the address range the data is mapped to. The next field shows the permissions on those pages (r for read, w for write, x for execute, p for private and s for shared). The offset into the mapped file comes next, followed by the device and the inode number of the file. The device number identifies a mounted (hard) disk (e.g., 03:01 is /dev/hda1, 08:01 is /dev/sda1). The easiest (though slow) way to figure out the file name belonging to a given inode number is:

cd /mount/point
find . -inum inode-number -print

If you try to understand the lines and their comments, please notice that Linux separates data into “code storage segments” (css), sometimes called “text” segments; “data storage segments” (dss), containing initialized data structures; and “block storage segments” (bss), areas for variables that are allocated at execution time and initialized to zero. As no initial values for the variables in the bss have to be loaded from disk, the bss items in the list show no file device (a major number of 0 means NODEV). This shows another use of mmap: you can pass the MAP_ANONYMOUS flag (no file descriptor is needed) to request portions of zero-filled memory for your program. In fact, some versions of malloc get their memory this way, as sketched below.
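Here is a hedged sketch of such an anonymous mapping, the kind an mmap-based malloc might perform; get_zeroed_block is a made-up helper name:

#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel for len bytes of zero-filled memory with no
 * backing file.  With MAP_ANONYMOUS the fd argument is ignored;
 * -1 is passed by convention. */
void *get_zeroed_block(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}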


