Device Drivers Concluded
Though only a few drivers implement the memory mapping technique, it gives an interesting insight into the Linux system. I introduce memory management and its features, enabling us to play with the console, include memory mapping in drivers, and crash systems...
Since the days of the 80386, the Intel world has supported a technique called virtual addressing. Coming from the Z80 and 68000 world, my first thought about this was: “You can allocate more memory than you have as physical RAM, as some addresses will be associated with portions of your hard disk”.
To be more academic: Every address used by the program to access memory (no matter whether data or program code) will be translated--either into a physical address in the physical RAM or an exception, which is dealt with by the OS in order to give you the memory you required. Sometimes, however, the access to that location in virtual memory reveals that the program is out of order—in this case, the OS should cause a “real” exception (usually SIGSEGV, signal 11).
The smallest unit of address translation is the page, which is 4 kB on Intel architectures and 8 kB on Alpha (defined in asm/page.h).
When trying to understand the process of address resolution, you will enter a whole zoo of page table descriptors, segment descriptors, page table flags and different address spaces. For now, let's just say the linear address is what the program uses; using a segment descriptor, it is turned into a logical address, which is then resolved to a physical address (or a fault) using the paging mechanism. The Linux Kernel Hacker's Guide spends 20 pages on a rather short description of all these beasties, and I see no chance of making a more succinct explanation.
For any understanding of the building, administration, and scope of pages when using Linux, and how the underlying technique—especially of the Intel family—works, you have to read the Linux Kernel Hacker's Guide. It is freely available by ftp from tsx-11.mit.edu in the /pub/linux/docs/LDP/ directory. Though the book is slightly old [that's a gentle understatement—ED], nothing has changed in the internals of the i386, and other processors looks similar (in particular, the Pentium is exactly like a 386).
If you want to learn about page management, either you start reading the nice guide now, or you believe this short and abstract overview:
Every process has a virtual address space implemented by several CPU registers which are changed during context switches (this is the zoo of selectors and page description pointers). By these registers, the CPU accesses all the memory segments it needs.
Multiple levels of translation tables are used to translate the linear address given by the process to the physical address in RAM. The translation tables all reside in memory. They are automatically looked up by the CPU hardware, but they are built and updated by the OS. They are called page descriptor tables. In these tables there is one entry (i.e., a “page descriptor”) for every page in the process's address space—we're talking of the logical addresses, also called virtual addresses.
We concentrate now on a few main aspects of pages as seen by the CPU:
A page may be “present” or not—depending on whether it is present in physical memory or not (if it has been swapped-out, or it is a page which has not yet been loaded). A flag in the page descriptor is used to indicate this status. Access to a non-present page is called a “major” page fault. The fault is handled in Linux by the function do_no_page(), in mm/memory.c. Linux counts page faults for each process in the field maj_flt in struct task_struct.
A page may be write-protected—any attempt to write on the page will cause a fault (called “minor page fault”, handled in do_wp_page() and counted in the min_flt field of struct task_struct).
A page belongs to the address space of one task or several of them; each such task holds a descriptor for the page. “Task” is what microprocessor technicians call a process.
Other important features of pages, as seen by the OS, are:
If multiple processes use the same page of physical memory, they are said to “share” it. The processes hold separate page descriptors for shared page, and the entries can differ—for example, one process can have write permission on the page and another process may not.
A page may be marked as copy on write (grep for COW in kernel sources). If, for example, a process forks, the child will share the data segments with the parent, but both will be write-protected: the pages are shared for reading. As soon as one program writes onto a page, the page is doubled and the writing program gets a new page; the other keeps its old one, with a decremented “share count”. If the share count is already one, no copy is performed when a minor fault happens, and the page is just marked as writable. Copy-on-write minimizes memory usage.
A page may be locked against being swapped out. All kernel modules and the kernel itself reside in locked pages. As you might remember from the last installment, pages which are used for DMA-transfers have to be protected against being swapped out.
Page descriptors may also point to addresses not located in physical RAM, but rather the ROM of certain peripherals, RAM buffers for video cards etc., or PCI buffers. Traditionally, on Intel architectures, the range for the first two groups is from 640 kB to 1024 kB, and the range for the PCI buffers is above high_memory (the top of physical RAM, defined in asm/pgtable.h). The range from 640 KB to 1024 kB not used by Linux, and is tagged as reserved in the mem_map structure. They are the “384k reserved”, appearing in the first kernel message after BogoMips calculation.
Virtual memory allows quite beautiful things like:
Demand-loading a program instead of loading it totally into memory at startup: whenever you start a program, it gets its own virtual address space, which is just associated with some blocks on your filesystem and some space for variables, but the memory is allocated and loading is performed only when you really access the different portions of the program.
Swapping, in case your memory gets tight. This means whenever Linux needs memory for itself or a program and unused memory gets tight, it will try to shrink the buffers for the file systems, try to “forget” pages already allocated for program code that is executed (they can be reloaded from disk at any time anyway), or swap some pages containing user data to the swap partition of the hard disk.
Memory Protection. Each process has its own address space and can't look at memory belonging to other processes.
Memory Mapping: Just declare a portion or the whole of a file you have opened as a part of your memory, by means of a simple function call.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Keeping track of IP address
59 min 42 sec ago - Roll your own dynamic dns
6 hours 13 min ago - Please correct the URL for Salt Stack's web site
9 hours 24 min ago - Android is Linux -- why no better inter-operation
11 hours 39 min ago - Connecting Android device to desktop Linux via USB
12 hours 8 min ago - Find new cell phone and tablet pc
13 hours 6 min ago - Epistle
14 hours 35 min ago - Automatically updating Guest Additions
15 hours 43 min ago - I like your topic on android
16 hours 30 min ago - This is the easiest tutorial
23 hours 6 min ago




Comments
Re: Kernel Korner: Device Drivers Concluded
I want to mmap the high pci memory . The physical address
i can get using pci_resource_start function. Exactly how can i do this??
Alok
Re: Kernel Korner: Device Drivers Concluded
Hello,
If I want to do two different mmap in my driver.
How to differentiate these to mmap calls in the driver?
Arun
differentiating things
To distinguish your two areas either:
a) register two char devices.
b) use distinct offsets to determine which part to map.