Memory allocation in the Linux kernel is complex because it is subject to significant constraints, and different ways of allocating memory have different constraints. Anyone writing Linux kernel code therefore needs to understand the various ways of allocating memory, including the tradeoffs involved. This makes for more efficient use of memory and CPU time, since you can specify exactly what you need, but it also makes for more demanding programming.
There are essentially five different ways of allocating memory in the kernel. That's a white lie, but it is close enough to the truth for anyone who needs to read this article to learn about kernel memory allocation. Three (which provide dynamic allocation) are generally useful, and two (which provide static allocation) are deprecated, and are mostly historical artifacts that should not be used. We will discuss the advantages and limitations of the useful ways first, and will only briefly mention the two deprecated ways at the end of this article so that you know what to avoid.
There are a few rules that apply no matter what form of dynamic kernel memory allocation you attempt to do. Whenever you attempt to allocate memory in kernel space, you must be prepared for an allocation error. Always check the value returned from the allocation function, and if it is NULL (0), handle the failure cleanly. User-space code that ignores memory allocation errors is merely terminated with a segmentation violation, but kernel code that ignores them can easily crash, bringing down the whole system.
There are several common error-handling strategies. One strategy is to attempt to allocate critical memory at the top of a function, where you are less likely to have committed yourself and can more likely return an error cleanly. This is usually the best way to handle the problem.
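The allocate-first strategy can be sketched as follows. This is a minimal user-space illustration, not kernel code: alloc_stub() and free_stub() stand in for kmalloc() and kfree(), and struct request_bufs, setup_request(), teardown_request() and the allocs_until_failure hook are hypothetical names invented for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hook for exercising the failure path: -1 means "never fail",
 * otherwise the allocation after this many successes returns NULL. */
static int allocs_until_failure = -1;

/* User-space stand-ins (assumptions); in kernel code these would be
 * kmalloc(size, GFP_KERNEL) and kfree(ptr). */
static void *alloc_stub(size_t size)
{
    if (allocs_until_failure == 0)
        return NULL;
    if (allocs_until_failure > 0)
        allocs_until_failure--;
    return malloc(size);
}

static void free_stub(void *ptr)
{
    free(ptr);
}

/* Hypothetical pair of buffers that an operation needs. */
struct request_bufs {
    char *header;
    char *payload;
};

/* Allocate everything up front. The caller's structure is not
 * touched until both allocations have succeeded, so a failure
 * leaves nothing to undo except memory obtained in this function. */
static int setup_request(struct request_bufs *rb, size_t hdr_len, size_t data_len)
{
    char *header = alloc_stub(hdr_len);
    if (header == NULL)
        return -ENOMEM;

    char *payload = alloc_stub(data_len);
    if (payload == NULL) {
        free_stub(header);      /* release the earlier allocation */
        return -ENOMEM;
    }

    memset(header, 0, hdr_len);
    memset(payload, 0, data_len);
    rb->header = header;
    rb->payload = payload;
    return 0;
}

/* Release both buffers and clear the pointers. */
static void teardown_request(struct request_bufs *rb)
{
    free_stub(rb->payload);
    free_stub(rb->header);
    rb->payload = NULL;
    rb->header = NULL;
}
```

Because every allocation happens before any other state changes, the error path only has to release what this function itself obtained.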
Another strategy, usually used together with allocation at the top of the function, is to allocate an amount of memory that is “easy” for the memory management system to provide, and then parcel it out for various purposes during the life of the function, so that the function effectively does its own memory management. Several subsystems in the kernel do this, such as the high-level SCSI drivers and the network code. Both include special memory allocation functions which are only supposed to be used in those subsystems. These are not documented here, under the assumption that documentation for those subsystems should document subsystem-specific memory allocation routines.
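As a rough illustration of parceling out one allocation, here is a minimal user-space sketch of a bump-style pool. It is not the scheme used by the SCSI or network code; struct pool, pool_init(), pool_take() and pool_destroy() are hypothetical names, and malloc() and free() stand in for kmalloc() and kfree().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: one block obtained up front, handed out in pieces. */
struct pool {
    unsigned char *base;   /* start of the single allocation */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes already handed out */
};

/* Make the one real allocation. Returns 0 on success, -1 on failure. */
static int pool_init(struct pool *p, size_t size)
{
    p->base = malloc(size);    /* stand-in for kmalloc() */
    if (p->base == NULL)
        return -1;
    p->size = size;
    p->used = 0;
    return 0;
}

/* Hand out the next n bytes, rounded up to pointer-size alignment.
 * Returns NULL when the pool cannot satisfy the request. */
static void *pool_take(struct pool *p, size_t n)
{
    size_t align = sizeof(void *);
    size_t rounded = (n + align - 1) & ~(align - 1);

    if (rounded < n || rounded > p->size - p->used)
        return NULL;           /* overflow, or not enough room left */

    void *out = p->base + p->used;
    p->used += rounded;
    return out;
}

/* Pieces are never freed one by one; the whole block goes back at once. */
static void pool_destroy(struct pool *p)
{
    free(p->base);             /* stand-in for kfree() */
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

Only pool_init() can fail for lack of system memory, so the function knows at the outset whether it has the memory it needs.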
Yet another strategy, which will only work if you are not in “critical” sections of code, is to allow the kernel to schedule another process by calling schedule() and then to try again when schedule() returns. Note that some kinds of allocation are not safe to call even once from within critical code; that will be covered when we discuss the individual functions.
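The yield-and-retry idea might look like the sketch below. It is a user-space illustration under stated assumptions: try_alloc() stands in for the real allocation call, yield_cpu() stands in for schedule(), and the failures_left and yields counters exist only so the behavior can be observed. The retry limit is an addition of this example, so the loop cannot spin forever.

```c
#include <stdlib.h>

static int failures_left = 0;  /* hook: how many upcoming allocations fail */
static int yields = 0;         /* counts how often we gave up the CPU */

/* Stand-in (assumption) for the real allocation function. */
static void *try_alloc(size_t size)
{
    if (failures_left > 0) {
        failures_left--;
        return NULL;
    }
    return malloc(size);
}

/* Stand-in (assumption) for schedule(): let something else run. */
static void yield_cpu(void)
{
    yields++;
}

/* Try up to max_tries times, yielding between failed attempts.
 * Only appropriate where the caller is allowed to sleep. */
static void *alloc_with_retry(size_t size, int max_tries)
{
    for (int attempt = 0; attempt < max_tries; attempt++) {
        void *ptr = try_alloc(size);
        if (ptr != NULL)
            return ptr;
        if (attempt + 1 < max_tries)
            yield_cpu();       /* no point yielding after the last try */
    }
    return NULL;               /* caller must still handle failure */
}
```

Even with retries, the caller has to check for NULL, because the memory may never become available.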
The fundamental rule is not to write algorithms that commit themselves to completing without having been guaranteed the resources they need in order to complete. Memory is one of the scarcest and most commonly needed of the resources that must be guaranteed, and the only way to guarantee that memory will be available is to allocate it.
The kmalloc() function allocates memory at two levels: it uses a “bucket” system to allocate memory in units up to nearly a page (4KB on the i86) in length, and uses a “buddy” system on lists of different sizes of contiguous chunks of memory to allocate memory in units up to 128KB (on the i86) in length. Only in recent kernels has it been able to allocate memory in units over 4KB in length. Allocating large amounts of memory with kmalloc() is very likely to fail, especially in low-memory situations and on machines with less memory.
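To make the page-level sizes concrete, the sketch below computes which power-of-two multiple of a page would cover a request, using the 4KB page and 128KB limit quoted above. It is only an illustration of buddy-style rounding, not the kernel's implementation: buddy_order(), PAGE_BYTES and MAX_BYTES are hypothetical names, and the sketch ignores any bookkeeping overhead the allocator adds to each block.

```c
#include <stddef.h>

#define PAGE_BYTES 4096UL              /* page size quoted for the i86 */
#define MAX_BYTES  (128UL * 1024UL)    /* largest unit quoted above */

/* Return the smallest order such that (PAGE_BYTES << order) >= size,
 * or -1 if the request is empty or larger than MAX_BYTES. */
static int buddy_order(size_t size)
{
    if (size == 0 || size > MAX_BYTES)
        return -1;

    int order = 0;
    size_t chunk = PAGE_BYTES;
    while (chunk < size) {
        chunk <<= 1;                   /* each order doubles the chunk */
        order++;
    }
    return order;
}
```

A request one byte over a page needs a two-page chunk, and the higher orders require progressively longer runs of contiguous free pages, which is one way to see why large requests are the most likely to fail.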
kmalloc() is very flexible, as demonstrated by its calling convention:
void * kmalloc(unsigned int size, int priority);
Note the priority argument: this is what makes kmalloc so flexible; it is possible to use kmalloc in very constrained circumstances such as from an interrupt handler. Interrupt-driven code, or code that cannot be pre-empted, but still needs to allocate memory, can call kmalloc with the GFP_ATOMIC priority. This will be more likely to fail, because it cannot swap or do anything else which would cause implicit or explicit I/O to occur. Code with relaxed requirements, which may legitimately be pre-empted, should instead call kmalloc with the GFP_KERNEL priority. This may cause paging and may cause schedule() to be called, but has a higher chance of success.
To dynamically allocate memory that can be accessed via DMA, use the GFP_DMA priority. It stresses the memory system, particularly if large amounts of memory are requested, and is quite likely to fail; if it does, try again. Note that GFP_DMA is likely to fail only on systems with severe limitations on DMA transfers, such as computers using the common ISA bus. Not all platforms are affected by this problem.
Memory allocated with kmalloc() is freed with kfree() (or kfree_s()).