Kernel Locking Techniques
Big-reader locks (brlocks), defined in include/linux/brlock.h, are a specialized form of reader/writer locks. Big-reader locks, designed by Red Hat's Ingo Molnar, provide a spinning lock that is very fast to acquire for reading but incredibly slow to acquire for writing. Therefore, they are ideal in situations where there are many readers and few writers.
While the behavior of brlocks is different from that of rwlocks, their usage is identical with the lone exception that brlocks are predefined in brlock_indices (see brlock.h):
br_read_lock(BR_MR_LOCK);
/* critical region (read only) ... */
br_read_unlock(BR_MR_LOCK);
Use of brlocks is currently confined to a few special cases. Due to the large penalty for exclusive write access, it should probably stay that way.
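The read/write asymmetry described above can be illustrated with a small user-space model. This is only a sketch and not the code in brlock.h: it assumes one spin flag per CPU, and every name in it (model_brlock, NR_SLOTS and the model_* functions) is hypothetical. A reader spins on its own CPU's slot only, while a writer must acquire every slot, which is why writing is so much more expensive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4 /* hypothetical number of CPUs in this model */

/* One flag per CPU; real code would pad each slot to its own cache line. */
struct model_brlock {
    atomic_bool slot[NR_SLOTS];
};

static void model_br_init(struct model_brlock *l)
{
    for (int i = 0; i < NR_SLOTS; i++)
        atomic_init(&l->slot[i], false);
}

/* Try once to take a single slot; returns true on success. */
static bool model_slot_trylock(struct model_brlock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    return !atomic_exchange_explicit(&l->slot[cpu], true,
                                     memory_order_acquire);
}

/* Reader: spin on this CPU's slot only, so readers on different
 * CPUs never touch the same memory. */
static void model_br_read_lock(struct model_brlock *l, int cpu)
{
    while (!model_slot_trylock(l, cpu))
        ; /* spin */
}

static void model_br_read_unlock(struct model_brlock *l, int cpu)
{
    atomic_store_explicit(&l->slot[cpu], false, memory_order_release);
}

/* Writer: acquire every slot, in index order so that two writers
 * cannot deadlock. Returns the number of slots it had to take. */
static int model_br_write_lock(struct model_brlock *l)
{
    int taken = 0;

    for (int i = 0; i < NR_SLOTS; i++) {
        model_br_read_lock(l, i);
        taken++;
    }
    return taken;
}

static void model_br_write_unlock(struct model_brlock *l)
{
    for (int i = NR_SLOTS - 1; i >= 0; i--)
        model_br_read_unlock(l, i);
}
```

In this model a read lock costs one atomic exchange on a CPU-local slot, while a write lock costs NR_SLOTS exchanges and stalls every reader until it is released. The model also allows only one reader per slot at a time, a simplification that a real implementation need not share.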
Linux contains a global kernel lock, kernel_flag, that was originally introduced in kernel 2.0 as the only SMP lock. During 2.2 and 2.4, much work went into removing the global lock from the kernel and replacing it with finer-grained localized locks. Today, the global lock's use is minimal. It still exists, however, and developers need to be aware of it.
The global kernel lock is called the big kernel lock or BKL. It is a spinning lock that is recursive; therefore two consecutive requests for it will not deadlock the process (as they would for a spinlock). Further, a process can sleep and even enter the scheduler while holding the BKL. When a process holding the BKL enters the scheduler, the lock is dropped so other processes can obtain it. These attributes of the BKL helped ease the introduction of SMP during the 2.0 kernel series. Today, however, they should provide plenty of reason not to use the lock.
Use of the big kernel lock is simple. Call lock_kernel() to acquire the lock and unlock_kernel() to release it. The routine kernel_locked() will return nonzero if the lock is held, zero if not. For example:
lock_kernel();
/* critical region ... */
unlock_kernel();
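The recursive behavior can be sketched in user space with a per-task depth counter, so that only the outermost acquire actually spins on the global flag. This is a model under stated assumptions, not the kernel source: model_task, model_kernel_flag and the model_* functions are hypothetical names, and the dropping of the lock on entry to the scheduler is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; a depth of -1 means "lock not held". */
struct model_task {
    int lock_depth;
};

/* The single global spinning lock; static storage starts out unlocked. */
static atomic_bool model_kernel_flag;

static void model_lock_kernel(struct model_task *t)
{
    /* Only the outermost call (depth -1 -> 0) spins on the global flag. */
    if (++t->lock_depth == 0) {
        while (atomic_exchange_explicit(&model_kernel_flag, true,
                                        memory_order_acquire))
            ; /* spin */
    }
}

static void model_unlock_kernel(struct model_task *t)
{
    assert(t->lock_depth >= 0); /* must currently hold the lock */

    /* Only the outermost unlock (depth 0 -> -1) releases the flag. */
    if (--t->lock_depth < 0)
        atomic_store_explicit(&model_kernel_flag, false,
                              memory_order_release);
}

/* Nonzero (true) if this task holds the lock, zero (false) if not. */
static bool model_kernel_locked(const struct model_task *t)
{
    return t->lock_depth >= 0;
}
```

A task initialized with lock_depth set to -1 can call model_lock_kernel() twice in a row without deadlocking, and the global flag is released only after the second model_unlock_kernel(). With a plain spinlock, the second acquire would spin forever.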
Starting with the 2.5 development kernel (and 2.4 with an available patch), the Linux kernel is fully preemptible. This feature allows processes to be preempted by higher-priority processes, even if the current process is running in the kernel. A preemptible kernel creates many of the synchronization issues of SMP. Thankfully, kernel preemption is synchronized by SMP locks, so most issues are solved automatically by writing SMP-safe code. A few new locking issues, however, are introduced. For example, per-CPU data may have no lock at all because it is implicitly protected on SMP (it is safe because it is unique to each CPU), yet it does need protection once the kernel is preemptible.
For these situations, preempt_disable() and the corresponding preempt_enable() have been introduced. These calls nest: after n calls to preempt_disable(), preemption is not re-enabled until the nth call to preempt_enable(). See the “Function Reference” Sidebar for a complete list of preemption-related controls.
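The nesting rule amounts to a counter, as the following user-space sketch shows. It is a model only: the names model_preempt_count, model_preempt_disable(), model_preempt_enable() and model_preemptible() are hypothetical, and a real implementation would also keep the count per task and check for a pending reschedule when the count returns to zero.

```c
#include <assert.h>
#include <stdbool.h>

/* 0 means preemption is allowed; each disable adds one level. */
static int model_preempt_count;

static void model_preempt_disable(void)
{
    model_preempt_count++;
}

/* Returns true only when this call drops the count back to zero,
 * that is, when it is the call that re-enables preemption. */
static bool model_preempt_enable(void)
{
    assert(model_preempt_count > 0); /* unbalanced enable is a bug */
    return --model_preempt_count == 0;
}

static bool model_preemptible(void)
{
    return model_preempt_count == 0;
}
```

After three calls to model_preempt_disable(), the first two calls to model_preempt_enable() return false and only the third returns true.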
Both SMP reliability and scalability in the Linux kernel are improving rapidly. Since SMP was introduced in the 2.0 kernel, each successive kernel revision has improved on the previous by implementing new locking primitives and providing smarter locking semantics by revising locking rules and eliminating global locks in areas of high contention. This trend continues in the 2.5 kernel. The future will certainly hold better performance.
Kernel developers should do their part by writing code that implements smart, sane, proper locking with an eye to both scalability and reliability.

Comments
linux question i need help plz
Build a bash command cgrep that searches the indicated files using color, ignoring case and showing the line number.
Now if you perform the following command: man cgrep
You get the message: no manual entry for cgrep
You are requested to add a manual for this command.
Hints:
1- Read the manual of man (man man) to understand where manual files are stored.
2- You need to use gunzip and gzip
3- You need to be root to create a manual (sudo -i)
Why don't you ask your
Why don't you ask your instructor instead of posting your homework!!!
Re: Kernel Korner: Kernel Locking Techniques
Good article on Locking mechanism in Linux
Thanks for your updates.
Regards,
Sathish.
tasklets and work queue
Hi,
I have a question regarding tasklets and work queues. As both are bottom-half handlers, on what basis should we decide whether to use a tasklet or a work queue?
I know that tasklets run at a much higher priority (in interrupt context, we can say) than work queues (process context), because of which we should
not do any blocking/sleep operation inside tasklets, while the same can be done in a work queue.
If I want to do an I/O transaction in response to an interrupt, is it good to use tasklets here?
In a real scenario,
I get an interrupt from a touch screen controller, and now I have to read from the controller over the I2C interface. Is it safe to read the data from a tasklet here?
Very nicely explained the
Very nicely explained the locking procedure. Very useful URL.
Recursive semaphore
According to the article, spinlocks are not recursive. What about semaphores?
kernel locking techniques
hi
The above information was excellent, and I would like to ask you one small thing. Can the kernel be completely locked down for a small period of time, i.e., so that none of the kernel threads run while my thread is running? I would like opinions on this matter.
I believe kernel_lock would
I believe kernel_lock would help... If you are in a user space and need kernel for a time being you can make syscall which when called with some parameter calls kernel_lock and returns and when called with some other parameter calls kernel_unlock...
Very good question. I need
Very good question. I need to do this but can't find out how. Has this question been answered somewhere? A small period would mean 5..90 usecs.
some doubt about semaphore
Thank you for your article; I learned a lot from it.
But I have a question about semaphores.
In your article you mention, for the up() operation: "if the new value is greater than or equal to zero, one or more tasks on the wait queue will be woken up"
I think it should be less than or equal to, instead of greater than.
sorry but you are wrong
Sorry, but you are wrong. Greater than or equal to is what is written because, as soon as the semaphore count increases, it means some instances of the resource are free to be allocated to some process. This makes a process pop out of the wait queue and become active.
Take the Spinlocks warning seriously!
"never call any function that touches user memory, kmalloc() with the GFP_KERNEL flag, any semaphore functions or any of the schedule functions while holding a spinlock."
I struggled with a kernel panic for a few days when I was calling the function "copy_to_user" while holding a lock. Call the function a few times a second, and it would work, anything higher than that would simply panic.
Just make sure every function called while holding the lock does not sleep. If it has to, use a semaphore.
Re: Kernel Korner: Kernel Locking Techniques
A spinlock works on the principle that it disables interrupts before entering the critical section and enables them after exiting. So, as it cannot disable the interrupts of another process, a spinlock is not a solution for SMP systems.
Of course it is!
spin_lock() won't disable interrupts; it is used to protect between user contexts.
spin_lock_irq(), on the other hand, will disable interrupts, so of course it can be used to protect between user context and interrupt context.
Note: spin_lock_irq() only disables interrupts on the _local_ CPU, so what can be guaranteed when it is used on SMP?
The answer is the low-level assembly code inside: it takes advantage of a bus-locking scheme to guarantee that other CPUs won't intervene in the access, and is thus SMP-safe!
Re: Kernel Korner: Kernel Locking Techniques
atomic_t v;
atomic_set(&v, 5); /* v = 5 (atomically) */
atomic_add(3, &v); /* v = v + 3 (atomically) */
atomic_dec(&v); /* v = v - 1 (atomically) */
printf("This will print 7: %d ", atomic_read(&v));
How does Robert get this example to work in kernel-space? (Did he mean 'printk' instead of 'printf'?)
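For readers who want to run the arithmetic, here is a user-space analogue of the quoted sequence. It is a sketch that uses C11 <stdatomic.h> rather than the kernel's atomic_t interface, and the function name atomic_sequence_demo() is hypothetical. In user space printf() is available; in kernel code the equivalent print would go through printk().

```c
#include <assert.h>
#include <stdatomic.h>

/* Runs the same three steps as the quoted example and returns the
 * final value of v. */
static int atomic_sequence_demo(void)
{
    atomic_int v;
    int before;

    atomic_init(&v, 5);                /* v = 5 */
    atomic_fetch_add(&v, 3);           /* v = v + 3 (atomically) */
    before = atomic_fetch_sub(&v, 1);  /* v = v - 1 (atomically) */

    /* The fetch operations return the value held before the update. */
    assert(before == 8);

    return atomic_load(&v);
}
```

Calling atomic_sequence_demo() returns 7, matching the comment in the quoted example. Each step is atomic on its own, but the three steps together are not one atomic unit.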
atomic_t v; atomic_set(&v,
atomic_t v;
atomic_set(&v, 5); /* v = 5 (atomically) */
atomic_add(3, &v); /* v = v + 3 (atomically) */
atomic_dec(&v); /* v = v - 1 (atomically) */
printf("This will print 7: %d ", atomic_read(&v));
As per my knowledge, atomic operations are atomic only for single function like
atomic_add(3,&v);
but not when executed in sequence. i.e.
atomic_add(3,&v); and
{
atomic_add(1,&v);
atomic_add(2,&v);
}
are not same.
Re: Kernel Korner: Kernel Locking Techniques
Good article. I have a few questions and I would appreciate it
if you could give me the answers.
1. Can printks exist between spinlock and spinunlock?
2. I understand that it is not possible to have
copy_from_user and copy_to_user calls between
spinlock and spinunlock. Can these functions be
called between semaphore lock and unlock functions
(down and up)?
3. Can the down_trylock function be called between spinlock
and spinunlock?
Thanks in advance
Ravi Kumar
Rendezvous On Chip Ltd
Hyderabad
Re: Kernel Korner: Kernel Locking Techniques
See
http://kernelnewbies.org/documents/kdoc/kernel-locking/sleeping-things.html
Anything can be called inside a semaphore
Re: Kernel Korner: Kernel Locking Techniques
--SJLC
Re: Kernel Korner: Kernel Locking Techniques
Thank you very much.
Ravi Kumar
Re: Kernel Korner: Kernel Locking Techniques
If a process attempts to acquire a spinlock and it is unavailable, the process will keep trying (spinning) until it can acquire the lock.
Does this involve task switching? If so, what is the difference
between a spinlock and a semaphore, except the spinlock wasting more CPU
time?
Re: Kernel Korner: Kernel Locking Techniques
Yes, you point it out rightly.
Actually, no context switch takes place, which is why it is faster than a semaphore. As it does not put the process in the wait state, no switching takes place.
Re: Kernel Korner: Kernel Locking Techniques
Does this involve task switching? If so, what is the difference
between a spinlock and a semaphore, except the spinlock wasting more CPU
time?
---------------------------------------------
No. It just spins until it gets the lock.
If the critical region is short enough that the time spent on
spinning around is shorter than that taken to execute the
semaphore up/down codes, the spinlock wins.
If not, you may choose the semaphore.
Re: Kernel Korner: Kernel Locking Techniques
I am puzzled...
If spin_locks are not supposed to be held where processing of data takes a long time, e.g. around copy_to_user(), which might block, how can I safely move data from the interrupt handler to the user?
Scenario:
Hardware that gives an interrupt when there is data to read.
Interrupt handler intercepts the interrupt and reads the data from the hardware and places it in an "interrupt buffer". The interrupt buffer is not allocated dynamically, but rather
statically to ensure that it is never swapped out.
An application reads the data through the device driver read method. It should not read from the interrupt buffer
directly because the interrupt handler might be adding to the buffer.
First invalid solution that comes to mind: Place a spin lock
around the interrupt buffer so that we guarantee that either the interrupt handler or the device driver read method
is accessing the buffer at any given time. This is forbidden since a spin lock should never be held around data processing that might block, e.g. copy_to_user().
Second invalid solution that comes to mind: Place a spin lock around the interrupt buffer and a semaphore around a user application buffer which is dynamically allocated and a lot bigger than the interrupt buffer. Then we have the problem of interlocking, i.e. in order to move data from the interrupt buffer to the application buffer we have to acquire both the semaphore and the spinlock, which is now basically protecting the application buffer, which again might be swapped out, i.e. we have the possibility of blocking while holding a spin_lock.
I have a hard time seeing how you can break this "deadlock" in scenario two because you always end up needing to move the data from interrupt context to application context.
KDD
Re: Kernel Korner: Kernel Locking Techniques
Hardware that gives an interrupt when there is data to read. Interrupt handler intercepts the interrupt and reads the data from the hardware and places it in an "interrupt buffer". The interrupt buffer is not allocated dynamically, but rather
statically to ensure that it is never swapped out. An application reads the data through the device driver read method. It should not read from the interrupt buffer directly because the interrupt handler might be adding to the buffer.
In the read() method:
Grab spin_lock_irq
Copy from interrupt buffer to local buffer
spin_unlock_irq
copy_to_user from local buffer
In the interrupt handler:
Grab spin_lock
place data in buffer
spin_unlock
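The two sequences above can be modeled in user space as follows. This is only a sketch: a C11 atomic flag stands in for the spinlock, memcpy() stands in for copy_to_user(), nothing here disables interrupts, and BUF_SIZE and all model_* names are hypothetical. The point it shows is that the lock is held only while copying into a local buffer, and the copy to the caller happens after the lock is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64 /* hypothetical buffer size */

static atomic_bool model_lock;  /* stand-in for the spinlock */
static char irq_buf[BUF_SIZE];  /* the "interrupt buffer" */
static size_t irq_len;          /* bytes currently in irq_buf */

static void model_spin_lock(void)
{
    while (atomic_exchange_explicit(&model_lock, true,
                                    memory_order_acquire))
        ; /* spin */
}

static void model_spin_unlock(void)
{
    atomic_store_explicit(&model_lock, false, memory_order_release);
}

/* "Interrupt handler": append data under the lock.
 * Returns the number of bytes stored; excess data is dropped. */
static size_t model_irq_handler(const char *data, size_t n)
{
    model_spin_lock();                 /* kernel: spin_lock() */
    if (n > BUF_SIZE - irq_len)
        n = BUF_SIZE - irq_len;
    memcpy(irq_buf + irq_len, data, n);
    irq_len += n;
    model_spin_unlock();               /* kernel: spin_unlock() */
    return n;
}

/* "read() method": drain into a local buffer under the lock,
 * then copy out to the caller with the lock released. */
static size_t model_read(char *user, size_t n)
{
    char local[BUF_SIZE];
    size_t got;

    assert(user != NULL);

    model_spin_lock();                 /* kernel: spin_lock_irq() */
    got = irq_len < n ? irq_len : n;
    memcpy(local, irq_buf, got);
    memmove(irq_buf, irq_buf + got, irq_len - got);
    irq_len -= got;
    model_spin_unlock();               /* kernel: spin_unlock_irq() */

    memcpy(user, local, got);          /* kernel: copy_to_user(), may sleep */
    return got;
}
```

Because the step that may sleep runs outside the locked region, the handler never spins waiting on a sleeping reader.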
You can also avoid having an interrupt buffer and a different storage buffer - for various reasons. First, why? It is not efficient. Second, all kernel memory is unpagable so you never have to worry... just have a dynamic buffer and have a way for the syscall to read it. Have your spinlock protect the buffer and everyone is happy.
An even better solution might be a double buffer...
Robert Love
questions
Why do spin_lock/spin_unlock need to be used in the interrupt handler? Is the read syscall able to preempt the ISR?
I believe the spinlock
I believe the spinlock protects the interrupt buffer from
the same/other ISR executing on other CPUs.
Re: Kernel Korner: Kernel Locking Techniques
Thank you very much for your writing. ^_^
Re: Kernel Korner: Kernel Locking Techniques
Excellent article! Very informative and well-written. Robert Love obviously has hands-on experience of the subject and knows how to share it in a very readable article.
I hope to read more articles from him.
Re: Kernel Korner: Kernel Locking Techniques
Indeed! This was one of the better articles/papers on locking and races I have read for any OS. It makes sense and is very applicable.
I hope to see (many) more articles, too.