Kernel Locking Techniques
Semaphores in Linux are sleeping locks. Because they cause a task to sleep on contention, instead of spin, they are used in situations where the lock-held time may be long. Conversely, since they have the overhead of putting a task to sleep and subsequently waking it up, they should not be used where the lock-held time is short. Since they sleep, however, they can be used to synchronize user contexts whereas spinlocks cannot. In other words, it is safe to block while holding a semaphore.
In Linux, semaphores are represented by a structure, struct semaphore, which is defined in include/asm/semaphore.h. The structure contains a pointer to a wait queue and a usage count. The wait queue is a list of processes blocking on the semaphore. The usage count is the number of concurrently allowed holders. If it is negative, the semaphore is unavailable and the absolute value of the usage count is the number of processes blocked on the wait queue. The usage count is initialized at runtime via sema_init(), typically to 1 (in which case the semaphore is called a mutex).
Semaphores are manipulated via two methods: down (historically P) and up (historically V). The former attempts to acquire the semaphore and blocks if it fails. The later releases the semaphore, waking up any tasks blocked along the way.
Semaphore use is simple in Linux. To attempt to acquire a semaphore, call the down_interruptible() function. This function decrements the usage count of the semaphore. If the new value is less than zero, the calling process is added to the wait queue and blocked. If the new value is zero or greater, the process obtains the semaphore and the call returns 0. If a signal is received while blocking, the call returns -EINTR and the semaphore is not acquired.
The up() function, used to release a semaphore, increments the usage count. If the new value is greater than or equal to zero, one or more tasks on the wait queue will be woken up:
struct semaphore mr_sem; sema_init(&mr_sem, 1); /* usage count is 1 */ if (down_interruptible(&mr_sem)) /* semaphore not acquired; received a signal ... */ /* critical region (semaphore acquired) ... */ up(&mr_sem);
The Linux kernel also provides the down() function, which differs in that it puts the calling task into an uninterruptible sleep. A signal received by a process blocked in uninterruptible sleep is ignored. Typically, developers want to use down_interruptible(). Finally, Linux provides the down_trylock() function, which attempts to acquire the given semaphore. If the call fails, down_trylock() will return nonzero instead of blocking.
In addition to the standard spinlock and semaphore implementations, the Linux kernel provides reader/writer variants that divide lock usage into two groups: reading and writing. Since it is typically safe for multiple threads to read data concurrently, so long as nothing modifies the data, reader/writer locks allow multiple concurrent readers but only a single writer (with no concurrent readers). If your data access naturally divides into clear reading and writing patterns, especially with a greater amount of reading than writing, the reader/writer locks are often preferred.
The reader/writer spinlock is called an rwlock and is used similarly to the standard spinlock, with the exception of separate reader/writer locking:
rwlock_t mr_rwlock = RW_LOCK_UNLOCKED; read_lock(&mr_rwlock); /* critical section (read only) ... */ read_unlock(&mr_rwlock); write_lock(&mr_rwlock); /* critical section (read and write) ... */ write_unlock(&mr_rwlock);
Likewise, the reader/writer semaphore is called an rw_semaphore and use is identical to the standard semaphore, plus the explicit reader/writer locking:
struct rw_semaphore mr_rwsem; init_rwsem(&mr_rwsem); down_read(&mr_rwsem); /* critical region (read only) ... */ up_read(&mr_rwsem); down_write(&mr_rwsem); /* critical region (read and write) ... */ up_write(&mr_rwsem);Use of reader/writer locks, where appropriate, is an appreciable optimization. Note, however, that unlike other implementations reader locks cannot be automatically upgraded to the writer variant. Therefore, attempting to acquire exclusive access while holding reader access will deadlock. Typically, if you know you will need to write eventually, obtain the writer variant of the lock from the beginning. Otherwise, you will need to release the reader lock and re-acquire the lock as a writer. If the distinction between code that writes and reads is muddled such as this, it may be indicative that reader/writer locks are not the best choice.
Free DevOps eBooks, Videos, and more!
Regardless of where you are in your DevOps process, Linux Journal can help!
We offer here the DEFINITIVE DevOps for Dummies, a mobile Application Development Primer, and advice & help from the expert sources like:
- Linux Journal
- Users, Permissions and Multitenant Sites
- New Products
- Flexible Access Control with Squid Proxy
- Security in Three Ds: Detect, Decide and Deny
- High-Availability Storage with HA-LVM
- Tighten Up SSH
- DevOps: Everything You Need to Know
- Solving ODEs on Linux
- Non-Linux FOSS: MenuMeters
- diff -u: What's New in Kernel Development