The Linux Scheduler
Linux can only do user preemption. Linus Torvalds, it seems, doesn't believe in kernel preemption. That's not as bad as it may seem; all is fine for semaphores. Critical sections protected by semaphores can be preempted at any time, as every contention will end in a schedule and there can't be any deadlock. However, critical sections protected by fast spin locks or by hand locks cannot be preempted unless we block the timer IRQ. So all spin locks should be IRQ-safe. Also, by avoiding kernel preemption, the kernel becomes more robust and simpler, since a lot of complicated code can be saved this way.
By the way, there are tools to monitor the scheduler latency in order to allow the interested hacker to catch potential code sections that need conditional schedules.
Linux, due to its GPL nature, allows us to do things faster than others, because you can adapt and recompile your own kernel instead of using a standard fit-all kernel. For example, in some popular proprietary operating systems such as Solaris 7, many code sections have been packaged within spin_lock and spin_unlock to make the same code work well in both UP and SMP systems. While vendors of these commercial operating systems tout this as a clear advantage for heavy SMP systems, these locks actually bog down UP systems and simple SMP machines, because the same binary driver must work on both SMP and UP kernels. A spin_unlock is one locked ASM instruction:
#define spin_unlock_string \ "lock ; btrl $0,%0"
and calling such clear-bit instruction through a function pointer is complete overkill.
Linux, on the other hand, has in-line code sections for UP and SMP systems, adapting to the machine it is running on. Therefore, if only a UniProcessor system is hosting the kernel, no time is lost locking a code section that doesn't need it.
Last but not least, as we saw above, the goodness function makes the SMP scheduler in Linux very clever. Having a clever SMP scheduler is critical for performance. If the scheduler is not SMP-aware, OS theory teaches us that the performance on an SMP machine can be even worse than on a UP machine. (This was happening in 2.2.x without the new heuristics, but has now been fixed.)
That's it for this month. I hope you gained some deeper understanding of how Linux schedules tasks on UP and SMP systems.
Moshe Bar (email@example.com) is an Israeli system administrator and OS researcher, who started learning UNIX on a PDP-11 with AT&T UNIX Release 6 back in 1981. He holds an M.Sc. in computer science. He has written a book, Linux Kernel Internals, to be published by McGraw-Hill this year.
|Red Hat Enterprise Linux 7.1 beta available on IBM Power Platform||Jan 23, 2015|
|Designing with Linux||Jan 22, 2015|
|Wondershaper—QOS in a Pinch||Jan 21, 2015|
|Ideal Backups with zbackup||Jan 19, 2015|
|Non-Linux FOSS: Animation Made Easy||Jan 14, 2015|
|Internet of Things Blows Away CES, and it May Be Hunting for YOU Next||Jan 12, 2015|
- Designing with Linux
- Wondershaper—QOS in a Pinch
- Red Hat Enterprise Linux 7.1 beta available on IBM Power Platform
- Internet of Things Blows Away CES, and it May Be Hunting for YOU Next
- Ideal Backups with zbackup
- Slow System? iotop Is Your Friend
- New Products
- Hats Off to Mozilla
- 2014 Book Roundup
- January 2015 Issue of Linux Journal: Security
Editorial Advisory Panel
Thank you to our 2014 Editorial Advisors!
- Jeff Parent
- Brad Baillio
- Nick Baronian
- Steve Case
- Chadalavada Kalyana
- Caleb Cullen
- Keir Davis
- Michael Eager
- Nick Faltys
- Dennis Frey
- Philip Jacob
- Jay Kruizenga
- Steve Marquez
- Dave McAllister
- Craig Oda
- Mike Roberts
- Chris Stark
- Patrick Swartz
- David Lynch
- Alicia Gibb
- Thomas Quinlan
- Carson McDonald
- Kristen Shoemaker
- Charnell Luchich
- James Walker
- Victor Gregorio
- Hari Boukis
- Brian Conner
- David Lane