Kernel Korner - Sleeping in the Kernel
Another classical operating system problem arises due to the use of the wake_up_all function. Let us consider a scenario in which a set of processes are sleeping on a wait queue, wanting to acquire a lock.
Once the process that has acquired the lock is done with it, it releases the lock and wakes up all the processes sleeping on the wait queue. All the processes try to grab the lock. Eventually, only one of these acquires the lock and the rest go back to sleep.
This behavior is bad for performance. If we already know that only one process is going to resume while the rest go back to sleep, why wake them all up in the first place? Doing so consumes valuable CPU cycles and incurs context-switching overhead. This is called the thundering herd problem. For this reason, use the wake_up_all function with care, and only when you know that it is required. Otherwise, use the wake_up function, which wakes up only one process at a time.
So, when would the wake_up_all function be used? It is used in scenarios when processes want to take a shared lock on something. For example, processes waiting to read data on a page could all be woken up at the same moment.
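To get a feel for the cost of the thundering herd, here is a minimal sketch of a toy model in plain userspace C. It is not kernel code and does not call wake_up or wake_up_all; the function name wasted_wakeups and the model itself are assumptions made for illustration. It counts how many wakeups are wasted when a group of sleepers competes for one exclusive lock and each release wakes either every sleeper or just one.

```c
#include <stdbool.h>

/*
 * Toy userspace model (hypothetical, not kernel code): n_waiters
 * processes sleep on a wait queue for one exclusive lock. Each time the
 * lock is released, the releaser wakes either every sleeper or just one.
 * Returns the number of wasted wakeups, that is, wakeups of a process
 * that lost the race and had to go back to sleep.
 */
unsigned long wasted_wakeups(unsigned long n_waiters, bool wake_all)
{
    unsigned long sleeping = n_waiters;
    unsigned long wasted = 0;

    while (sleeping > 0) {
        /* Number of processes woken by this release of the lock. */
        unsigned long woken = wake_all ? sleeping : 1;

        /* Exactly one woken process wins the lock; the rest sleep again. */
        wasted += woken - 1;
        sleeping -= 1;
    }
    return wasted;
}
```

In this model, waking everyone wastes n(n-1)/2 wakeups for n waiters, while waking one process at a time wastes none. Real schedulers are more complicated, but the quadratic growth is the reason the herd hurts.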
You may frequently want to delay the execution of your process for a given amount of time. This may be needed to let the hardware catch up or to carry out an activity at specified intervals, such as polling a device, flushing data to disk or retransmitting a network request. You can do this with the function schedule_timeout(timeout), a variant of schedule(). This function puts the process to sleep until timeout jiffies have elapsed. jiffies is a kernel variable that is incremented on every timer interrupt.
As with schedule(), the state of the process has to be changed to either TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE before calling this function. If the process is woken up before timeout jiffies have elapsed, the number of jiffies left is returned; otherwise, zero is returned.
Let us take a look at a real-life example (linux-2.6.11/arch/i386/kernel/apm.c: 1415):
set_current_state(TASK_INTERRUPTIBLE);
for (;;) {
        schedule_timeout(APM_CHECK_TIMEOUT);
        if (exit_kapmd)
                break;
        /*
         * Ok, check all events, check for idle
         * (and mark us sleeping so as not to
         * count towards the load average)..
         */
        set_current_state(TASK_INTERRUPTIBLE);
        apm_event_handler();
}
This code belongs to the APM thread. The thread polls the APM BIOS for events at intervals of APM_CHECK_TIMEOUT jiffies. As can be seen from the code, the thread calls schedule_timeout() to sleep for the given duration of time, after which it calls apm_event_handler() to process any events.
You also may use a more convenient API, with which you can specify time in milliseconds and seconds:
msleep(time_in_msec);
msleep_interruptible(time_in_msec);
ssleep(time_in_sec);
msleep() and msleep_interruptible() accept the time to sleep in milliseconds, while ssleep() accepts the time to sleep in seconds. These higher-level routines internally convert the time into jiffies, change the state of the process appropriately and call schedule_timeout(), thus making the process sleep.
I hope that you now have a basic understanding of how processes can safely sleep and wake up in the kernel. To understand the internal working of wait queues and their advanced uses, look at the implementations of init_waitqueue_head, as well as the variants of wait_event and wake_up.
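The conversion from milliseconds to jiffies can be sketched in plain userspace C. The kernel's own conversion routine is msecs_to_jiffies(); the helper below, ms_to_ticks, is a hypothetical stand-in and not a copy of it, and its ticks_per_sec parameter plays the role of the kernel's HZ constant. It rounds up so that a request never turns into a shorter sleep than asked for, and for simplicity it ignores overflow for very large durations.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): convert a duration in
 * milliseconds to timer ticks, rounding up so the resulting sleep is
 * never shorter than requested. ticks_per_sec stands in for HZ.
 * Overflow for very large ms values is not handled in this sketch.
 */
unsigned long ms_to_ticks(unsigned long ms, unsigned long ticks_per_sec)
{
    assert(ticks_per_sec > 0);

    /* Ceiling division of (ms * ticks_per_sec) by 1000. */
    return (ms * ticks_per_sec + 999) / 1000;
}
```

With 100 ticks per second, for example, a 10 ms request maps to one tick and an 11 ms request maps to two, so the granularity of any millisecond sleep depends on the timer frequency.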
Greg Kroah-Hartman reviewed a draft of this article and contributed valuable suggestions.
Kedar Sovani (www.geocities.com/kedarsovani) works for Kernel Corporation as a kernel developer. His areas of interest include security, filesystems and distributed systems.
Comments
lost wakeup in wait_event_interruptible?
Hello,
I see a possible 'lost wakeup' even in the latest wait_event_interruptible. I know the race condition in sleep_on was solved, so I just want to know how and why this works without any problem. I saw a similar question but no answers (code snippet from 2.6.27 below).
thanks,
Shankar.
----
a) Wakeup occurs immediately before the call to prepare_to_wait()
b) Call to prepare_to_wait() sets process state to TASK_INTERRUPTIBLE
c) After prepare_to_wait() exits, but just before the condition is evaluated again, a h/w interrupt comes in!
d) The process is still marked as TASK_INTERRUPTIBLE and will therefore not be re-scheduled and will never execute the call to finish_wait(), so it will sleep unless/until an interruption comes along...
-----
#define __wait_event_interruptible(wq, condition, ret) \
do { \
        DEFINE_WAIT(__wait); \
        \
        for ( ; ; ) { \
                prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE); \
                if (condition) \
                        break; \
                if (!signal_pending(current)) { \
                        schedule(); \
                        continue; \
                } \
                ret = -ERESTARTSYS; \
                break; \
        } \
        finish_wait(&wq, &__wait); \
} while (0)

#define wait_event_interruptible(wq, condition) \
({ \
        int __ret = 0; \
        if (!(condition)) \
                __wait_event_interruptible(wq, condition, __ret); \
        __ret; \
})
wait_event_interruptible - returns error (ERESTARTSYS)
Hi,
I am using wait_event_interruptible in my read function, and the ISR wakes it up using wake_up_interruptible.
Sometimes wait_event_interruptible returns the error ERESTARTSYS.
The application usage scenario is as follows: I have a process which has lots of threads, and one thread calls the read() function. Sometimes wait_event_interruptible() returns an error and the whole user application is killed, but the system continues to run without any issues.
There are two ways this problem could occur:
1) There are many threads in my process, and some thread has caused a problem which results in the process being killed. So a signal is sent to the thread which is doing the read, and the read thread which is waiting in wait_event_interruptible() comes out with an error. One more observation here is that the "release" function of my driver is called when this happens.
2) The second possibility is that something wrong is happening in the driver and it is producing the error from wait_event_interruptible(), and this in turn kills the userspace process. I am not very sure about this. But can this kind of thing happen?
Please provide some input on this.
Waitqueues in bottom half context
Can we use a wait queue in bottom half context?
For example, we have a line of code in the following source
Kernel Version: Linux-2.6.18.5
Path: \net\xfrm\xfrm_policy.c
Ln: 919
which calls the schedule() function.
This function is called when a packet is received and looks for a
policy in softirq context.
Please correct me if I have misunderstood.
Thanks in advance
Re: Waitqueues in bottom half context
The function xfrm_lookup() is called from both user context
and kernel (softirq) context.
The process is put to sleep when the flag argument of the above
function is -EAGAIN.
User Context:
If this function is called from user context, it may have -EAGAIN.
Kernel Context:
If this function is called from kernel context (softirq), it will be NULL.
So a process in softirq context will not be put on the wait queue.
Thanks,
Sathish Kumar
misprint
In the line:
"This call causes the smbiod to sleep only if the DATA_READY bit is set",
"set" should read "clear", and I think this is a misprint here.
SIGKILL arrival
How will wait_event_interruptible behave if SIGKILL is sent to the user space process that is sleeping in this function? My system gives a kernel panic when I do 'kill -SIGKILL pid'. Is it because I am sleeping in the kernel when SIGKILL is delivered?
In the schedule function
In the schedule function section, you mention that one way to wake up a process that has just called the schedule function is to call wake_up_process(sleeping_task);
But how does another process get a reference to the task structure of a process that has just called schedule?
The code you show does get a reference to the task structure, but that is done in the process that is calling schedule.
Please elaborate.
Thanks,
Vinay
store it somewhere
It is up to you where and how you store that reference to the sleeping process's task struct. You may embed it in the appropriate data structures in your code.
Waking up a sleeping process from user space
What is a good way to wake up a process that is asleep in kernel space from a process that is in user space?
Perhaps a system call which invokes 'wake_up_process'?
The 'kill' system call doesn't seem to work...
schedule_timeout_{,un}interruptible()
A patch that includes two new variants of schedule_timeout() has been included in the -mm kernel tree.
Almost all the callers of schedule_timeout() first set the state of the executing task to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE. This patch embeds that functionality in the following calls:
schedule_timeout_interruptible() and
schedule_timeout_uninterruptible()
Error in code samples
There is an error in the code samples. wait_event() and wait_event_interruptible() should not be passed the address of my_event, but my_event itself. That is because they are macros, and their implementations will wind up using the address-of operator (&) to take the address of the parameter they are passed.
Re : Error in code samples
Yes, thanks for that correction. In section "Wait Queues", the illustrations should be like this :