Breaking through the Maximum Process Number

Breaking through the maximum process number restriction in i386-based Linux.

Implementation Brief

The basis of our solution is to set each process's TSS and LDT descriptors dynamically (see Listing 1).

Listing 1. System Initialization

Process Switch

In the original design, the fork operation uses tss.ldt and tss.tr in the PCB to save the selectors for LDTR and TR. With the original algorithm, the LDT selector of a process may exceed the 16-bit limit. So we combine the extra variable tss.__ldth with tss.ldt to save the selector. Since tss.__ldth is not used in Linux 2.2.x, our modification won't break the kernel. The saving of LDTR and TR now works like this:

*((unsigned long *) &(p->tss.ldt)) =
   (unsigned long) _LDT(nr);
if (*((unsigned long *) &(p->tss.ldt)) <
   (unsigned long) (8192 << 3))
        set_ldt_desc(nr, ldt, LDT_ENTRIES);
        // original code here
else {
        // do nothing
        // let the process switch code handle LDT
        // and TSS
}
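The comparison against 8192 << 3 works because an x86 selector keeps the descriptor index in bits 3 to 15, so any index of 8192 or more yields a value that no longer fits in 16 bits. The user-space sketch below illustrates that arithmetic and the trick of spreading a 32-bit value across two adjacent 16-bit fields. The struct tss_ldt_pair and all helper names are hypothetical stand-ins for illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The GDT holds at most 8192 descriptors (13-bit index). */
#define GDT_MAX_ENTRIES 8192u

/* Hypothetical stand-in for the adjacent tss.ldt / tss.__ldth fields. */
struct tss_ldt_pair {
    uint16_t ldt;
    uint16_t ldth;
};

/* Selector value a descriptor index would need (index << 3). */
static uint32_t selector_for_index(uint32_t index)
{
    return index << 3;
}

/* True when the selector still names a real GDT slot. */
static int has_reserved_descriptor(uint32_t selector)
{
    return selector < (GDT_MAX_ENTRIES << 3);
}

/* Store a selector that may be wider than 16 bits across both fields. */
static void save_wide_selector(struct tss_ldt_pair *p, uint32_t selector)
{
    assert(sizeof *p == sizeof selector);
    memcpy(p, &selector, sizeof selector);
}

/* Read the full 32-bit value back from both fields. */
static uint32_t load_wide_selector(const struct tss_ldt_pair *p)
{
    uint32_t selector;
    memcpy(&selector, p, sizeof selector);
    return selector;
}
```

In this model, index 8191 gives selector 65528, which passes the check, while index 8192 gives 65536, which fails it and would be left to the process switch code.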

One benefit of this implementation is that we can tell whether a process number is greater than 4,088 simply by inspecting the value of tss.ldt. This is important for performance.

If a process number is greater than 4,088, it has no reserved descriptor in the GDT and must use the shared GDT entries. We can find these entries with this code:

SHARED_TSS_ENTRY + smp_processor_id();
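As a rough illustration of the per-CPU lookup, the following user-space sketch models the GDT as an array and gives each CPU one shared TSS slot. The value of SHARED_TSS_ENTRY, the CPU count, the simplified struct descriptor, and the idea of rewriting the slot on every switch are all assumptions made for this sketch; the actual handling is in the patch itself.

```c
#include <assert.h>

#define NR_CPUS          4      /* hypothetical CPU count */
#define GDT_ENTRIES      8192
#define SHARED_TSS_ENTRY 8      /* hypothetical index, chosen for the sketch */

/* Simplified descriptor: only the base address is modelled. */
struct descriptor {
    unsigned long base;
};

static struct descriptor gdt[GDT_ENTRIES];

/* GDT index of the shared TSS slot that belongs to one CPU. */
static int shared_tss_index(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return SHARED_TSS_ENTRY + cpu;
}

/*
 * Point this CPU's shared slot at the incoming task's TSS and
 * return the selector (index << 3) that would be loaded into TR.
 */
static unsigned int install_shared_tss(int cpu, unsigned long tss_base)
{
    int index = shared_tss_index(cpu);

    gdt[index].base = tss_base;
    return (unsigned int)index << 3;
}
```

Because each CPU owns its own slot, two CPUs can run processes without reserved descriptors at the same time without overwriting each other's entry.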

Listing 2 shows the code for dealing with the shared GDT entries.

Listing 2. Using Shared GDT Entries

With these changes, we have broken through the maximum process number restriction. We can even add an extra parameter to the LILO configuration file to set this number at boot time. The following line sets the maximum process number to 40,000, which is much greater than 4,090:

        append="nrtasks=40000"
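How such a parameter might be picked out of the boot command line can be sketched in user space as follows. The function parse_nrtasks and its fallback argument are hypothetical names for illustration; this is not the kernel's own option-parsing code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the value of "nrtasks=<n>" found in cmdline, or fallback
 * when the option is absent or not a plain decimal number.
 */
static unsigned long parse_nrtasks(const char *cmdline, unsigned long fallback)
{
    static const char key[] = "nrtasks=";
    const size_t key_len = sizeof key - 1;
    const char *p = cmdline;

    assert(cmdline != NULL);

    while ((p = strstr(p, key)) != NULL) {
        /* Accept the key only at the start of a word. */
        if (p == cmdline || p[-1] == ' ') {
            char *end;
            unsigned long value = strtoul(p + key_len, &end, 10);

            /* Require at least one digit, followed by a space or the end. */
            if (end != p + key_len && (*end == '\0' || *end == ' '))
                return value;
        }
        p += key_len;
    }
    return fallback;
}
```

With the line above in the LILO configuration, a command line such as "root=/dev/hda1 nrtasks=40000" would yield 40000 in this sketch, and a command line without the option would yield the fallback.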

Conclusion

With this solution, the upper limit on the number of concurrent processes can, in theory, be set as high as 2G. In practice, the hardware and the OS still limit this number. When creating a new process, the kernel allocates memory for it, like this:

Process stack (2 pages) + page table (1 page)
+ page directory table (1 page) = 4 pages

So if the computer has 1G of memory, the OS itself uses 20M, and each process takes five pages (20K with 4K pages), the maximum number of processes is:

(1G - 20M) / 20K = 51404 ~= 50,000

More practically, a process uses at least 30K of memory, so the number becomes:

50000 * (2/3) ~= 33,000

This number is still much greater than 4,090.
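The estimate above can be reproduced with a few lines of C. The helper name max_processes and its use of kilobytes for every quantity are choices made for this sketch only.

```c
#include <assert.h>

/*
 * Number of processes that fit once the OS share is subtracted.
 * All sizes are in kilobytes; the result is rounded down.
 */
static unsigned long max_processes(unsigned long total_kb,
                                   unsigned long os_kb,
                                   unsigned long per_process_kb)
{
    assert(per_process_kb > 0 && os_kb <= total_kb);
    return (total_kb - os_kb) / per_process_kb;
}
```

Calling max_processes(1024UL * 1024, 20UL * 1024, 20) gives 51404, matching the first figure. With 30K per process the exact result is 34269, slightly above the rounded estimate of 33,000 because the text rounds 51404 down to 50,000 before scaling.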

Zhang Yong (leon@xteamlinux.com.cn) is a senior software engineer at Xteam Software Co., Ltd. His work covers many aspects of Linux, including kernel development, Linux I18N and L10N, and network applications. He is currently focusing on the upcoming release of XteamServer, a high-end server solution based on Linux. Xteam Software Co., Ltd. is also the vendor of XteamLinux and XteamLindows, both among the most popular Linux distributions in China. For more information, please visit http://www.xteamlinux.com.cn/.
