Breaking through the Maximum Process Number
Process management is the most important part of an operating system. Its design and implementation can greatly affect performance. In a multiprocess OS, many processes run simultaneously, which increases CPU usage and system performance. By running processes concurrently, we can provide multiple services and serve more clients at the same time, which is the main task of a modern OS.
Linux on the Intel i386 architecture already supports multiple processes. With suitable process scheduling algorithms, it achieves a low average response time and relatively high system performance. Unfortunately, the Linux 2.2.x kernel limits the number of running processes to 4090. This number may be enough for a desktop system but is inadequate for an enterprise server.
Consider the basic principle of a typical web server, which is based on multiprocess/multithread technology. When a client request arrives, the web server creates a child process or thread to handle it. So it is easy for a heavily loaded server to have thousands of processes running. In fact, most such enterprise servers run operating systems like Solaris, AIX, HP-UX, etc., rather than Linux.
Many Linux developers have noticed this problem and have tried to solve it. In the experimental 2.3.x series and the prerelease 2.4 kernels, this limitation has been dealt with. But it will still be a while before the official release of 2.4, and it may take even longer for it to become stable. Does this mean we have to choose another OS? Is it possible to find a solution that breaks through the limitation in Linux 2.2.x? To answer this question, we first have to know how process management works in 2.2.x.
Process management is tightly bound to memory management. Since the implementation of memory management depends on the hardware architecture, we have to look at the i386 architecture first. Modern operating systems make wide use of virtual memory, which lets software use more memory than is physically present. That is to say, the memory addresses used by software are virtual and are converted to real addresses by processor-provided mechanisms during access.
There are two basic memory management methods: segmentation and paging. Segmentation means dividing memory into several segments and accessing memory by both segment pointer and offset. This method is used in early systems like PDP-11, etc. Paging means dividing memory into several fixed-size pages and using pages as the basic memory management unit. When accessing memory, an address is converted to a physical address according to the page table.
Memory management in the i386 architecture is called segmentation with paging. The virtual address space is first divided into segments by using two tables: the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT). Segmentation converts the virtual address to a linear address. The linear address is then converted to a physical address using two-level page tables: the Page Directory Table and the Page Table. Figure 1 shows how a virtual address is converted to a real address.
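The second step, from linear address to physical address, can be illustrated with a minimal C sketch. It assumes the standard i386 layout with 4 KB pages: the top 10 bits of the linear address index the Page Directory Table, the next 10 bits index the Page Table, and the low 12 bits are the offset within the page. The function and macro names are made up for this illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a 32-bit i386 linear address with 4 KB pages. */
#define PAGE_SHIFT   12u      /* low 12 bits: byte offset inside the page */
#define DIR_SHIFT    22u      /* top 10 bits: Page Directory Table index  */
#define INDEX_MASK   0x3FFu   /* 10-bit index (1024 entries per table)    */
#define OFFSET_MASK  0xFFFu   /* 12-bit page offset                       */
#define PTE_PRESENT  0x1u     /* bit 0 of an entry: page is present       */

/* Index of the Page Directory Table entry for this linear address. */
static uint32_t dir_index(uint32_t linear)
{
    return linear >> DIR_SHIFT;
}

/* Index of the Page Table entry for this linear address. */
static uint32_t table_index(uint32_t linear)
{
    return (linear >> PAGE_SHIFT) & INDEX_MASK;
}

/* Byte offset inside the 4 KB page. */
static uint32_t page_offset(uint32_t linear)
{
    return linear & OFFSET_MASK;
}

/*
 * Combine the page frame base stored in a Page Table entry with the
 * offset from the linear address. The entry must be marked present.
 */
static uint32_t physical_address(uint32_t pte, uint32_t linear)
{
    assert(pte & PTE_PRESENT);
    return (pte & ~OFFSET_MASK) | page_offset(linear);
}
```

For example, the linear address 0xC0101234 splits into directory index 768, table index 257 and offset 0x234. The hardware performs both table lookups itself; the sketch only shows which bits select which entry.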
In Linux, the kernel runs in ring 0. By setting up the GDT, the kernel puts its code and data into a separate address space. All other programs run in ring 3 with their data and code in the same address space. User programs are protected from one another by giving each of them its own page tables. The GDT in Linux 2.2.x is shown below in Figure 2. In practice, a user program can use other code/data segments by setting up an LDT.
A process is a running program with all resources allocated. It is a dynamic concept. In the i386 architecture, “task” is an alternative name for process. For convenience, we will use only “process” here. Process management covers system initialization, process creation and destruction, scheduling, interprocess communication, etc. In Linux, a process is actually represented by a group of data structures including the process context, scheduling data, semaphores, process queue, process ID, time, signals, etc. This group of data is called the Process Control Block, or PCB. In the implementation, the PCB is at the bottom of the process stack.
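Placing the PCB at the bottom of the stack means the kernel can find it from the stack pointer alone. The following is a minimal sketch of that idea, assuming the PCB and the kernel-mode stack share one 8 KB area aligned on an 8 KB boundary, with the PCB at the lowest address and the stack growing down from the top. The size and the function name are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the combined PCB + kernel stack area (8 KB, aligned). */
#define STACK_AREA_SIZE 8192u

/*
 * Given any stack pointer value inside the area, return the address of
 * the PCB at its bottom by clearing the low bits of the pointer.
 */
static uintptr_t pcb_from_stack_pointer(uintptr_t sp)
{
    uintptr_t base = sp & ~(uintptr_t)(STACK_AREA_SIZE - 1u);

    /* The result must lie at or below sp, within the same area. */
    assert(base <= sp && sp - base < STACK_AREA_SIZE);
    return base;
}
```

With these assumptions, a stack pointer of 0xC0123F80 maps to a PCB at 0xC0122000. No table lookup or global variable is needed, which is why the layout is attractive.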
Process management in Linux relies greatly on the hardware architecture. We have just discussed the basics of segmentation with paging on the i386, but in fact, a segment is more than just a block of memory. For example, the Task State Segment (TSS) is one of the most important segments in the i386. It contains much of the data that the system requires. Each process must have a TSS pointed to by the TR register. According to the i386 definition, the selector in TR must select a descriptor in the GDT. Additionally, the selector in LDTR, which defines a process's LDT, must have a corresponding entry in the GDT as well.
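To make the GDT requirement concrete, here is a small C sketch of how an i386 segment selector is encoded: a 13-bit descriptor index, one table-indicator bit (0 for the GDT, 1 for the LDT) and a 2-bit requested privilege level. The helper names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A 13-bit index allows at most 8192 descriptors in one table. */
#define MAX_DESCRIPTORS 8192u

/* Build a selector from a descriptor index, table indicator and RPL. */
static uint16_t make_selector(unsigned index, unsigned use_ldt, unsigned rpl)
{
    assert(index < MAX_DESCRIPTORS && use_ldt <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (use_ldt << 2) | rpl);
}

/* Descriptor index encoded in a selector. */
static unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}

/* True when the selector refers to the GDT (table indicator bit clear). */
static int selects_gdt(uint16_t selector)
{
    return (selector & 0x4u) == 0;
}
```

A selector loaded into TR or LDTR must satisfy `selects_gdt`. For instance, `make_selector(12, 0, 0)` yields 0x60, which refers to GDT entry 12.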
To satisfy these requirements, the Linux 2.2.x GDT holds entries for every possible process. The maximum number of concurrent processes is defined when booting the kernel. The kernel reserves two GDT entries for each process.
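The arithmetic behind the 4090 limit can be sketched as follows. The i386 GDT can hold at most 8192 descriptors, because a selector has a 13-bit index. The sketch assumes that the first 12 entries are set aside for the kernel's own use; that number is an assumption chosen to be consistent with the 4090 figure, and the names below are illustrative rather than the kernel's.

```c
#include <assert.h>

#define GDT_MAX_ENTRIES      8192u /* 13-bit selector index */
#define GDT_RESERVED_ENTRIES 12u   /* assumption: entries not used for processes */
#define GDT_ENTRIES_PER_TASK 2u    /* one TSS and one LDT descriptor per process */

/* Number of processes that fit into the remaining GDT entries. */
static unsigned max_tasks(unsigned total_entries, unsigned reserved_entries)
{
    assert(total_entries >= reserved_entries);
    return (total_entries - reserved_entries) / GDT_ENTRIES_PER_TASK;
}

/* GDT index of the TSS descriptor for process slot n (0-based). */
static unsigned tss_entry(unsigned n)
{
    return GDT_RESERVED_ENTRIES + GDT_ENTRIES_PER_TASK * n;
}

/* GDT index of the LDT descriptor for process slot n (0-based). */
static unsigned ldt_entry(unsigned n)
{
    return tss_entry(n) + 1u;
}
```

Under these assumptions, `max_tasks(GDT_MAX_ENTRIES, GDT_RESERVED_ENTRIES)` evaluates to 4090, and the LDT descriptor of the last slot, `ldt_entry(4089)`, lands on index 8191, the final entry of the table.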