How Many Disks Are Too Many for a Linux System?
With recent advances in network bandwidth, resources and facilities are becoming centralized once again. The trend of departmental computers and services is fading as larger, more centralized services appear. With regard to this centralization, one of the questions IT managers most often ask is “What size server is really needed?” In this article, we analyze how large a server is needed for file services and how many disks a typical Linux system can support. We assume we are dealing with a typical Intel architecture solution and that one or more high-speed network connections are available to handle the load and requests from the client systems. If we assume two or more Gigabit network adaptors, this should provide enough bandwidth for most applications and data centers.
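The two-adaptor assumption puts a rough ceiling on how fast the server can feed clients; a minimal sketch, using the nominal Gigabit Ethernet line rate and ignoring protocol overhead:

```python
# Rough ceiling on the aggregate bandwidth of the assumed network front end.
# Two adaptors is the article's assumption; 1,000Mbit/sec is the nominal
# Gigabit Ethernet line rate, with no allowance for protocol overhead.
adapters = 2
line_rate_mbit = 1000
aggregate_mb_s = adapters * line_rate_mbit / 8
print(aggregate_mb_s)   # 250.0 MB/sec theoretical peak
```

This 250MB/sec figure is a useful yardstick for the disk bandwidth numbers that follow.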
When attempting to see how many disks can fit into a system, we must analyze two parts of the problem. The first is the operating system itself. Linux has some limitations that prohibit it from hosting a million disk drives. Some of these limitations can be worked around, but others are hard limits. The second problem we face is hardware. The backplane, power requirements and physical construction of the computer restrict the number of disks that can be attached to a system.
We start by looking at the hardware limitations of three systems commercially available today. We chose low-end, mid-range and high-end systems: an HP Presario 4400US, a Dell PowerEdge and an IBM zSeries, respectively.
Low-end Computer: The HP Presario lists for $549 and has one extra IDE drive slot for expansion and two additional PCI slots for disk controller expansion. Because the extra disks are not able to pull enough power from the system power supply, the disks need to be attached to a separate chassis and power supply. If we continue with HP's Surestore Disk System 2300, we can get 14 disks per disk controller. This gives us a total of 30 disks, about 2TB of storage (assuming 73GB per disk). Each of these disks will yield about 11MB/sec across the 160MB/sec communication channel. We could go with a more robust fibre channel storage solution, the HP Surestore Disk Array XP1024. It allows for 1,024 disks, or about 74TB of storage, per controller. Unfortunately, the system bus on our Presario 4400 runs at only 100MHz, while the Surestore XP1024 transfers data at 2GB/sec. The system we selected would not be able to handle such a high-end disk system, thus limiting us to the Ultra160 technology and 30 disks.
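The low-end numbers above can be sanity-checked with a little arithmetic; a minimal sketch, using only the figures quoted in the text (30 disks, 73GB each, 11MB/sec per disk, a 160MB/sec Ultra160 channel):

```python
# Capacity: 30 disks at 73GB each, expressed in TB (1TB = 1024GB).
disks = 30
capacity_tb = disks * 73 / 1024
print(round(capacity_tb, 1))   # ~2.1TB, the "about 2TB" in the text

# Bandwidth: a full 14-disk Surestore 2300 shelf against its Ultra160 channel.
shelf_demand = 14 * 11         # 154MB/sec of aggregate disk throughput...
channel = 160                  # ...just fits within the 160MB/sec channel
print(shelf_demand, channel)
```

The 14-disk shelf is thus well matched to the Ultra160 channel: running all disks flat out consumes nearly the entire bus.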
Mid-range Computer: If we decide to raise our expectations and look at a Dell PowerEdge 2650, which lists for $2,199, we can attach significantly more storage. The internal controller supports five SCSI disks and has the option of three expansion slots, one at 133MHz and two at 100MHz. The motherboard backplane runs at 400MHz, so the embedded SCSI controller can operate much faster than the expansion controllers. By using the PowerVault 220S/221S SCSI Disk Storage Unit, we can attach 47 disks, three 14-unit PowerVaults and five internal disks, for a total storage size of 3.4TB. We also can expand the memory of this system to 6GB of RAM, which would handle the disk operations much better than the 512MB limitation of the Presario 4400. On this system, we also could go for a more robust fibre channel storage solution, the Dell/EMC FC4700-2. This system allows 120 drives per fibre channel; with three channels, that comes to 360 disks yielding 63TB of storage on a single system. Since this system transfers data at 360MB/sec, the backplane running at 133MHz could easily handle the operations, but we are reaching the limits for the 100MHz backplane.
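The mid-range arithmetic can be sketched the same way. The disk counts and the 360MB/sec fibre channel rate come from the text; the 32-bit slot width used to estimate PCI throughput is our assumption, chosen to illustrate why 360MB/sec approaches the limit of a 100MHz slot:

```python
# SCSI side: three 14-unit PowerVault shelves plus five internal disks.
scsi_disks = 3 * 14 + 5
print(scsi_disks, round(scsi_disks * 73 / 1024, 1))   # 47 disks, ~3.4TB

# PCI side: theoretical slot throughput = clock rate x bus width.
width_bytes = 4               # assumed 32-bit slot (4 bytes per transfer)
slot_100 = 100 * width_bytes  # 400MB/sec theoretical peak at 100MHz
fc_rate = 360                 # Dell/EMC FC4700-2 transfer rate
print(slot_100 - fc_rate)     # only 40MB/sec of headroom left
```

Under that assumption, the 100MHz slot tops out around 400MB/sec, leaving little margin over the FC4700-2's 360MB/sec, while a 133MHz slot has comfortable headroom.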
High-end Computer: At the very high end, we could use the IBM zSeries 900, which supports up to 64GB of memory and multiple fibre channel connectors. There really isn't a limit to the number of disks that can be supported through this fibre channel configuration, but connection to a single disk subsystem is limited to 224 drives on a single instance (16TB per subsystem). Direct SCSI is not supported, but a fibre-to-SCSI bridge is available.
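As a consistency check on the high-end figure, the 16TB-per-subsystem limit works out to roughly the same 73GB drive size used throughout this article:

```python
# 224 drives per subsystem, 16TB per subsystem (1TB = 1024GB).
drives = 224
subsystem_gb = 16 * 1024
print(round(subsystem_gb / drives, 1))   # ~73.1GB per drive
```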
In summary, on the low end, the hardware limitations begin with the backplane support of multiple controllers limiting the system to 30 disks. The mid-range system tops out at 360 disks, while the high-end machine does not have a practical upper limit. All of these systems could utilize storage area networks or network attached storage to increase their limits, but initially we wanted to analyze the hardware limitations imposed by the architecture for direct attached storage.