How Many Disks Are Too Many for a Linux System?
With recent advances in network bandwidth, resources and facilities are becoming centralized once again. The trend of departmental computers and services is fading as larger, more centralized services appear. With regard to this centralization, one of the questions most IT managers ask is “What size server is really needed?” In this article, we analyze how large a server is needed for file services and how many disks a typical Linux system can support. We assume we are dealing with a typical Intel architecture solution and that one or more high-speed network connections are available to handle the load and requests from the client systems. Two or more Gigabit network adaptors should provide enough bandwidth for most applications and data centers.
When attempting to see how many disks can fit into a system, we must analyze two parts of the problem. The first is the operating system itself. Linux has some limitations that prohibit it from hosting a million disk drives. Some of these limitations can be worked around, but others are hard limits. The second problem we face is hardware. The backplane, power requirements and physical construction of the computer physically restrict the number of disks that can be attached to a system.
We start by looking at the hardware limitations of three systems commercially available today. We chose low-end, mid-range and high-end systems: an HP Presario 4400US, a Dell PowerEdge and an IBM zSeries, respectively.
Low-end Computer: The HP Presario lists for $549 and has one extra IDE drive slot for expansion and two additional PCI slots for disk controller expansion. Because the extra disks are not able to pull enough power from the system power supply, the disks need to be attached to a separate chassis and power supply. If we continue with the HP option of the HP Surestore Disk System 2300, we can get 14 disks per disk controller. This will give us a total of 30 disks (28 external disks plus the two IDE drives), about 2TB of storage (assuming 73GB per disk). Each of the external disks will yield about 11MB/sec when all 14 share the 160MB/sec communication channel. We could go with a more robust fibre channel storage solution, the HP Surestore Disk Array XP1024. It allows for 1,024 disks or about 74TB of storage per controller. Unfortunately, the system bus on our Presario 4400 only goes to 100MHz, while the Surestore XP1024 runs at 2GB/sec. The system we selected would not be able to handle such a high-end disk system, thus limiting us to the Ultra160 technology and 30 disks.
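The low-end figures above can be reproduced with a little arithmetic. The following Python sketch is only illustrative: the function names are ours, and the count of two IDE drives (the original drive plus the one expansion slot) is our reading of the configuration rather than a figure stated outright.

```python
# Back-of-the-envelope figures for the low-end configuration.
DISK_SIZE_GB = 73          # per-disk capacity assumed in the article
DISKS_PER_ENCLOSURE = 14   # one Surestore Disk System 2300 per controller
CHANNEL_MB_PER_SEC = 160   # Ultra160 SCSI channel shared by one enclosure


def total_disks(ide_disks: int, controllers: int) -> int:
    """Internal IDE drives plus one full enclosure per add-in controller."""
    return ide_disks + controllers * DISKS_PER_ENCLOSURE


def capacity_tb(disks: int, disk_size_gb: int = DISK_SIZE_GB) -> float:
    """Raw capacity in decimal terabytes (1TB = 1,000GB)."""
    return disks * disk_size_gb / 1000


def per_disk_mb_per_sec(disks_on_channel: int = DISKS_PER_ENCLOSURE) -> float:
    """Share of the channel each disk gets when all disks are busy at once."""
    return CHANNEL_MB_PER_SEC / disks_on_channel


if __name__ == "__main__":
    # Assumption: two IDE drives (original plus one expansion), two PCI controllers.
    disks = total_disks(ide_disks=2, controllers=2)
    print(f"disks: {disks}")
    print(f"capacity: {capacity_tb(disks):.2f}TB")
    print(f"per-disk share of channel: {per_disk_mb_per_sec():.1f}MB/sec")
```

The per-disk figure is a worst case in which all 14 disks on a channel transfer simultaneously; a lightly loaded channel would give each active disk a larger share.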
Mid-range Computer: If we decide to raise our expectations and look at a Dell PowerEdge 2650, which lists for $2,199, we can attach significantly more storage. The internal controller supports five SCSI disks and has the option of three expansion slots, one at 133MHz and two at 100MHz. The motherboard backplane runs at 400MHz, so the embedded SCSI controller can operate much faster than the expansion controllers. By using the PowerVault 220S/221S SCSI Disk Storage Unit, we can attach 47 disks (three 14-unit PowerVaults plus five internal disks) for a total storage size of 3.4TB. We also can expand the memory of this system to 6GB of RAM, which would handle the disk operations much better than the 512MB limitation of the Presario 4400. On this system, we also could go for a more robust fibre channel storage solution, the Dell/EMC FC4700-2. This system allows 120 drives per fibre channel, or 360 disks yielding 63TB of storage on a single system. Since this system transfers data at 360MB/sec, the backplane running at 133MHz could easily handle the operations, but we are reaching the limits for the 100MHz backplane.
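To see why a 360MB/sec array sits comfortably on the 133MHz slot but crowds a 100MHz one, it helps to convert clock rate into peak bandwidth. The sketch below assumes 32-bit (4-byte) slots, which the text does not state; with a 64-bit slot the headroom would double. The function names are ours, and the results are theoretical peaks, not sustained throughput.

```python
# Rough bus-headroom check for the mid-range expansion slots.
BUS_WIDTH_BYTES = 4  # assumption: 32-bit slots; the width is not given in the text


def bus_peak_mb_per_sec(clock_mhz: float, width_bytes: int = BUS_WIDTH_BYTES) -> float:
    """Theoretical peak: one transfer of width_bytes per clock cycle."""
    return clock_mhz * width_bytes


def utilization(load_mb_per_sec: float, clock_mhz: float) -> float:
    """Fraction of the slot's theoretical peak consumed by the storage load."""
    return load_mb_per_sec / bus_peak_mb_per_sec(clock_mhz)


if __name__ == "__main__":
    array_mb_per_sec = 360  # FC4700-2 transfer rate quoted in the article
    for clock in (133, 100):
        peak = bus_peak_mb_per_sec(clock)
        used = utilization(array_mb_per_sec, clock)
        print(f"{clock}MHz slot: peak {peak:.0f}MB/sec, {used:.0%} used")
```

Under this assumption the array would consume roughly two-thirds of the 133MHz slot and about 90% of a 100MHz slot, which is consistent with the observation that the slower slots are near their limit.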
High-end Computer: At the very high end, we could use the IBM zSeries 900, which supports up to 64GB of memory and multiple fibre channel connectors. There really isn't a limit to the number of disks that can be supported through this fibre channel configuration, but connection to a single disk subsystem is limited to 224 drives on a single instance (16TB per subsystem). Direct SCSI is not supported, but a fibre to SCSI interchange is available.
In summary, on the low end, the hardware limitations begin with the backplane support of multiple controllers limiting the system to 30 disks. The mid-range system tops out at 360 disks, while the high-end machine does not have a practical upper limit. All of these systems could utilize storage area networks or network attached storage to increase their limits, but initially we wanted to analyze the hardware limitations imposed by the architecture for direct attached storage.