One Box. Sixteen Trillion Bytes.
I recently needed a lot of disk space, so I decided to build a 16TB server myself from off-the-shelf parts. The project turned out to be rewarding, as it touched on many interesting topics, including hardware RAID, XFS, SATA and the system management issues that come with large filesystems.
I wanted to consolidate several Linux file servers that I use for disk-to-disk backups. These were all in the 3–4TB range and were constantly running out of space, forcing me either to adjust which systems were backed up to which server or to reduce the number of previous backups I could keep on hand. My overall goal for this project was to create a system with a large amount of cheap, fast and reliable disk space. This system would be the destination for a number of daily disk-to-disk backups from a mix of Solaris, Linux and Windows servers. I am familiar with Linux's software RAID and LVM2 features, but I specifically wanted hardware RAID, so the OS would be “unaware” of the RAID layer beneath it. Hardware RAID certainly costs more than a software-based RAID system, and this article is not about creating the cheapest possible solution for a given amount of disk space.
The hardware RAID controller would make it as simple as possible for a non-Linux administrator to replace a failed disk. The RAID controller would send an e-mail message warning about a disk failure, and the administrator typically would respond by identifying the location of the failed disk and replacing it, all with no downtime and no Linux administration skills required. The entire disk replacement experience would be limited to the Web interface of the RAID controller card.
In reality, a hot spare disk would replace any failed disk automatically, but use of the RAID Web interface still would be required to designate any newly inserted disk as the replacement hot spare. For my company, I had specific concerns about the availability of Linux administration skills that justified the expense of hardware RAID.
For me, the above requirements meant using hot-swappable 1TB SATA drives with a fast RAID controller in a system with a decent CPU, adequate memory and redundant power supplies. The chassis had to be rack-mountable and easy to service. Noise was not a factor, as this system would be in a dedicated machine room with more than one hundred other servers.
I decided to build the system around the 3ware 9560 16-port RAID controller, which requires a motherboard with a PCI Express slot that has enough “lanes” (eight in this instance). Beyond that and Gigabit Ethernet, I did not care much about the CPU choice or integrated motherboard features. Having decided on 16 disks, I was pretty much committed to a 3U or larger chassis to fit front-mounted hot-swap disks. This also meant there was plenty of room for a full-height PCI card in the chassis.
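As a rough sanity check on the "sixteen trillion bytes" in the title, the sketch below works out how much of a 16 × 1TB array remains for data once a hot spare and parity disks are set aside. The RAID levels, the single hot spare and the function name `usable_bytes` are illustrative assumptions, not the configuration described in this article.

```python
# Sketch: raw vs. usable capacity for a 16-bay array of 1TB drives.
# The RAID levels and the single hot spare are assumptions for illustration.

TB = 10**12   # decimal terabyte, as used on drive labels
TIB = 2**40   # binary tebibyte, as reported by many OS tools


def usable_bytes(disks, disk_bytes, hot_spares=0, parity_disks=0):
    """Return the bytes left for data after hot spares and parity."""
    data_disks = disks - hot_spares - parity_disks
    if data_disks < 1:
        raise ValueError("no disks left for data")
    return data_disks * disk_bytes


raw = usable_bytes(16, TB)
raid5_spare = usable_bytes(16, TB, hot_spares=1, parity_disks=1)
raid6_spare = usable_bytes(16, TB, hot_spares=1, parity_disks=2)

for label, size in [
    ("raw", raw),
    ("RAID 5 + spare", raid5_spare),
    ("RAID 6 + spare", raid6_spare),
]:
    print(f"{label:15s} {size / TB:5.1f} TB  {size / TIB:6.2f} TiB")
```

The decimal-versus-binary distinction matters on its own: sixteen trillion bytes is roughly 14.55 TiB, so tools that report binary units will show less than "16TB" even before any RAID overhead is subtracted.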
I have built the vast majority of my rackmount servers (more than a hundred) using Supermicro hardware, so I am quite comfortable with its product line. In the past, I have always used Supermicro's “bare-bones” units, which had the motherboard, power supply, fans and chassis already integrated.
For this project, I could not find a prebuilt bare-bones model with the exact feature set I required. I was looking for a system that had lots of cheap disk capacity but did not require lots of CPU power and memory capacity; most high-end configurations seemed to assume quad-core CPUs, lots of memory and SAS disks. The Supermicro SC836TQ-R800B chassis looked like a good fit to me, as it holds 16 SATA drives in a 3U enclosure and has redundant power supplies (the B suffix indicates a black-colored front panel).
Next, I selected the X7DBE motherboard. This model would allow me to use a relatively inexpensive dual-core Xeon CPU and have eight slots available for memory. I could put in 8GB of RAM using cheap 1GB modules. I chose to use a single 1.6GHz Intel dual-core Xeon for the processor, as I didn't think I could justify the cost of multiple CPUs or top-of-the-line quad-core models for the file server role.
I double-checked the description of the Supermicro chassis to see whether the CPU heat sink was included. For the SC836TQ-R800B, it was not; the heat sink had to be ordered separately.
Comments
A Problem with device driver programming
Hello,
I am writing a driver for a board that uses an AMCC S5935 and an EEPROM.
I use lspci -x to see the card's information, but I see the wrong values. I have tested it on MS-DOS and Windows, and I see the same wrong values there.
Once, I prepared a WinDriver driver for this card and saw the correct values. I want to write a program on Linux and I need help. What should I do?
I see a wrong number in Base Address Register 0.
It should be something else.
I must add 0x3c to Base Address Register 0 and read that address, but I read something wrong and I am confused.
I would be very grateful if you could help me.
Thanks
speed test on similar system
Readers of this might be interested in some benchmarks on a similar system broken down by filesystem, types of apps, #s of disks, types of RAID, etc.
The Storage Brick - Fast, Cheap, Reliable Terabytes
http://moo.nac.uci.edu/~hjm/sb/index.html
Cheers
Harry