Use Linux as a SAN Provider
Another feature often run with iSCSI is multipathing. This allows Linux to use multiple networks at once to access the iSCSI target. It usually is run on separate physical networks, so in the event that one fails, the other still will be up and the initiator will not experience loss of a volume or a system crash. Multipathing can be set up in two ways, either active/passive or active/active. Active/active generally is the preferred way, as it can be set up not only for redundancy, but also for load balancing. Like Fibre Channel, multipath assigns World Wide Identifiers (WWIDs) to devices. These are guaranteed to be unique and unchanging. When one of the paths is removed, the other one continues to function. The initiator may experience slower response time, but it will continue to function. Re-integrating the second path allows the system to return to its normal state.
When working with local disks, people often turn to Linux's software RAID or LVM systems to provide redundancy, growth and snapshotting. Because SAN volumes show up as block devices, it is possible to use these tools on them as well. Use them with care though. Setting up RAID 5 across three iSCSI volumes causes a great deal of network traffic and almost never gives you the results you're expecting. Although, if you have enough bandwidth available and you aren't doing many writes, a RAID 1 setup across multiple iSCSI volumes may not be completely out of the question. If one of these volumes drops, rebuilding may be an expensive process. Be careful about how much bandwidth you allocate to rebuilding the array if you're in a production environment. Note that this could be used at the same time as multipathing in order to increase your bandwidth.
To set up RAID 1 over iSCSI, first load the RAID 1 module:
After partitioning your first disk, /dev/sdb, copy the partition table to your second disk, /dev/sdc. Remember to set the partition type to Linux RAID autodetect:
sfdisk -d /dev/sdb | sfdisk /dev/sdc
Assuming you set up only one partition, use the mdadm command to create the RAID group:
mdadm --create /dev/md0 --level=1 --raid-disks=2 /dev/sdb1 /dev/sdc1
After that, cat the /etc/mdstat file to watch the state of the synchronization of the iSCSI volumes. This also is a good time to measure your network throughput to see if it will stand up under production conditions.
Running a SAN on Linux is an excellent way to bring up a shared environment in a reasonable amount of time using commodity parts. Spending a few thousand dollars to create a multiterabyte array is a small budget when many commercial arrays easily can extend into the tens to hundreds of thousands of dollars. In addition, you gain flexibility. Linux allows you to manipulate the underlying technologies in ways most of the commercial arrays do not. If you're looking for a more-polished solution, the Openfiler Project provides a nice layout and GUI to navigate. It's worth noting that many commercial solutions run a Linux kernel under their shell, so unless you specifically need features or support that isn't available with standard Linux tools, there's little reason to look to commercial vendors for a SAN solution.
Michael Nugent has spent a good deal of his time designing large-scale solutions to fit into tiny budgets, leveraging Linux to fulfill roles that typically would be filled by large commercial appliances. Recently, Michael has been working to design large, private clouds for SaaS environments in the financial industry. When not building systems, he likes sailing, scuba diving and hanging out with his cat, MIDI. Michael can be reached at firstname.lastname@example.org.
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
|Introduction to MapReduce with Hadoop on Linux||Jun 05, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Linux Systems Administrator
- RSS Feeds
- Introduction to MapReduce with Hadoop on Linux
- Validate an E-Mail Address with PHP, the Right Way
- Weechat, Irssi's Little Brother
- Tech Tip: Really Simple HTTP Server with Python
- New Products
- Poul-Henning Kamp: welcome to
1 hour 10 min ago
- This has already been done
1 hour 11 min ago
- Reply to comment | Linux Journal
1 hour 57 min ago
- Welcome to 1998
2 hours 45 min ago
- notifier shortcomings
3 hours 9 min ago
4 hours 46 min ago
- Android User
4 hours 47 min ago
- Reply to comment | Linux Journal
6 hours 40 min ago
9 hours 30 min ago
- This is a good post. This
14 hours 43 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?