Use Linux as a SAN Provider
Storage Area Networks (SANs) are becoming commonplace in the industry. Once restricted to large data centers and Fortune 100 companies, this technology has dropped in price to the point that small startups are using it for centralized storage. The strict definition of a SAN is a set of storage devices that are accessible over the network at a block level. This differs from a Network Attached Storage (NAS) device in that a NAS runs its own filesystem and presents that volume to the network; it does not need to be formatted by the client machine. Whereas a NAS usually is presented with the NFS or CIFS protocol, a SAN running on the same Ethernet often is presented as iSCSI, although other technologies exist.
iSCSI is the same SCSI protocol used for local disks, but encapsulated inside IP to allow it to run over the network in the same way any other IP protocol does. Because of this, and because it is seen as a block device, it often is almost indistinguishable from a local disk from the point of view of the client's operating system and is completely transparent to applications.
The iSCSI protocol is defined in RFC 3720 and runs over TCP ports 860 and 3260. In addition to iSCSI, many SANs use Fibre Channel as the transport. Fibre Channel is an improvement over Gigabit Ethernet, mainly because it runs at 4 or 8Gb/s as opposed to 1Gb/s. By the same reasoning, 10 Gigabit Ethernet would have an advantage over Fibre Channel.
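Because iSCSI is ordinary TCP traffic, a quick way to see whether a storage host is reachable at all is to try opening a connection to the port. The following is a minimal sketch, not part of any iSCSI tooling: the function name portal_reachable is my own, and a successful connection shows only that something is listening on the port, not that it speaks iSCSI.

```python
import socket

ISCSI_PORT = 3260  # one of the two TCP ports named in the text (the other is 860)


def portal_reachable(host, port=ISCSI_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This checks reachability only; it does not perform an iSCSI login.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False
```

Calling portal_reachable with the address of your storage host returns True or False within the timeout, which is handy for ruling out firewall or routing problems before looking at the iSCSI layer itself.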
The downside to Fibre Channel is the expense. A Fibre Channel switch often runs many times the cost of a typical Ethernet switch and comes with far fewer ports. There are other advantages to Fibre Channel, such as the ability to run over very long distances, but these aren't usually the decision-making factors when purchasing a SAN.
In addition to Fibre Channel and iSCSI, ATA over Ethernet (AoE) also is starting to make some headway. In the same way that iSCSI provides SCSI commands over an IP network, AoE provides ATA commands over an Ethernet network. AoE runs directly on Ethernet, not on top of IP the way iSCSI does. Because of this, it has less overhead and often is faster than iSCSI in the same environment. The downside is that it cannot be routed. AoE also is far less mature than iSCSI, and fewer large networking companies are looking to support it. Another disadvantage of AoE is that it has no built-in security outside of MAC filtering. As it is relatively easy to spoof a MAC address, this means anyone on the local network can access any AoE volume.
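To put a rough number on the overhead that AoE avoids by skipping IP, the sketch below counts only the minimum IPv4 and TCP headers carried in each standard 1500-byte Ethernet payload. It is a lower bound under stated assumptions: no IP or TCP options, no jumbo frames, and it ignores the headers that iSCSI and AoE each add for their own protocols as well as the CPU cost of TCP processing.

```python
ETHERNET_MTU = 1500    # standard Ethernet payload size in bytes (assumes no jumbo frames)
IPV4_HEADER_MIN = 20   # minimum IPv4 header, no options
TCP_HEADER_MIN = 20    # minimum TCP header, no options


def ip_tcp_overhead(mtu=ETHERNET_MTU):
    """Return (bytes, fraction) of each frame's payload spent on IP and TCP headers."""
    overhead = IPV4_HEADER_MIN + TCP_HEADER_MIN
    return overhead, overhead / mtu


overhead_bytes, fraction = ip_tcp_overhead()
print(f"{overhead_bytes} bytes per frame, {fraction:.1%} of a {ETHERNET_MTU}-byte payload")
```

The per-frame byte count is small, so most of any speed difference in practice is likely to come from avoiding TCP/IP processing rather than from the saved bytes alone.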
The first step in moving down the road to a SAN is the choice of whether to use it. Although a SAN often is faster than a NAS, it also is less flexible. For example, the size or the filesystem of a NAS volume usually can be changed on the host system without the client system having to make any changes. With a SAN, because it is seen as a block device like a local disk, it is subject to a lot of the same rules as a local disk. So, if a client is running its /usr filesystem on an iSCSI device and that volume needs to be resized, it would have to be taken off-line and modified not just on the server side, but also on the client side. The client would have to grow the filesystem on top of the device.
There are some significant differences between a SAN volume and a local disk. A SAN volume can be shared between computers. Often, this presents all kinds of locking problems, but with an application aware that its volume is shared out to multiple systems, this can be a powerful tool for failover, load balancing or communication. Many filesystems exist that are designed to be shared. GFS from Red Hat and OCFS from Oracle (both GPL) are great examples of such filesystems.
The network is another consideration in choosing a SAN. Gigabit Ethernet is the practical minimum for running modern network storage. Although a 100- or even a 10-megabit network theoretically would work, the practical results would be extremely slow. If you are running many volumes or requiring lots of reads and writes to the SAN, consider running a dedicated gigabit network. This will prevent the SAN data from conflicting with your regular IP data and, as an added bonus, increase security on your storage.
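A little arithmetic shows why the slower links are impractical. The sketch below computes the ideal time to move one gigabyte at each link speed mentioned above. It assumes the full nominal line rate and ignores all protocol overhead, latency and contention, so real transfers will be slower. The function name transfer_seconds is my own.

```python
def transfer_seconds(size_bytes, link_megabits_per_second):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    bits = size_bytes * 8
    return bits / (link_megabits_per_second * 1_000_000)


ONE_GB = 10**9  # decimal gigabyte, chosen for round numbers

for name, mbps in [("10 Mb/s", 10), ("100 Mb/s", 100), ("1 Gb/s", 1000)]:
    print(f"{name:>9}: {transfer_seconds(ONE_GB, mbps):7.1f} s per GB")
```

Even in this best case, a gigabyte takes 800 seconds at 10 megabits, 80 seconds at 100 megabits and 8 seconds at gigabit speed.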
Security also is a concern. Because none of the major SAN protocols are encrypted, a network sniffer could expose your data. In theory, iSCSI could be run over IPsec or a similar protocol, but without hardware acceleration, doing so would mean a large drop in performance. Short of that, you should at the very least keep the SAN data on its own VLAN.
Because it is the most popular of the various SAN protocols available for Linux, I use iSCSI in the examples in this article. However, the concepts should transfer easily to AoE if you've selected that for your systems. If you've selected Fibre Channel, the concepts still apply, but less directly. You will need to rely more on your switch for most of your authentication and routing. On the positive side, most modern Fibre Channel switches provide excellent setup tools for doing this.
To this point, I have been using the terms client and server, but that is not completely accurate for iSCSI technology. In the iSCSI world, people refer to clients as initiators and to servers or other iSCSI storage devices as targets. Here, I use the Open-iSCSI Project to provide the initiator and the iSCSI Enterprise Target (IET) Project to provide the target. These pieces of software are available in the default repositories of most major Linux distributions. Consult your distribution's documentation for the package names to install, or download the source from www.open-iscsi.org and iscsitarget.sourceforge.net. Additionally, you'll need iSCSI over TCP/IP in your kernel, selectable in the low-level SCSI drivers section.
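To make the initiator and target roles concrete, here is a small sketch of the two steps an initiator typically takes: asking a portal which targets it offers, then logging in to one of them. It assumes the iscsiadm utility that ships with Open-iSCSI is installed. The helper names are my own, and the portal address and target name are hypothetical placeholders. The code only builds the command lines and prints them; it does not run anything.

```python
def discovery_command(portal):
    """argv asking a portal (host or host:port) which targets it offers."""
    return ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal]


def login_command(target_name, portal):
    """argv logging the initiator in to one named target on a portal."""
    return ["iscsiadm", "-m", "node", "-T", target_name, "-p", portal, "--login"]


# Hypothetical values for illustration only; substitute your own.
PORTAL = "192.0.2.10:3260"
TARGET = "iqn.2016-07.com.example:storage.disk1"

for argv in (discovery_command(PORTAL), login_command(TARGET, PORTAL)):
    print(" ".join(argv))
```

Keeping the commands as argument lists makes it straightforward to hand them to subprocess.run later without shell quoting problems. After a successful login, the target's volume should show up on the initiator as a regular SCSI block device.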