Expanding Options for Clustering
Ten years ago, clustering meant one thing—grouping a small number of large servers together with shared disks to host on-line transaction processing applications with better availability and, in some cases, scalability. Since then, we've seen the rise of internet-based applications and thin commodity servers, and a whole new model of open, scalable, distributed computing built upon Linux.
Many vendors in the Linux space are working on reproducing the kind of clustering that existed ten years ago on traditional UNIX OLTP systems. This may not be the right strategy. Rather, clustering should look toward the types of applications and architectures that are being deployed in the internet data center and on sets of thin Linux servers. In the internet data center—whether in a hosting environment or in a corporation—application architectures have several characteristics that are fundamentally different from the architectures of the old glass house. In particular, the internet data center architectures share some important common themes:
Architectures are multitiered (usually web servers, application servers, and database servers).
Each tier may have literally dozens of servers.
Each tier has unique requirements for shared and replicated data.
In a large e-business or web content site, managing the entire distributed set of resources that provide a service over the Web is a bigger challenge than simply providing failover for individual servers.
While some applications, like OLTP or some kinds of e-commerce failover, still require shared disk solutions, modern applications, in many cases, can take advantage of software-based replication provided by middleware. What they lack is the infrastructure to bind those solutions together and integrate them in the larger computing environment. Application management—including failover—remains one of the key problems facing today's operations manager, but the options are, fortunately, getting better.
The applications that do require shared access to data now often involve dozens of machines, potentially in multiple Points of Presence (POPs). This means that even the shared disk solutions of traditional clustering, which are based on physically shared SCSI connections, may not be adequate. Those solutions are bound by physical cabling and SCSI address limitations and usually can only service two or four systems.
The web server tier, a stronghold of Linux, is characterized by either read-only or dynamically created data. This data can be, and often is, shared across multiple servers by using network-attached storage appliances running NFS. Each server mounts the NFS appliance and has access to the right data. For this tier, complicated shared SCSI cabling schemes are totally inappropriate—a TCP/IP network provides all the interconnect between servers and storage. However, clustering software that can monitor the health and performance of web servers, and take appropriate actions based on the input from those monitors, is still very important.
In the traditional corporate glass house, applications were typically client/server and were based on a relational database. For the most part, data was not stored in the application layer—everything was put in the database. The mix is much richer in the Internet/Linux world. The emergence of Java and Enterprise JavaBeans has led developers to serialize application-layer objects in the file system, outside the database. In addition, applications now manipulate a much richer universe of data—including digital streaming media, text and other data types that are not naturally stored in SQL back-ends.
For this reason, the applications layer has shared data and failover requirements that will be met by neither network-attached storager or shared SCSI. Network-attached storage typically does not work because NFS protocols do not provide enough consistency guarantees to be suitable where multiple servers may be writing the same data. Shared SCSI is not sufficient because replication nor sharing may need to happen across many systems or across multiple POPs. In many cases, therefore, applications provide their own replication. For example, consider a foreign-exchange trading application in use at several dozen large banks. Replication of real-time trading data is maintained at the application level, combined with cluster-based failover.
The database layer is the domain of traditional clustering solutions from the big UNIX vendors—Sun, HP and IBM. Originally, two servers would be connected to some SCSI disks, with one server being an active master and the other being a passive standby. In more modern implementations, multiple servers share a fiber channel or SCSI storage system and use a parallel database like Oracle Parallel Server (OPS) to manage concurrent access by each node in the cluster.
However, high-end fiber channel SANs are expensive, as is software like Oracle Parallel Server. Often, the price/performance of these types of solutions are not appropriate for the thin server Linux market or for the internet data center. The active/passive SCSI approach is also wasteful of resources and suffers limited scalability. Fortunately, there are other approaches more suited to the kinds of applications being run on Linux.
For example, database vendors have put much effort into shared-nothing software replication approaches (Informix Replicator, DB/2 Replication, Oracle Multi-master and hot standby). Using these tools, it is possible for multiple servers to participate at the database tier without having shared storage. However, the databases themselves often lack the facilities to detect failure, sequence the actions to initiate failover and guarantee application integrity. For this reason, they are best paired with a clustering solution that provides these services.
A real-life example is a chip fabrication facility running a manufacturing automation application on Intel-based thin servers. Because of the value of this application, the chip manufacturer requires three-way replication, so that even if the primary fails, there are two machines ready to take over. The solution chosen was to use Informix database-replication services running on physically separated machines without shared storage in conjunction with a cluster management engine.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Back to Backups
- A New Version of Rust Hits the Streets
- Google's Abacus Project: It's All about Trust
- Secure Desktops with Qubes: Introduction
- Seeing Red and Getting Sleep
- Fancy Tricks for Changing Numeric Base
- Secure Desktops with Qubes: Installation
- Working with Command Arguments
- CentOS 6.8 Released
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide