Red Hat Enterprise Linux Cluster Suite
When mission-critical applications fail, so does your business. This often is a true statement in today's environments, where most organizations spend millions of dollars making their services available 24/7, 365 days a year. Organizations, regardless of whether they are serving external customers or internal customers, are deploying highly available solutions to make their applications highly available.
In view of this growing demand, almost every IT vendor currently is providing high-availability solutions for its specific platform. Famous commercial high-availability solutions include IBM's HACMP, Veritas' Cluster Server and HP's Serviceguard.
If you're looking for a commercial high-availability solution on Red Hat Enterprise Linux, the best choice probably is the Red Hat Cluster Suite.
In early 2002, Red Hat introduced the first member of its Red Hat Enterprise Linux family of products, Red Hat Enterprise Linux AS (originally called Red Hat Linux Advanced Server). Since then, the family of products has grown steadily, and it now includes Red Hat Enterprise Linux ES (for entry- and mid-range servers) and Red Hat Enterprise Linux WS (for desktops/workstations). These products are designed specifically for use in enterprise environments to deliver superior application support, performance, availability and scalability.
The original release of Red Hat Enterprise Linux AS version 2.1 included a high-availability clustering feature as part of the base product. This feature was not included in the smaller Red Hat Enterprise Linux ES product. However, with the success of the Red Hat Enterprise Linux family, it became clear that high-availability clustering was a feature that should be made available for both AS and ES server products. Consequently, with the release of Red Hat Enterprise Linux version 3 in October 2003, the high-availability clustering feature was packaged into an optional layered product called the Red Hat Cluster Suite, and it was certified for use on both the Enterprise Linux AS and Enterprise Linux ES products.
The RHEL cluster suite is a separately licensed product and can be purchased from Red Hat on top of Red Hat's base ES Linux license.
The Red Hat Cluster Suite has two major features. One is the Cluster Manager that provides high availability, and the other feature is called IP load balancing (originally called Piranha). The Cluster Manager and IP load balancing are complementary high-availability technologies that can be used separately or in combination, depending on application requirements. Both of these technologies are integrated in Red Hat's Cluster Suite. In this article, I focus on the Cluster Manager.
Table 1 shows the major components of the RHEL Cluster Manager.
Table 1. RHEL Cluster Manager Components
|Fence||fenced||Provides fencing infrastructure for specific hardware platforms.|
|DLM||libdlm, dlm-kernel||Contains distributed lock management (DLM) library.|
|CMAN||cman||Contains the Cluster Manager (CMAN), which is used for managing cluster membership, messaging and notification.|
|GFS and related locks||Lock_NoLock||Contains shared filesystem support that can be mounted on multiple nodes concurrently.|
|GULM||gulm||Contains the GULM lock management user-space tools and libraries (an alternative to using CMAN and DLM).|
|Rgmanager||clurgmgrd, clustat||Manages cluster services and resources.|
|CCS||ccsd, ccs_test and ccs_tool||Contains the cluster configuration services dæmon (ccsd) and associated files.|
|Cluster Configuration Tool||System-config-cluster||Contains the Cluster Configuration Tool, used to configure the cluster and display the current status of the nodes, resources, fencing agents and cluster services graphically.|
|Magma||magma and magma-plugins||Contains an interface library for cluster lock management and required plugins.|
|IDDEV||iddev||Contains the libraries used to identify the filesystem (or volume manager) in which a device is formatted.|
Lock management is a common cluster infrastructure service that provides a mechanism for other cluster infrastructure components to synchronize their access to shared resources. In a Red Hat cluster, DLM (Distributed Lock Manager) or, alternatively, GULM (Grand Unified Lock Manager) are possible lock manager choices. GULM is a server-based unified cluster/lock manager for GFS, GNBD and CLVM. It can be used in place of CMAN and DLM. A single GULM server can be run in standalone mode but introduces a single point of failure for GFS. Three or five GULM servers also can be run together, in which case the failure of one or two servers can be tolerated, respectively. GULM servers usually are run on dedicated machines, although this is not a strict requirement.
In my cluster implementation, I used DLM, and it runs in each cluster node. DLM is good choice for small clusters (up to two nodes), because it removes quorum requirements as imposed by the GULM mechanism).
Based on DLM or GLM locking functionality, there are two basic techniques that can be used by the RHEL cluster for ensuring data integrity in concurrent access environments. The traditional way is the use of CLVM, which works well in most RHEL cluster implementations with LVM-based logical volumes.
Another technique is GFS. GFS is a cluster filesystem that allows a cluster of nodes to access simultaneously a block device that is shared among the nodes. It employs distributed metadata and multiple journals for optimal operation in a cluster. To maintain filesystem integrity, GFS uses a lock manager (DLM or GULM) to coordinate I/O. When one node changes data on a GFS filesystem, that change is visible immediately to the other cluster nodes using that filesystem.
Hence, when you are implementing a RHEL cluster with concurrent data access requirements (such as, in the case of an Oracle RAC implementation), you can use either GFS or CLVM. In most Red Hat cluster implementations, GFS is used with a direct access configuration to shared SAN from all cluster nodes. However, for the same purpose, you also can deploy GFS in a cluster that is connected to a LAN with servers that use GNBD (Global Network Block Device) or two iSCSI (Internet Small Computer System Interface) devices.
Both GFS and CLVM use locks from the lock manager. However, GFS uses locks from the lock manager to synchronize access to filesystem metadata (on shared storage), while CLVM uses locks from the lock manager to synchronize updates to LVM volumes and volume groups (also on shared storage).
For nonconcurrent RHEL cluster implementations, you can rely on CLVM, or you can use native RHEL journaling-based techniques (such as ext2 and ext3). For nonconcurrent access clusters, data integrity issues are minimal; I tried to keep my cluster implementations simple by using native RHEL OS techniques.
- Integrating Trac, Jenkins and Cobbler—Customizing Linux Operating Systems for Organizational Needs
- Tech Tip: Really Simple HTTP Server with Python
- EdgeRouter Lite
- Non-Linux FOSS: Remember Burning ISOs?
- Returning Values from Bash Functions
- Using Django and MongoDB to Build a Blog
- Cooking with Linux - Serious Cool, Sysadmin Style!
- RSS Feeds
- Hack and / - Linux Troubleshooting, Part I: High Load