PVFS: A Parallel Virtual File System for Linux Clusters
Using networked file systems is a common method for sharing disk space on UNIX-like systems, including Linux. Sun was the first to embrace this technology by introducing the Network File System (NFS), which provides file sharing via the network. NFS is a client/server system that allows users to view, store and update files on remote computers as though they were on the user's own computer. NFS has since become the standard for file sharing in the UNIX community. Its protocol uses the Remote Procedure Call method of communication between computers.
Using NFS, the user or a system administrator can mount all or a portion of a file system. The portion of your file system that is mounted can be accessed with whatever privileges accompany your access to each file (read-only or read-write).
As the popularity and utility of this type of system have grown, more networked file systems have appeared. These new systems include advances in reliability, security, scalability and speed.
As part of my responsibilities in the Systems Research Department at Ericsson Research Canada, I evaluated Linux-networked file systems to decide what networked file system(s) to adopt for our Linux Clusters. At this stage, we are experimenting with Linux and clustering technologies and trying to build a Linux cluster that provides extremely high scalability and high availability.
An important factor in building such a system is the choice of the networked file system(s) with which it will be used. Among the tested file systems were Coda, Intermezzo, Global File System (GFS), MOSIX File System (MFS) and the Parallel Virtual File System (PVFS). After considering these and other options, the decision was made to adopt PVFS as the networked file system for our test Linux cluster. We are also using the MOSIX file system as part of the MOSIX package (see Resources) that enhances the Linux kernel with cluster-computing capabilities.
In this article, we cover our initial experiences with the PVFS system. We first discuss the design of the PVFS system in order to help familiarize readers with the terminology and components of PVFS. Next, we cover installation and configuration on the 7 CPU Linux Cluster at the Ericsson Systems Research Lab in Montréal. Finally, we discuss the strengths and weaknesses of the PVFS system in order to help others decide if PVFS is right for them.
Linux cluster technology has matured and undergone many improvements in the last few years. Commodity hardware speed has increased, and parallel software has become more advanced. Input/Output (I/O) support has traditionally lagged behind computational advances, however. This limits the performance of applications that process large amounts of data or rely on out-of-core computation.
PVFS was constructed with two main objectives. The foremost is to provide a platform for further research into parallel file systems on Linux clusters. The second objective is to meet the growing need for a high-performance parallel file system for such clusters. PVFS goals are to:
Provide high bandwidth for concurrent read/write operations from multiple processes to a common file
Support multiple APIs, including a native PVFS API, the UNIX/POSIX I/O API, as well as MPI-IO (through ROMIO)
Support Common Unix utilities such as ls, cp and rm for PVFS files
Provide a mechanism for applications developed for the UNIX I/O API to work with PVFS without recompiling
Offer robustness and scalability
Be easy to install and use
One machine, or node, in a cluster, can play a number of roles in the PVFS system. A node can be thought of as being one or more of three different types: compute, I/O or management. Typically, a single node will serve as a management node, while a subset of the nodes will be compute nodes and another subset will serve as I/O nodes. It is also possible to use all nodes as both I/O and compute nodes.
PVFS exists as a set of dæmons and a library of calls to access the file system. There are two types of dæmons, management and I/O. Typically, a single-management dæmon runs on the management node and a number of I/O dæmons run on the I/O nodes. The library of calls is used by applications running on compute nodes, or client nodes, in order to communicate with both the management dæmon and the I/O dæmons.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- July 2016 Issue of Linux Journal
- Tibbo Technology's Tibbo Project System
- Client-Side Performance
- Peppermint 7 Released
- Sony Settles in Linux Battle
- Libarchive Security Flaw Discovered
- Profiles and RC Files
- The Giant Zero, Part 0.x
- Snappy Moves to New Platforms
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide