IBM's Journaled Filesystem
In the previous section, we explained how to create a JFS filesystem using an existing spare partition. Now, we demonstrate how to migrate your current system from another filesystem, such as ext2, to JFS. We look at how to introduce a JFS partition to your Linux configuration. In a second step, we make that partition the root filesystem.
What partition scheme do you need to create a JFS root partition? The migration process requires an empty partition. Let's assume that /dev/hda5 is the current root partition and that it uses ext2. We use /dev/hda6, which is our empty partition, as our JFS root partition. This partition needs to be of equal or larger size than the current root partition. The ext2 partition will be duplicated on the JFS partition. Afterward, if you do not wish to keep the ext2 partition, you can reformat it without losing your Linux system.
In order to create a root JFS partition on /dev/hda6, follow the instructions mentioned earlier to get support for JFS in the kernel. To make this partition a bootable partition for Linux, you need to reproduce a complete Linux installation. A simple way of doing so is to copy all the files to the JFS partition. First, mount the filesystem:
% mount -t jfs /dev/hda6 /jfs
Then, copy all files from ext2 filesystem to the JFS partition:
% cd / % cp -a bin etc lib boot dev home usr var [...] /jfs You need special handling for /proc and /tmp: % mkdir /jfs/proc % chmod 555 /jfs/proc % mkdir /jfs/tmp % chmod 1777 /jfs/tmpIt is important to create /proc and /tmp with the correct permissions. Permission 1777 means the only people who can rename or remove any file in that directory are the file's owner, the directory's owner and the superuser. The last steps involve changing /etc/lilo.conf and /etc/fstab. First, we change lilo.conf to boot using the kernel on our JFS partition. Notice that root is different from the first entry we made as well as from the label. Thus, the image to be booted will not be found in /dev/hda5/boot, but in /dev/hda6/boot:
image=/boot/vmlinuz-jfs label=jfs-kernel read-only root=/dev/hda6Finally, we need to change /jfs/etc/fstab to tell the Linux system what filesystem it is using. Change the following line:
LABEL=/ / ext2 defaults 1 1so that it says:
/dev/hda6 / jfs defaults 1 1Now, you can reboot and choose jfs-kernel. This will start Linux with the JFS root filesystem.
After a crash, the log replay occurs automatically. Instead of the usual fsck messages, you should see JFS journaling messages. Replaying the log is necessary when a filesystem becomes unstable.
One of the responsibilities of the Open Systems Lab at Ericsson Research is to design, implement and benchmark carrier-class platforms that run telecom applications. These carrier-class platforms have strict requirements regarding scalability, reliability and high availability. They must operate nonstop, regardless of hardware or software errors, and they must allow operators to upgrade hardware and software during operation without disturbing the applications that run on them. As a result, they must offer extreme reliability and high availability, often referred to as a five-nines availability (99.999% uptime).
To maintain such availability, these carrier-class platforms were designed with many features that allow software to be upgraded while the system is running and providing service. These features include fault tolerance implemented in the software, network redundancy to handle catastrophic situations such as earthquakes and in-service upgrade features.
Although many precautions have been taken to protect the system, there is always a remote chance that the processor (or server node) will fail. Thus, as a last resort, we need to reboot the processor. In this extreme case, we need to be able to reboot the processor and bring it to normal status, serving requests as soon as possible, with a minimal downtime.
Our interest in journaling filesystems for the carrier-class Linux platform came from the fact that these filesystems provide a fast filesystem restart. In the case of a system crash, a journaling filesystem can restore a filesystem to a consistent state more quickly and more reliably than other filesystems.
Initially, we started to experiment with the IBM JFS in early 2000. The JFS team was helpful and supportive, and their representative, Steve Best, visited our lab in January 2001. Since then, we have been following JFS development closely and upgrading our servers to support the latest versions.
The first installations of JFS were done on 1U rackmount units with Celeron 500MHz processors, 256MB of RAM and 20GB IDE disks. These units provided us with a working environment to test JFS and to experiment with its features using some of our applications. Since JFS version 1.0.0 was released in June 2001, we decided to install JFS on our test Linux platform, shown in Figure 2.
Our Linux systems are designed to serve short transaction-based requests. JFS provides a log-based, byte-level filesystem targeted for transaction-oriented systems, which makes it quite suitable for our type of systems.
The advantages of JFS from a telecom point of view are that it provides improved structural consistency, recoverability and much faster restart times than non-journaling filesystems, such as traditional UNIX filesystems. In most cases, the other filesystems are subject to corruption in the event of a system crash. They rely on restart-time utilities like fsck, which examines all of the filesystem's metadata to detect and repair structural integrity problems. This is a time-consuming and error-prone process; in a worst-case scenario, it can lose or misplace data. Telecom platforms cannot afford a process that prolongs a system's downtime.
With JFS, in case of a system failure, a filesystem is restored to a consistent state by replaying the log and applying log records for the appropriate transactions. The recovery time associated with this log-based approach is much faster, because the replay utility examines only the log records produced by recent filesystem activity, rather than examining all filesystem metadata.