A Non-Technical Look inside the EXT2 File System

How the EXT2 file system is organized on the disk and how it gets its speed.

Everyone wants a fast computer; however, not everyone realizes that one of the most important factors of computer performance is the speed of the file system. Regardless of how fast your CPU is, if the file system is slow, then the whole computer will seem slow. Many people who have very fast Pentium Pros with slow disk drives and even slower networked file systems rediscover this fact daily.

Linux has a very fast file system called the Extended File System Version 2 (EXT2). The EXT2 file system was created by Remy Card (card@masi.ibp.fr).

Disk Layout

There are several objectives when deciding how to lay out data on a disk.

First and foremost, the data structure should be recoverable. If there is an error while writing data to the disk (like a user pulling the power cord), the entire file system should not be lost. Although losing the data currently being written is sometimes acceptable, losing all the data on the disk is not.

Second, the data structure must allow for an efficient implementation of all needed operations. The hardest operation to implement is normally the hard link. When using a hard link, there is more than one directory entry (i.e., file name) that points to the same file data. Accessing the data by any of the valid file names should produce the same data.

Another hard operation involves deleting an open file. If an application has a file open for access at the same time that a user deletes the file, the application should still be able to access the file's data. The data should not be cleared off the disk until the last application closes the file. This sort of behavior is quite unlike DOS/Windows, where deleting a file results in immediate loss of access to that file by any application in the process of reading/writing to it. Applications exhibiting this type of Unix behavior are more common than one might think, and changing it would cause many applications to break.

Third, a disk layout should minimize seek times by clustering data on the disk. A drive needs more time to read two pieces of data that are widely separated on the disk than the same sized pieces located close to each other. A good disk layout can minimize disk seek time (and maximize performance) by clustering related data close together. For example, parts of the same file should be close together on disk and, also, near the directory containing the file's name.

Finally, the disk layout should conserve disk space. Conserving disk space was more important in the past, when hard drives were small and expensive. These days, conserving disk space is not so important; however, one should not waste disk space.

Partitions

Partitions are the first level of disk layout. Each disk must have one or more partitions. The operating system pretends each partition is a separate logical disk, even though they may share the same physical disk. The most common use of partitioning is to place more than one file system on the same physical disk, each in its own partition. Each partition has its own device file in the /dev directory (e.g., /dev/hda1, /dev/hda2, etc.). Every EXT2 file system occupies one partition, and completely fills it.

Groups

The EXT2 file system is divided into groups, which are sections of a partition. The division into groups is done at the time the file system is formatted and cannot change without reformatting. Each group contains related data. A group is the unit of clustering in the EXT2 file system. Each group contains a superblock, a group descriptor, a block bitmap, an inode bitmap, an inode table and finally data blocks, all in that order.

Superblock

Some information about a file system belongs to the file system as a whole and not to any particular file or group. This information is stored in the superblock, and includes the total number of blocks within the file system, the time it was last checked for errors and so on.

The first superblock is the most important one, since it is the first one read when the file system is mounted. The information in the superblock is so important that the file system cannot be mounted without it. If a disk error occurred while updating the superblock, the entire file system would be ruined; therefore, a copy of the superblock is kept in each group. If the first superblock becomes corrupted, the redundant copies can be used to fix the error by using the command e2fsck.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState