A Non-Technical Look inside the EXT2 File System
The easiest way to understand the EXT2 file system is to watch it in action.
To explain the EXT2 file system in action, we will need two things: a variable named DIR that holds directories, and a path name to look up. Some path names have many components (e.g., /usr/X11/bin/Xrefresh) and others do not (e.g., /vmlinuz).
Assume that a process wants to open a file. Each process is associated with a current working directory. All file names that do not start with “/” are resolved relative to this current working directory and DIR begins with the current working directory. File names that start with “/” are resolved relative to the root directory (see chroot for the one exception), and DIR begins with the root directory.
Each directory name in the path to be resolved is looked up in DIR in turn. This lookup yields the inode number of the subdirectory we're interested in.
Next, the inode of the subdirectory is accessed. The permissions are checked, and if you have access permissions, this new directory becomes DIR. Each subdirectory in the path is treated in this fashion, until only the last component of the path remains.
When the last component of the pathname is reached, the variable DIR contains the directory which contains the file name we've been searching for. Looking in DIR we find the inode number of the file. Accessing this final inode tells us the location of the data. After checking permissions, you can access the data.
How many disk accesses were needed to access the data you wanted? A reasonable maximum is two per subdirectory (one to look up the name, the other to find the inode) and two more for the actual file name. This effort is expended only at file open time. After a file has been opened, subsequent accesses can use the inode's data without looking it up again. Further, caching eliminates many of the accesses needed to look up a file (more later).
When a new file or directory is created, the EXT2 file system must decide where to store the data. If the disk is mostly empty, data can be stored almost anywhere. However, performance is maximized if the data are clustered with other related data to minimize seek times.
The EXT2 file system attempts to allocate each new directory in the group containing its parent directory, on the theory that accesses to parent and children directories will be closely related. The EXT2 file system also attempts to place files in the same group as their directory entries, because directory accesses often lead to file accesses. However, if the group is full, the new file or new directory is placed in some other non-full group.
The data blocks needed to store directories and files can be found by looking in the data allocation bitmap. Any needed space in the inode table can be found by looking in the inode allocation bitmap.
Like most file systems, the EXT2 system relies very heavily on caching. A cache is a part of RAM dedicated to holding file system data. The cache holds directory information inode information and actual file contents. Whenever an application (like a text editor or a compiler) tries to look up a file name or requests file data, the EXT2 system first checks the cache. If the answer can be found in the cache, the request can be answered very quickly indeed without using the disk.
The cache is filled with data from prior requests. If you request data that you have never requested before, the data must first be retrieved from disk. Most of the time most people ask for data they have used before. These repeat requests are answered quickly from the cache, saving the disk drive much effort while providing the user quick access.
Of course, each computer has a limited amount of RAM available. Most of that RAM is used for other things like running applications, leaving perhaps 10% to 30% of total RAM available for the cache. When the cache becomes full, the oldest unused data (least recently used data) is thrown out. Only recently used data remains in the cache.
Since larger caches can hold more data, they can also satisfy a larger number of requests. The figure below shows a typical curve of the total cache size versus the percent of all requests that can be satisfied from the cache. As you can see in Figure 3, using more RAM for caching increases the number of requests answered from the cache, and therefore increases the apparent speed of the file system.
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- ServersCheck's Thermal Imaging Camera Sensor
- The Italian Army Switches to LibreOffice
- Linux Mint 18
- Petros Koutoupis' RapidDisk
- Oracle vs. Google: Round 2
- The FBI and the Mozilla Foundation Lock Horns over Known Security Hole
- Privacy and the New Math
- Ben Rady's Serverless Single Page Apps (The Pragmatic Programmers)
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide