Kernel Korner - Unionfs: Bringing Filesystems Together
Now home directories from both /alcid and /penguin are available from /home.
Unionfs supports multiple read-write branches, so the user's files will not migrate from one directory to another. This contrasts with previous unioning systems, such as BSD-4.4's Union Mounts, which generally supported only a single read-write branch.
Most Linux distributions are available as both ISO images and also as individual packages. ISO images are convenient because they can be burnt directly to CD-ROMs, and you need to download and store only a few files. To install to a machine over a network, however, you often need to have the individual packages in a single directory. Using the loopback device, the ISO images can be mounted on separate directories, but this layout is not suitable for a network install because all files need to be located in a single tree. For this reason, many sites maintain copies of both the ISO images and also the individual package files, wasting both disk space and bandwidth and increasing management efforts. Unionfs can alleviate this problem by virtually combining the individual package directories from the ISO images.
In this example, we are mounting over two directories, /mnt/disc1 and /mnt/disc2. The mount command is as follows:
# mount -t unionfs -o dirs=/mnt/disc1,/mnt/disc2 \ > none /mnt/merged-distribution
In the previous example of the ISO images, all of the branches in the union were read-only; hence, the union itself was read-only. Unionfs also can mix read-only and read-write branches. In this case, the union as a whole is read-write, and Unionfs uses copy-on-write semantics to give the illusion that you can modify files and directories on read-only branches. This could be used to patch a CD-ROM. If the CD-ROM is mounted on /mnt/cdrom, and an empty directory is created in /tmp/cdpatch, then Unionfs can be mounted as follows:
# mount -t unionfs -o dirs=/tmp/cdpatch,/mnt/cdrom \ > none /mnt/patched-cdrom
When viewed through /mnt/patched-cdrom, it appears as though you can write to the CD-ROM, but all writes actually will take place in /tmp/cdpatch. Writing to read-only branches results in an operation called a copyup. When a read-only file is opened for writing, the file is copied over to a higher-priority branch. If required, Unionfs automatically creates any needed parent directory hierarchy.
In this CD-ROM example, the lower-level filesystem enforces the read-only permissions, and Unionfs respects them. In other situations, the lower-level filesystem may indeed be read-write, but Unionfs should not modify the branch. For example, you may have one branch that contains pristine kernel sources and then use a separate branch for your local changes. Through Unionfs, the pristine sources should be read-only, as the CD-ROM was in the previous example. This can be accomplished by adding =ro to a directory in the dirs mount option. Assume that /home/cpw/linux is empty, and /usr/src/linux contains a Linux kernel source tree. The following mount command makes Unionfs behave as a source code versioning system:
# mount -t unionfs -o \ > dirs=/home/cpw/linux:/usr/src/linux=ro \ > none /home/cpw/linux-src
This example makes it appear as if an entire Linux source tree exists in /home/cpw/linux-src, but any changes to that source tree, such as changed source files or new object files, actually go to /home/cpw/linux.
With a simple modification, we also can use an overlay mount. That is, we can replace /home/cpw/ linux with the unified view:
# mount -t unionfs -o > dirs=/home/cpw/linux:/usr/src/linux=ro > none /home/cpw/linux
Most filesystem operations in Unionfs move from higher-priority branches to lower-priority branches. For example, LOOKUP begins in the highest-priority branch in which the parent exists and then moves to lower-priority branches. During the lookup operation, Unionfs caches information for use in later operations.
CREATE attempts to create files in the highest-priority branch where the parent directory exists. The CREATE operation uses the cached lookup information to operate directly on the appropriate branch, so in effect, it moves from higher-priority branches to lower-priority branches.
Unionfs uses several techniques to provide the illusion of modifying read-only branches, and at the same time, maintains normal UNIX semantics. If there is an error while creating a file, error handling must be performed. Error handling proceeds from lower-priority branches to higher-priority branches. Starting in the highest-priority branch in which the parent exists, Unionfs attempts to create the file in each higher-priority branch. Finally, if the operation fails in the highest-priority branch for the whole union, then Unionfs returns an error to the user.
In contrast to CREATE, the UNLINK operation always proceeds from lower-priority branches to higher-priority branches. Because the last underlying object to be UNLINKed is the highest-priority object, the user-visible state is not modified until the very end of Unionfs's UNLINK operation. The most complex situations to handle are partial errors. If removing an intermediate file fails and Unionfs simply removes the highest-priority file, a lower-priority file becomes visible to the user. To handle these error conditions, Unionfs uses a special high-priority file called a whiteout. If Unionfs encounters a whiteout, it behaves as if the file does not exist in any lower-priority branch. Internally, to create a whiteout for a file named F, Unionfs creates a zero-length file named .wh.F. Getting back to UNLINK—if an intermediate UNLINK has failed, instead of deleting the highest-priority file, Unionfs renames the file to the corresponding whiteout name.
This careful ordering of operations has two effects. The first is that UNIX semantics are maintained even in the face of errors or read-only branches. The user-visible state isn't modified until the operation is attempted on the highest-priority branch. The success or failure of the operation is determined by the success or failure of this branch. Through the use of whiteouts, a file can be deleted even if it exists on a read-only branch. The second effect is that when no errors occur, files and directories tend to stay in the branches where they were originally. This is important because one of the goals of Unionfs is to keep the files in separate places.
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Speed Up Your Web Site with Varnish
- Containers—Not Virtual Machines—Are the Future Cloud
- Linux Systems Administrator
- Non-Linux FOSS: libnotify, OS X Style
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- RSS Feeds
- Reply to comment | Linux Journal
3 hours 2 min ago
- Yeah, user namespaces are
4 hours 19 min ago
- Cari Uang
7 hours 50 min ago
- user namespaces
10 hours 43 min ago
11 hours 9 min ago
- One advantage with VMs
13 hours 38 min ago
- about info
14 hours 11 min ago
14 hours 12 min ago
14 hours 13 min ago
14 hours 15 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?