The World Is a libferris Filesystem
The libferris virtual filesystem has always sought to push the boundaries of what a filesystem should do, both in what can be mounted and in what metadata is available for files. During the past five years, it has expanded its capabilities from mounting more traditional things, such as tar.gz, SSH, digital cameras and IPC primitives, to being able to mount various Indexed Sequential Access Method (ISAM) files, including db4, tdb, edb, eet and gdbm; various relational databases, including ODBC, MySQL and PostgreSQL; various servers, such as HTTP, FTP, LDAP, Evolution and RDF graphs; as well as XML files and Sleepycat's dbXML.
Recently, support for indexing filesystem data using any combination of Lucene, ODBC, TSearch2, xapian, LDAP, PostgreSQL and Web search has been added with the ability to query these back ends for matching files. Matches naturally are presented as a virtual filesystem. Details of using the index and search capabilities of libferris appeared in the February 2005 issue of Linux Journal in my article “Filesystem Indexing with libferris”. I should mention that anything you see mounted as a filesystem in this article can be indexed and searched for as described in that past article on searching.
You can access your libferris virtual filesystem either through native libferris clients or by exporting libferris through Samba.
The two primary abstractions in libferris are the Context and the Extended Attribute (EA). A Context can be thought of as a superclass of a file or directory. libferris draws less of a distinction between a file and a directory than a traditional filesystem does: a file can behave like a directory if it is treated like one. For example, if you try to read a tar.gz file as a directory, libferris automatically mounts the archive and lists its contents as a virtual filesystem.
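The idea of a file that becomes a directory on demand can be pictured with a small sketch. This is plain Python rather than the libferris API; the `Context` class and its `children` method are hypothetical names chosen for illustration, and only tar archives are handled.

```python
import os
import tarfile


class Context:
    """Toy stand-in for a Context: one type for both files and directories."""

    def __init__(self, path):
        self.path = path

    def children(self):
        """Return the names seen when this Context is read as a directory."""
        if os.path.isdir(self.path):
            return sorted(os.listdir(self.path))
        if os.path.isfile(self.path) and tarfile.is_tarfile(self.path):
            # "Mount" the archive only when someone asks for its children.
            with tarfile.open(self.path) as archive:
                return sorted(archive.getnames())
        return []  # an ordinary file has no children in this sketch
```

The caller never has to ask whether a path is a directory or an archive; it asks for children and the object decides how to produce them.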
The EA interface is similar in concept to the Linux kernel's EA interface. That is, arbitrary key-value data is attached to files and directories. This EA concept was extended early on in libferris to allow the value of an attribute to be derived from the content of a file. This means simple things like the width and height of an image or video file become first-class metadata citizens along with a file's size and modification time. The available metadata extends far beyond image metadata to include XMP, EXIF, music ID tags, Annodex media, geospatial tags, RPM metadata, SELinux integration, partially ordered emblem categories and arbitrary personal RDF stores of metadata.
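To make the notion of a content-derived attribute concrete, here is a minimal sketch in plain Python; it does not use libferris. The function `read_ea` and the attribute names `size`, `mtime`, `width` and `height` are assumptions for illustration, and only PNG images are understood. It relies on the PNG layout, in which an 8-byte signature is followed by the IHDR chunk carrying the width and height as big-endian 32-bit integers.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(path):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", header[16:24])


def read_ea(path, name):
    """Look up one attribute by name, whether stored or derived from content."""
    if name == "size":
        return os.path.getsize(path)
    if name == "mtime":
        return int(os.path.getmtime(path))
    if name in ("width", "height"):
        dims = png_dimensions(path)  # derived by parsing the file's bytes
        if dims is None:
            raise KeyError(name)
        return dims[0] if name == "width" else dims[1]
    raise KeyError(name)
```

From the caller's side, `read_ea(path, "width")` looks no different from `read_ea(path, "size")`, which is the point of putting both behind one interface.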
Having all metadata available through a single interface allows libferris to provide filtering and sorting capabilities on any of that metadata. As such, you can sort a directory by any metadata just as easily as you would use ls -Sh to sort by file size. Sorting on multiple metadata values is also supported in libferris; you can sort your files easily by MIME type, then image width, then modification time—with all three pieces of metadata contributing to the final directory ordering. Any libferris virtual filesystem can have filtering and sorting applied to it to obtain a new libferris virtual filesystem.
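The multi-key ordering described here can be sketched in a few lines of plain Python. This models the idea only and is not how libferris is invoked; the dictionaries stand in for directory entries, and the attribute names and sample values are made up for illustration.

```python
def sort_by_eas(entries, ea_names):
    """Order entries by several attributes; earlier names take precedence."""
    return sorted(entries, key=lambda entry: tuple(entry[name] for name in ea_names))


def filter_by_ea(entries, name, predicate):
    """Keep only the entries whose attribute satisfies the predicate."""
    return [entry for entry in entries if predicate(entry[name])]


# Hypothetical directory listing with three attributes per entry.
files = [
    {"name": "b.png", "mime": "image/png", "width": 800, "mtime": 200},
    {"name": "a.txt", "mime": "text/plain", "width": 0, "mtime": 50},
    {"name": "c.png", "mime": "image/png", "width": 640, "mtime": 300},
    {"name": "d.png", "mime": "image/png", "width": 640, "mtime": 100},
]

ordered = sort_by_eas(files, ["mime", "width", "mtime"])
print([entry["name"] for entry in ordered])
# prints ['d.png', 'c.png', 'b.png', 'a.txt']
```

Both helpers take a list of entries and return a new list of entries, so they can be chained, which mirrors the way a filtered or sorted view is itself another filesystem.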
EA values can be stored in a personal RDF store. For example, when you write an image width to an extended attribute and subsequently read the image width, you get the value you just wrote. Writing to an EA extends naturally to other situations, such as changing the x or y EA for a window, which should move the window.
Allowing EA to be stored in a personal RDF file lets you add metadata to any libferris object, even those for which you have only read access. For example, you can attach emblems or comments to the Linux Kongress Web site just as you would a normal file.
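One way to picture a personal store layered over read-only objects is the following plain-Python sketch. It is a simplification that does not use libferris or RDF: a dictionary stands in for the personal store, every write goes to that dictionary, and the class name, attribute names and example.org URL are hypothetical.

```python
class OverlayAttributes:
    """Attribute view that layers a private, writable store over native values."""

    def __init__(self, native, personal_store):
        self.native = native            # values supplied by the objects themselves
        self.personal = personal_store  # stand-in for the personal RDF store

    def write(self, url, name, value):
        # In this sketch writes never touch the underlying object,
        # so objects you can only read can still be annotated.
        self.personal[(url, name)] = value

    def read(self, url, name):
        # A value written by the user shadows the native one.
        if (url, name) in self.personal:
            return self.personal[(url, name)]
        return self.native[url][name]


site = "http://example.org/"  # hypothetical read-only object
attrs = OverlayAttributes({site: {"mtime": "0"}}, {})
attrs.write(site, "comment", "worth revisiting")
print(attrs.read(site, "comment"))
```

Because the annotation is keyed by the object's URL and kept on the user's side, the remote object is never modified.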
An interesting EA for all files is the content EA, which is equivalent to the file's byte contents. Exposing the file itself through the EA interface means that any information about a file can be obtained via the same interface.
libferris is written in C++ and provides a standard IOStream interface to both Contexts and EA. Many standard file utilities have been rewritten to take advantage of libferris features. These clients include ls, cp, mv, rm, mkdir, cat, find, touch, IO redirection and more.
As we explore these filesystems, I use the ferrisls command, which mimics the coreutils ls(1) command. As well as the -l long listing option, I use the -0 (zero) recommended-ea option of ferrisls. This operates in much the same way as -l, though it asks the filesystem itself which EAs are most interesting for the user to see. I assume a shell alias of fls=ferrisls in the code examples.
I start by showing interaction with the standard kernel-based filesystems and some of the EA possibilities. Along with the recommended-ea option, ferrisls supports the --xml option to produce an XML document as output. This output records which EA each value belongs to and offers one way to drive Web interfaces using libferris.