Talking Point: Could Linux Abandon Directories In Favour Of Tagging?
For a fairly scruffy looking guy, I have a surprisingly healthy approach to organising my files. However, I'm constantly pushing up against the limitations of a system that is based around directories. I'm convinced that Linux needs to make greater use of tagging, but I'm also beginning to wonder if desktop Linux could abandon the hierarchical directory structure entirely.
Why is it that web-based technology such as online bookmarking makes far greater use of tagging than the Linux desktop does? Directories for files are based on the way that humans have always organised items in the real world, using categories and sub categories. Thanks to powerful computers and cheap, plentiful storage, tagging now offers a method of storage that isn't based on placing files in one place or another.
The word processor file that makes up this article is stored in /documents/articles/linux_journal/ but it could be even more efficiently organised if I could easily tag it as “documents”, “articles”, “linux journal” as well as “op ed”, “daft ideas”, “tagging”, “linux” and “web posts”. That way I could find it by browsing through all of the web posts I've made this year or all of the op-ed pieces I've ever written.
Some organisational situations illustrate the weakness of the hierarchical approach. For example, if I download some independent electronic dance music, where do I place it within a hierarchical file system? Does it go in /mp3/dance/electronica/independent or /mp3/independent/electronica/dance? Which system works best depends on whether the significant factor is that it is electronica or independently produced. This is where tagging comes into its own as it allows objects to be placed in more than one category at once.
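To make the point about multiple categories concrete, here is a minimal Python sketch of a tag index held in memory. The class name, the file names and the tags are all invented for illustration; this is not how any existing Linux tool stores tags.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory index mapping each tag to the set of files carrying it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a file path."""
        for name in tags:
            self._files_by_tag[name.lower()].add(path)

    def find(self, *tags):
        """Return the files that carry every one of the given tags."""
        if not tags:
            return set()
        sets = [self._files_by_tag.get(name.lower(), set()) for name in tags]
        return set.intersection(*sets)


index = TagIndex()
# Hypothetical file names, used only for illustration.
index.tag("track01.mp3", "mp3", "dance", "electronica", "independent")
index.tag("track02.mp3", "mp3", "electronica")

# Unlike a directory path, the order of the tags in a query does not matter.
by_genre_first = index.find("electronica", "independent")
by_label_first = index.find("independent", "electronica")
```

Both queries return the same set of files, which is exactly what the two competing directory layouts cannot offer.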
When dealing with files, there's a distinction to be made between the files that I normally care about and those that I only care about when I'm fiddling around inside Linux's innards. The default setup of most Linux distributions acknowledges this distinction as the files are stored either:
- outside of the /home directory (files that I don't care about most of the time)
- inside the /home directory but hidden (more files that I don't care about most of the time)
- inside the /home directory and visible (these are the files that I care about)
It's this last category of files that is ripe for being moved over to a tagged system. Abandoning the directory system outside of the /home folder would mean not only designing a new operating system but also designing a new set of applications.
Application awareness could make tagging more useful, because as it stands, when I'm opening files or saving them, I can't use tagging most of the time. For one thing, application awareness could reduce the tagging workload. A word processor could set the tag of a file as a “text document” and perhaps offer me some pertinent tags from the system tag cloud to go with it. When I download a file within Firefox, I bet that it would be fairly easy for the developers to make it tag the file as “downloaded”. That way it keeps that information when I also decide that it belongs in the “video”, “trailer”, “film”, “science fiction” and “have watched” categories.
Most people probably have a fairly fixed idea of what they think a file browser is, but a large proportion of applications are actually specialised file browsers. Why couldn't a tag-aware file browser suddenly switch into music browsing mode as soon as I select the music file tag? If it automatically switched to the details view, added an extra pane on the left hand side for an album view, gained a time elapsed counter in the status area along with some transport controls, you'd have a fairly good music player. Email clients are also specialised file browsers. In the classic three pane layout, the left area represents the folders, the top right hand pane shows the files, and the bottom right pane is a viewer. Click on the message and it opens a slightly specialised text editor.
Ubiquitous tagging for normal desktop use would be a way for desktop Linux to get ahead of the competition, and I have an idea that it would particularly appeal to people who weren't computer experts. Bear in mind that non-experts don't have any difficulty understanding tagging on the web.
I see the two main barriers to greater adoption of tagging on the desktop as the lack of a unified standard for metadata and the aforementioned lack of application awareness. I wonder which will be the first mainstream distribution or desktop environment to experiment with removing directories and going 100% tagging for end users?
The tagging image used as the icon for this article was created by Salvatore Vuono. Downloaded from Free Digital Photos.
UK based freelance writer Michael Reed writes about technology, retro computing, geek culture and gender politics.
Comments
Tagsistant FUSE filesystem
You may want to investigate the Tagsistant project:
http://www.tagsistant.net/index.php
It's a tag-based "semantic" filesystem based on FUSE and SQLite. I have not had time to use it myself, so I can't attest to how well it functions.
Reminds me of Hoarders
Hoarders: folks who pile things all in one room (or house) until there is no room to walk around. They mentally tagged where each item was but after the passage of time they no longer can find a specific item. When a hoarder's stash is cleared up and a long lost item is discovered the response often is "Oh, I wondered where that was".
Another problem with abandoning directories is that of ownership and security, one of the hallmarks of Linux. It would put a huge burden on a file browser or a shell to show only those files a particular user has the right to see, modify or delete, each file having to be examined in order to decide.
I just used "find . -type f | wc -l" as root to discover that my Kubuntu 10.4 system has a total of 502,862 files! A SOHO or corporate LAN might contain hundreds or thousands of times as many files. Microsoft's "Active Directory" is already a slug when trying to show the files in a directory with only a fraction of that total. Having a Linux file browser scan/sort/display who knows how many millions of files residing in one "directory" (or on one HD) would make AD look like lightning. Obviously a heavily indexed database with fast response times would be necessary.
My "tags" are my sub-directory names, and that paradigm works quite well for me, so I'll pass on what the author is proposing. Besides, as one comment already pointed out, there are apps that already allow for file tagging. No need to change the Linux file hierarchy structure.
sounds like a movement
This sounds to me like a situation where we have tagging enthusiasts who are enthusiastic to the point of wanting to force people to use tags ("why can't the fools see the light?"). Since I don't see the either/or issue in regard to tags/directories, it's hard to see the need to force people to use tags.
I thought that the best plan for deciding where computing/Linux is going is that anything new should supplant something old by the fact that the vast majority abandon the old, not that the fans of the new abolish it. So show me the data on that.
A better way
I don't think that Tags are that useful, using them would imply to have to memorize lots of tags. Since entropy is a fact and you cannot remember your old ideas forever, after 2 years you may also wonder "why did I tag that file that way". Of course the same can happen with a hierarchical structure, but the advantage there is that you only need to remember the broader subject and then go into more specific sub-categories by choosing among the options that you created, without having to remember them at all times. We all need to define our own standard for how to categorize things. Most OSs have implemented pre-established directories like Documents, Videos, Pictures, tmp, etc., but we still need to define by ourselves (perhaps with some expert's advice) how to categorize our files beyond those main categories. One useful tool could be that, in addition to the well-known Ctrl-C Ctrl-V to copy and paste, an easy Ctrl-C Ctrl-L standard option were also made available in order to copy and paste symbolic links to the original files in many different places if those files fall under many different categories.
"I don't think that Tags are
"I don't think that Tags are that useful, using them would imply to have to memorize lots of tags"
I believe tags are as useful as you make them, just like directories.
If you tag your music files with 'music' and {artist name}, you do not have to remember those tags, it is only logical that you would tag them so.
Most of the time you apply logic to the naming of directories and subdirectories in which files are categorized.
Logic does not need to be remembered, it is something that comes naturally to us and thus using tags this way I believe can be as efficient as using directories (but one does not need to replace the other).
tagging is heavily suited to the GUI
Okay so you're talking of desktop so it's pretty much a given that it will be GUI-based but tagging and associated searching/finding of files is heavily biased towards the GUI. How do you back up or copy only the 1000 files that are tagged with "client name" using rsync, tar or equivalent?
Also you have a heavy presumption on documents here - productivity ones at that. That's fine but there are a host of other files I have under /home. Finally with a directory structure it's easier to implement permissions based on groups etc. How would you easily restrict access to one bunch of files by group if they were all stored in one directory and tagged only?
Okay so you're talking of
Okay so you're talking of desktop so it's pretty much a given that it will be GUI-based but tagging and associated searching/finding of files is heavily biased towards the GUI. How do you back up or copy only the 1000 files that are tagged with "client name" using rsync, tar or equivalent?
Also you have a heavy presumption on documents here - productivity ones at that. That's fine but there are a host of other files I have under /home. Finally with a directory structure it's easier to implement permissions based on groups etc. How would you easily restrict access to one bunch of files by group if they were all stored in one directory and tagged only?
While I have some reservations, I don't see the issues you raised as being significant. Yes, rsync, tar and other tools would have to be updated to support tags but rsync -t "client name" would be trivial.
Similarly, it should be easy to build a mechanism that allows you to set permissions based on tags. It would be equivalent to allowing you to set multiple groups on a file under the current system and to set permissions based on each group.
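The backup question can also be sketched without changing tar or rsync at all: resolve the tag to a list of paths first, then hand that list to the existing tool. The sketch below uses Python's standard tarfile module. The files_with_tag lookup and the dictionary it reads are hypothetical stand-ins for whatever tag store a real system would provide.

```python
import os
import tarfile
import tempfile


def files_with_tag(tag_store, tag):
    """Hypothetical lookup: tag_store maps each path to its set of tags."""
    return sorted(path for path, tags in tag_store.items() if tag in tags)


def archive_by_tag(tag_store, tag, archive_path):
    """Write every file carrying `tag` into a gzip-compressed tar archive."""
    paths = files_with_tag(tag_store, tag)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            # Files are stored under their base name only; a real tool would
            # have to cope with two tagged files sharing the same name.
            tar.add(path, arcname=os.path.basename(path))
    return paths


# Small demonstration using throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    invoice = os.path.join(workdir, "invoice.txt")
    notes = os.path.join(workdir, "notes.txt")
    for path in (invoice, notes):
        with open(path, "w") as handle:
            handle.write("example\n")

    store = {invoice: {"client name", "invoices"}, notes: {"personal"}}
    archive = os.path.join(workdir, "client.tar.gz")
    archived = archive_by_tag(store, "client name", archive)

    with tarfile.open(archive) as tar:
        archived_names = tar.getnames()
```

The same list of paths could instead be written to a file, one per line, and passed to the existing --files-from option of rsync or the -T option of tar, so the tools themselves would not necessarily need a new tag flag.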
Have you tried using xattr?
I think extended attributes will answer some of your problems, check them out. Software like Beagle uses them, but you can use the user namespace for anything you like.
Software is/will be ported to use them. For example, see wget:
https://github.com/wertarbyte/wget/tree/xattrurl
Cheers,
Kalin.
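For anyone who wants to try the extended attribute approach, here is a minimal Python sketch that keeps a set of tags in the user namespace. It assumes Linux, where os.setxattr and os.getxattr are available, and a filesystem that has user xattrs enabled. The attribute name user.tags and the comma-separated encoding are choices made for this example, not an established standard.

```python
import os
import tempfile

# Attribute name chosen for this sketch; there is no agreed standard name.
TAG_ATTR = "user.tags"


def encode_tags(tags):
    """Serialise tags as sorted, comma-separated UTF-8 (tags must not contain commas)."""
    return ",".join(sorted(tags)).encode("utf-8")


def decode_tags(raw):
    """Inverse of encode_tags; an empty value means no tags."""
    text = raw.decode("utf-8")
    return set(text.split(",")) if text else set()


def set_tags(path, tags):
    """Store the tags on the file itself (Linux only)."""
    os.setxattr(path, TAG_ATTR, encode_tags(tags))


def get_tags(path):
    """Read the tags back, treating a missing attribute as an empty set."""
    try:
        return decode_tags(os.getxattr(path, TAG_ATTR))
    except OSError:
        # Raised when the attribute is absent or xattrs are unsupported here.
        return set()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        try:
            set_tags(handle.name, {"downloaded", "video"})
            print(get_tags(handle.name))
        except (OSError, AttributeError) as error:
            # Some filesystems (and non-Linux platforms) lack user xattrs.
            print("extended attributes unavailable here:", error)
```

Because the tags live on the file itself, they follow it when it is moved within the same filesystem, but copy and archive tools only preserve them when asked to keep extended attributes.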
Was about to mention it myself.
I too think xattr + fast indexer is the way to go here.
The challenge, as I see it, is getting the indexer fast, lightweight and subtle enough for the user to always run it in the background. Most people I know turn off all indexers, be it Beagle, Tracker, or Windows Desktop Search, since they suck up a little too much RAM, and thrash the I/O cache a little too much.
I agree
I realized the strength and flexibility of tagging when first using labels within Gmail. I love the idea of having attributes on a file and then searching or sorting by attributes. The current Linux file system architecture has lots of mileage and is very powerful. I love it. Merging these two in an elegant way would be the challenge. Maybe just keeping them separate and adding this type of functionality to an arbitrary folder, say /home/{user}/taggable-data, would be a decent trade-off. If I want version control in a directory, I use git (git init....) and use the tool from there. Why not have a similar tool that uses the existing filesystem yet presents a DB view of the directory's contents? That tool could be a command-line app or a file manager plugin that allows you to add a file, set/modify/delete its attributes, and search for a file. Apple has been doing this for some time with the way iPod songs are presented to users either on the computer or the iPod.
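A rough sketch of what such a "DB view" tool might look like, using Python's built-in sqlite3 module. The table layout, the function names and the example file names are all assumptions made for illustration; files stay wherever they are on disk and only their paths and tags are recorded.

```python
import sqlite3


def open_tag_db(path=":memory:"):
    """Open (or create) the tag database; the default lives only in memory."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS file_tags (
            path TEXT NOT NULL,
            tag  TEXT NOT NULL,
            PRIMARY KEY (path, tag)
        );
        CREATE INDEX IF NOT EXISTS idx_file_tags_tag ON file_tags (tag);
        """
    )
    return db


def add_tags(db, path, tags):
    """Attach tags to a file; tagging the same file twice is harmless."""
    db.executemany(
        "INSERT OR IGNORE INTO file_tags (path, tag) VALUES (?, ?)",
        [(path, tag) for tag in tags],
    )
    db.commit()


def remove_tag(db, path, tag):
    """Detach a single tag from a file."""
    db.execute("DELETE FROM file_tags WHERE path = ? AND tag = ?", (path, tag))
    db.commit()


def search(db, tags):
    """Return the paths that carry every one of the given tags, sorted."""
    wanted = sorted(set(tags))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    rows = db.execute(
        f"SELECT path FROM file_tags WHERE tag IN ({placeholders}) "
        "GROUP BY path HAVING COUNT(DISTINCT tag) = ? ORDER BY path",
        (*wanted, len(wanted)),
    )
    return [row[0] for row in rows]


db = open_tag_db()
# Hypothetical file names inside the tagged folder.
add_tags(db, "song.ogg", ["music", "dance"])
add_tags(db, "mix.ogg", ["music"])
add_tags(db, "essay.odt", ["documents", "articles"])
dance_music = search(db, ["music", "dance"])
```

The index on the tag column is what keeps lookups quick as the number of files grows, which speaks to the performance worry raised in other comments.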
dear god no
dear god no
Semantic Desktop
Isn't this what the "Semantic Desktop" is for? In KDE you can tag, rate, comment on your files.
Personally, I see tagging as too much effort. I have to do more work saving files. Then I have to remember what tags I used (more work). Keeping a good hierarchy keeps my search time low. If I need a broader search, I just use general search tools (find/grep type).
Semantic Desktop
This is what popped into my head too. Granted, I don't use it myself but the author should check it out. Apart from semantic desktop, he could use 'nepomuk' as a search term.
excellent notions, like cross-indexing
Awesome ideas, I like them! Faster, sensible, and more like how we actually work. The traditional FS hierarchy is very limited and restrictive. The notion of file metadata with tags that cross multiple categories is closer to the paper-and-filing-cabinet world-- paper files can live in only one physical location, but can logically be in multiple categories. This is handled with cross-indexes, which assign multiple categories to single files.
why one or the other?
I don't understand why both can't be useful.
I mean, it's not that you have to choose between filesystems and tags. You could do both. I mean, simplify the filesystem hierarchy a bit and store stuff properly tagged.
For example, you could just drop music into the Music folder and tag it accordingly.
One thing to think about would be duplicates. What about duplicates and/or untagged files?
Besides, you could already tag files if you're using Tracker, which permits tagging and all.
I think KDE even has this functionality built in. I wouldn't know; I'm a GNOME user. ;)
It's hard to be free... but I love to struggle. Love isn't asked for; it's just given. Respect isn't asked for; it's earned!
Renich Bon Ciric
http://www.woralelandia.com/
http://www.introbella.com/
tags in filename
You can put tags in the filename (separated by underscores) and use find to retrieve them.
I wrote a script which prompts for some tags and then fills a directory with symlinks to files matching those tags.
I only use this for certain files which I store in a single flat directory.
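The commenter's script is not shown, so the following is only a guess at the general idea, written in Python rather than with find. It assumes tags are the underscore-separated parts of the file name before the extension; the function names and example file names are invented.

```python
import os
import tempfile


def tags_from_name(filename):
    """Split 'holiday_spain_2012.jpg' into {'holiday', 'spain', '2012'}."""
    stem, _extension = os.path.splitext(filename)
    return {part.lower() for part in stem.split("_") if part}


def link_matches(source_dir, target_dir, wanted):
    """Fill target_dir with symlinks to files whose names carry all wanted tags."""
    wanted = {tag.lower() for tag in wanted}
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(source_dir)):
        full_path = os.path.join(source_dir, name)
        if os.path.isfile(full_path) and wanted <= tags_from_name(name):
            link_path = os.path.join(target_dir, name)
            if not os.path.lexists(link_path):
                os.symlink(os.path.abspath(full_path), link_path)
            linked.append(name)
    return linked


# Demonstration with empty throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    flat_dir = os.path.join(workdir, "flat")
    result_dir = os.path.join(workdir, "results")
    os.makedirs(flat_dir)
    for name in ("holiday_spain_2012.jpg", "holiday_france_2011.jpg", "invoice_2012.pdf"):
        open(os.path.join(flat_dir, name), "w").close()

    linked = link_matches(flat_dir, result_dir, ["holiday", "2012"])
    made_symlink = os.path.islink(os.path.join(result_dir, "holiday_spain_2012.jpg"))
```

One limitation of this scheme is that a tag cannot itself contain an underscore, and renaming a file changes its tags.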
It's not directory based but inode based
You can easily hardlink your files wherever you want. If you want to use tags then read the metadata that is included within the files.