Implementing a deltree Command in Linux
Ever needed to excise a large software package from your file space, only to discover it dispersed over a directory tree containing over a hundred files and a tenth as many subdirectories? The command rm -rf will clear everything away nicely. However, to learn more about walking a Linux directory tree, let's implement rm -r as a home-brew (DOS-like) deltree command. Doing so is not difficult, and once you own the utility, it will make it easier to remove unused or unwanted software packages from your own file space.
To select C library resources to complete this task, we first determine which resources are generally available in UNIX and then which of these resources Linux implements. Two UNIX functions contained in the header file ftw.h walk a directory tree:
int ftw ( const char * path, int (*funcptr) ( const char *, const struct stat *, int ), int depth )

int nftw ( const char * path, int (*funcptr) ( const char *, const struct stat *, int, struct FTW * ), int depth, int flag )

Linux does not implement the second function, so we turn to the first; ftw walks a directory tree from top to bottom. For each directory entry, ftw calls the function pointed to by funcptr with the name of the entry, a pointer to a stat structure containing inode information and a flag set to convey information about the directory entry in question (a minimal callback sketch follows this list):
FTW_F: a file
FTW_D: a directory
FTW_DNR: a non-readable directory
FTW_NS: stat failed and inode information is not available.
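To make the callback contract concrete, here is a minimal sketch of a function with the signature ftw expects; it merely reports the flag passed for each entry (the name ShowEntry and the example path are illustrative only, not part of the deltree code):

```c++
#include <ftw.h>
#include <sys/stat.h>
#include <cstdio>

// Minimal sketch of an ftw() callback: it only reports what ftw
// passes for each entry; the real DelEntry deletes instead.
static int ShowEntry(const char *path, const struct stat *, int flag)
{
    switch (flag) {
    case FTW_F:   std::printf("file:        %s\n", path); break;
    case FTW_D:   std::printf("directory:   %s\n", path); break;
    case FTW_DNR: std::printf("unreadable:  %s\n", path); break;
    case FTW_NS:  std::printf("stat failed: %s\n", path); break;
    }
    return 0;   // returning non-zero would stop the walk
}

// Example call: walk /tmp/demo, allowing ftw up to 16 open descriptors.
// ftw("/tmp/demo", ShowEntry, 16);
```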
In DelEntry(), the C library function
int remove ( const char * path )
in stdio.h does the actual work. This function returns 0 if successful and -1 if unsuccessful, setting the global variable errno to indicate which of a number of different error conditions occurred; the Linux man pages explain these in detail.
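As a small illustration of that error handling (the helper name deleteOne is arbitrary, not part of the listings):

```c++
#include <cstdio>   // std::remove, std::perror

// Sketch: delete a single entry and report the reason on failure.
static int deleteOne(const char *path)
{
    if (std::remove(path) != 0) {   // 0 on success, -1 with errno set
        std::perror(path);          // prints "path: <errno description>"
        return -1;
    }
    return 0;
}
```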
There is, however, a catch. In UNIX, remove may generally be used to delete either files or empty directories. In Linux, remove only processes files and would therefore empty the directory tree but leave the tree itself standing. Two UNIX C library functions may provide solutions:
int rmdirp ( char * d, char * d1 )

int rmdir ( const char * path )
Linux does not implement the first UNIX function rmdirp, so we focus on the second. The function rmdir removes only empty directories and returns 0 on success, -1 otherwise with errno set. To accomplish our task, we must walk the tree twice: once from the top down to the directory at the bottom, deleting files as we go, and the second time from the bottom back to the top, removing empty directories in reverse order. The perfect tool to achieve this result is a container class: a stack of pointers to directory path names.
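To see why the stack discipline works, consider this sketch of the second pass, with std::stack<std::string> standing in for the StrStack class developed below (the function name anticipates the DelDirectories discussed shortly). Paths pushed during the top-down walk pop deepest-first, so each directory is already empty when rmdir reaches it:

```c++
#include <stack>
#include <string>
#include <unistd.h>   // rmdir()
#include <cstdio>     // std::perror

// Sketch of the bottom-up pass: pop directories deepest-first
// and remove each one, which is empty by the time we reach it.
static void DelDirectories(std::stack<std::string> &dirs)
{
    while (!dirs.empty()) {
        if (rmdir(dirs.top().c_str()) != 0)   // 0 on success, -1 otherwise
            std::perror(dirs.top().c_str());
        dirs.pop();
    }
}
```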
When ftw calls DelEntry, it supplies a flag indicating whether it found a file or a directory, or whether it could not cope, and we can use this flag to fill StrStack with path names inside DelEntry as ftw walks the tree. The question is where to put the stack. The header file ftw.h specifies the signatures of the ftw function and of the function pointed to by funcptr, and neither signature includes a stack, so we cannot pass the stack in by reference as a parameter to DelEntry. The simplest solution is to create StrStack as an external variable in the implementation file funcs.cpp, which holds the function definitions for main. As an external variable, StrStack will be equally accessible to DelEntry and to DelDirectories, provided it is defined in the implementation file above these functions.
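A sketch of that layout, again with std::stack standing in for StrStack and with the variable name dirStack chosen purely for illustration:

```c++
#include <ftw.h>
#include <sys/stat.h>
#include <stack>
#include <string>
#include <cstdio>

// funcs.cpp (sketch): the stack lives at file scope, so DelEntry can
// record directory paths that ftw's fixed callback signature gives
// us no other way to pass along.
static std::stack<std::string> dirStack;

int DelEntry(const char *path, const struct stat *, int flag)
{
    if (flag == FTW_D)
        dirStack.push(path);                    // save for the second pass
    else if (flag == FTW_F && std::remove(path) != 0)
        std::perror(path);                      // report but keep walking
    return 0;
}
```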
Several aspects of StrStack require explanation. StrStack differs from the average stack in that each node contains pointers to two different, dynamically allocated structures: a pointer to the next node and a pointer to a character string of varying length. Two allocations are necessary to create a node, and two separate deallocations are necessary to destroy one. By making StrStack responsible for both allocations, the code is more reliable, more robust and free of memory leaks. In addition, if the caller were responsible for allocating and deallocating the memory containing the character arrays, external code could pull data out from under StrStack, leaving dangling pointers.
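As a sketch of such a node (next_ and srcnode match the copy-constructor line quoted below; data_ and the constructor itself are assumed for illustration):

```c++
#include <cstring>   // std::strlen, std::strcpy

// Sketch of the node: creating one takes two separate allocations,
// one for the node itself and one for the variable-length string.
struct strNode {
    char    *data_;   // separately allocated path name
    strNode *next_;   // next node in the sequence, or NULL at the tail

    strNode(const char *s, strNode *next = 0)
        : data_(new char[std::strlen(s) + 1]), next_(next)
    {
        std::strcpy(data_, s);
    }
    strNode(const strNode &srcnode);   // copy constructor, sketched below
    ~strNode();                        // destructor, sketched below
};
```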
In some places, implicit recursion accomplishes allocation and deallocation, and it may not be obvious at first glance how the process works. Let's examine the copy constructor for StrStack shown in Listing 2.
The class copy constructor is designed to create a copy of a StrStack object in case a function ever passes a stack in or out by value. The code in the function is obvious except for one line:
next_ = new strNode ( *srcnode.next_ ); // indirect recursion
This line is an example of indirect recursion, and it duplicates all the nodes in the StrStack node sequence. How does it work? The argument to new is strNode ( *srcnode.next_ ), which is another call to the node copy constructor with the next node in sequence as argument. As long as each node contains a pointer to another node, the copy constructor repeatedly calls itself recursively until it encounters a NULL in the next_ field of the last node in the sequence. With that, the recursion ceases and begins to unwind, constructing a copy backwards from the tail of the node sequence to the head.

Note that the copy constructor deals, as promised, with two different dynamic allocations: allocating memory for the node, and then for the character array which holds the path name. In the node destructor, the line delete next_ again triggers a sequence of implicit recursions which result in the destructor calling itself until the final NULL at the end of the list is encountered. At that point, the recursion unwinds, and nodes are deleted from the tail of the list back to the head.
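Continuing the node sketch from above, the copy constructor and destructor described here might be written as follows; note the guard before the recursive call, which supplies the NULL test the recursion relies on:

```c++
// Recursive copy constructor: duplicates both allocations for this
// node, then recurses down the rest of the sequence.
strNode::strNode(const strNode &srcnode)
    : data_(new char[std::strlen(srcnode.data_) + 1]), next_(0)
{
    std::strcpy(data_, srcnode.data_);          // copy the path name
    if (srcnode.next_)
        next_ = new strNode(*srcnode.next_);    // indirect recursion
}

// Recursive destructor: two deallocations per node; deleting a NULL
// pointer is a no-op, which terminates the recursion at the tail.
strNode::~strNode()
{
    delete [] data_;   // first deallocation: the character array
    delete next_;      // second: recursively destroys the rest of the list
}
```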