Finding Files and More
Not long after getting their first Linux system, new users usually need to locate a file somewhere on their system. So they learn the following command from a friend, or maybe a book or magazine:
$ find / -name filename -print
Now while this command does work perfectly fine, the syntax does seem awkward to people unfamiliar with the find command. Why should we have to specify print? [Note: On Linux systems, and other systems that use GNU find, we don't. But standard Unix find insists on it, so you might as well get used to it if you use Unix as well as Linux.]
For that matter, why should we have to specify name? Why not just find filename? It's this seemingly cryptic structure that makes find one of the most underused commands in the Unix toolbox.
A look at the find man page (on any system, not just Linux) completes the confusing picture. For someone not familiar with Unix, find's “operators” and “expressions” make it an awfully complicated program just for locating files.
If all you want to do is locate a file, there is a better way to do that:
locate filename
This will work on a properly set-up Linux system with GNU find. (Good Linux distributions come with updatedb properly set up. If yours isn't, you can run updatedb as root to update the database that locate uses, or simply use find as shown above.) Why have a complicated command like find when we already have a simple command like locate? Because find is good for much more than just finding files.
The Caldera/Redhat system that I use at home has several entries in the crontab that run this command:
find /tmp/* -atime +10 -exec rm -f {} \;
This command deletes any files in /tmp that haven't been accessed in the past ten days. The fact that find only deletes files that haven't been accessed in the past ten days rather than files that were created that long ago is a subtle, but very important point. Find gives us access to the very valuable set of information stored about files and directories in Unix filesystems.
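The difference between access time and modification time can be checked safely in a scratch directory. The following is only a sketch: the directory and file names are made up for illustration, and touch -a -t is used to backdate the access time of one file without touching its modification time. Swapping the -exec rm action for -print is a cautious way to preview what a cleanup would remove.

```shell
# Scratch directory; every name here is hypothetical.
demo=$(mktemp -d)
touch "$demo/old.txt" "$demo/new.txt"

# Backdate only the access time of old.txt; its modification time stays current.
touch -a -t 200001010000 "$demo/old.txt"

# Preview with -print before trusting -exec rm:
# which files were last accessed more than ten days ago?
old_by_atime=$(find "$demo" -name '*.txt' -atime +10 -print)

# The same test on modification time matches nothing,
# because both files were written moments ago.
old_by_mtime=$(find "$demo" -name '*.txt' -mtime +10 -print)

echo "atime +10: $old_by_atime"
echo "mtime +10: ${old_by_mtime:-(none)}"

rm -rf "$demo"
```

Only old.txt is selected by the -atime test, even though both files are equally "new" as far as their contents are concerned.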
Like most Unix filesystems, the second extended filesystem (“ext2”) that is used on most Linux systems stores a more extensive set of data about files than just their name, size and last-change-date the way systems such as DOS do. It also stores an owner and group, access mode, the dates that the file was last modified and accessed, the date that the file last changed status, and the type. (Don't worry, we'll explain these as we go).
With the exception of the names, all this information is stored for each file and directory in a structure called an inode. In Unix filesystems, directories are simply files that contain a list of filenames with inode numbers.
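One way to see that names live in directories while everything else lives in the inode is to give a single file two names. This is a small sketch with hypothetical file names; it assumes the scratch directory is on a filesystem that supports hard links.

```shell
demo=$(mktemp -d)
echo "hello" > "$demo/original"

# A hard link adds a second directory entry that refers to the same inode.
ln "$demo/original" "$demo/alias"

# ls -i prints the inode number in front of each name.
ls -i "$demo"

inode_original=$(ls -i "$demo/original" | awk '{print $1}')
inode_alias=$(ls -i "$demo/alias" | awk '{print $1}')
echo "original: $inode_original  alias: $inode_alias"

rm -rf "$demo"
```

Both names report the same inode number, so they share one owner, one mode and one set of timestamps.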

Table 1 has a list of inode entry fields and how they are “translated” for the different filesystem types supported by Linux. While this table may not mean much to you yet, it should be self-explanatory by the time you finish reading this article.
Let's analyze the find command line:
find starting-point options criteria action
starting-point: One or more directories from which to start searching. The default is the current directory.
options: Modify the methods used for searching in several ways.
criteria: Specify which files are chosen, and which are ignored. All files found are chosen by default.
action: What to do with the files that are chosen. GNU find has a default action of -print, but standard Unix find has no default action, and will abort and complain unless an action is explicitly provided.
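The four parts can be picked out of a real command line. The sketch below uses a hypothetical scratch directory and labels each part in a comment; -maxdepth stands in for the options slot.

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/notes.txt" "$demo/image.png" "$demo/sub/deep.txt"

# starting-point: "$demo"
# option:         -maxdepth 1  (look no deeper than the starting point's own entries)
# criteria:       -name '*.txt'
# action:         -print
matches=$(find "$demo" -maxdepth 1 -name '*.txt' -print)
echo "$matches"

# With no criteria at all, every file and directory under the
# starting point is chosen, including the starting point itself.
everything=$(find "$demo" -print | sort)
echo "$everything"

rm -rf "$demo"
```

The first command prints only notes.txt; the second prints all five entries.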
The starting-point parameter has two effects on find's actions. The most obvious is that it specifies in which directory (or directories; there can be more than one starting point) to start looking for files. The other effect is on how the chosen filenames are treated, as this example shows:
$ cd /usr/X11/man
$ find man5 -print
man5
man5/XF86Config.5x
man5/pbm.5
man5/pgm.5
man5/pnm.5
man5/ppm.5
$ find /usr/X11/man/man5 -print
/usr/X11/man/man5
/usr/X11/man/man5/XF86Config.5x
/usr/X11/man/man5/pbm.5
/usr/X11/man/man5/pgm.5
/usr/X11/man/man5/pnm.5
/usr/X11/man/man5/ppm.5
When a user is simply looking for a file, this difference in behavior does not matter very much. But when you want to use the output from find to drive another program, it can be very important, depending on the program being driven.
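The same comparison can be reproduced without touching the system directories. This sketch builds a hypothetical man5 directory in a scratch location and captures both forms of output.

```shell
demo=$(mktemp -d)
mkdir "$demo/man5"
touch "$demo/man5/pbm.5" "$demo/man5/pgm.5"

# Relative starting point: names are printed relative to the current directory.
relative=$(cd "$demo" && find man5 -name '*.5' -print | sort)

# Absolute starting point: names are printed as full paths.
absolute=$(find "$demo/man5" -name '*.5' -print | sort)

echo "$relative"
echo "$absolute"

rm -rf "$demo"
```

Relative names are only meaningful to a program that runs from the same directory as find did; absolute names can be handed to a program running anywhere.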
In addition to the starting point, we have control over some other aspects of find's behavior, such as how it should handle soft links, how to evaluate file timestamps and how deep to follow directory structures. These are controlled by options.
The -follow option tells find to follow soft (or symbolic) links to the actual file. A soft link is a file that “points” to another file. To demonstrate this option, create (as a normal user, not as root) a soft link with ln in your home directory that points to a file that belongs to root.
$ cd
$ ln -s /vmlinuz ./kernel
Now use ls to produce a long listing for the file.
$ ls -l kernel
lrwxrwxrwx ... kernel -> /vmlinuz
The first column of the mode, l, tells us it is a soft link. We also are told what file the link “points” to.
Now let's demonstrate the effect of find's -follow option by searching the directory for files belonging to root, first without the option and then with it. (uid 0 is root; we'll cover the -uid option in more detail later.)
$ find . -uid 0 -print
(nothing is printed)
$ find . -follow -uid 0 -print
./kernel
You created the link to the kernel, so you own the link, called ./kernel. But the file /vmlinuz is owned by root.
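The -follow option also changes which directories find descends into. The sketch below is an assumption-laden variation on the example above: it uses hypothetical directories in a scratch location and a soft link to a directory rather than to a root-owned file, so it can be run by any user. Note that the current GNU find manual marks -follow as deprecated in favor of the -L option, although -follow is still accepted.

```shell
demo=$(mktemp -d)
mkdir "$demo/real" "$demo/home"
touch "$demo/real/document"

# A soft link inside home that points at a directory outside it.
ln -s "$demo/real" "$demo/home/link"

# Without -follow, the link is just a link; find does not descend into it.
without_follow=$(find "$demo/home" -name document -print)

# With -follow, find treats the link as the directory it points to.
with_follow=$(find "$demo/home" -follow -name document -print)

echo "without -follow: ${without_follow:-(nothing)}"
echo "with -follow:    $with_follow"

rm -rf "$demo"
```

Only the second search reaches the document, and it reports the name by way of the link.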
The -daystart option modifies the behavior of find when it comes to evaluating time. When -daystart is specified, find will measure days from the beginning of the day instead of from 24 hours ago. (We will cover the parameters related to time later.)
Frequently a user will need to find a file that he or she knows is somewhere on the local hard disk, and not on a mounted CD-ROM or network volume. An easy way to keep find from straying to these other disks is with the -xdev option.
$ find / -name document -print
will cause find to search for the file “document” in every directory under /, which can be very slow with a CD-ROM or network filesystem mounted.
$ find / -xdev -name document -print
will instead cause find to limit its search to the device that / is mounted on. (An alias for -xdev is -mount.) Of course, if you have more than one local filesystem, you will need to search each of them as well. Perhaps
$ find / /usr -xdev -name document -print
if you have two partitions, one for / and one for /usr. Alternately, you can say
$ find / -fstype ext2 -name document -print
if all your local partitions are ext2 filesystems.
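The shape of a search with several starting points can be tried out without real partitions. In this sketch the two directories are hypothetical stand-ins for / and /usr; they live on the same device, so -xdev has nothing to exclude, and the point is only to show how the command line is put together.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/root/etc" "$demo/usr/share"
touch "$demo/root/etc/document" "$demo/usr/share/document"

# Two starting points in one command. -xdev keeps find on the device
# that each starting point lives on.
found=$(find "$demo/root" "$demo/usr" -xdev -name document -print)
echo "$found"

rm -rf "$demo"
```

The starting points are searched in the order given, so the match under the first one is printed first.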
Another way to save time on searches is to use the options related to directory depth.
$ find /usr -maxdepth 4 -name document -print
will limit find's search for document to directories four levels deep or less “under” /usr.
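How the levels are counted is easiest to see on a small tree. This is a sketch with a hypothetical directory layout, where the starting point itself counts as level 0.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/b/c"
touch "$demo/a/document" "$demo/a/b/c/document"

# Levels are counted from the starting point: "$demo" is 0, "$demo/a" is 1,
# "$demo/a/document" is 2 and "$demo/a/b/c/document" is 4.
shallow=$(find "$demo" -maxdepth 2 -name document -print)

# Without a limit, both copies are found.
all=$(find "$demo" -name document -print | sort)

echo "maxdepth 2: $shallow"
echo "no limit:"
echo "$all"

rm -rf "$demo"
```

With -maxdepth 2 only the copy directly inside a is reported; the copy three directories further down is never visited.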
Another option related to directory depth is -depth, which causes the files in a directory to be selected before the directory itself. We'll see later why this is useful.
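With -depth, find processes a directory's contents before the directory itself. A minimal sketch of the resulting order, using a hypothetical scratch directory that holds a single file:

```shell
demo=$(mktemp -d)
mkdir "$demo/dir"
touch "$demo/dir/file"

# Default order: the directory is printed first, then what it contains.
default_order=$(find "$demo/dir" -print)

# With -depth: the contents are printed first, then the directory.
depth_order=$(find "$demo/dir" -depth -print)

echo "default:"
echo "$default_order"
echo "with -depth:"
echo "$depth_order"

rm -rf "$demo"
```

This ordering matters whenever the action would make the directory itself unusable for the entries inside it, such as removing or renaming it.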
The -noleaf option is used for searching filesystems that aren't Unix-like. Table 1 shows the filesystems for which specifying -noleaf may speed up your search.
We already had an example of finding a file by name. Other mechanisms for matching filenames are -path, which matches against the whole path including directory names, -iname, which is similar to -name but case insensitive, and -ipath, which is the case-insensitive version of -path.
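The four name tests can be compared side by side. This sketch uses hypothetical file names in a scratch directory; -iname and -ipath are available in GNU find and are assumed here.

```shell
demo=$(mktemp -d)
mkdir "$demo/Docs" "$demo/src"
touch "$demo/Docs/README.txt" "$demo/src/readme.txt"

# -name compares the pattern with the last component only, case sensitively.
by_name=$(find "$demo" -name 'readme*' -print)

# -iname does the same but ignores case, so both files match.
by_iname=$(find "$demo" -iname 'readme*' -print | sort)

# -path compares the pattern with the whole path, directories included.
by_path=$(find "$demo" -path '*/Docs/*' -print)

# -ipath is -path without regard to case.
by_ipath=$(find "$demo" -ipath '*/docs/*' -print)

echo "-name:  $by_name"
echo "-iname: $by_iname"
echo "-path:  $by_path"
echo "-ipath: $by_ipath"

rm -rf "$demo"
```

Quoting the patterns keeps the shell from expanding them before find sees them.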