Finding Files and More

 in
All about the find command.
Pick and Choose

Criteria allow you to select files.

Each file has access, status, and modification times, and find provides three time-based criteria, one for each of these values. They can be checked in increments of days or minutes, and files can be compared based on these times.

The modification time is set every time the file's contents are changed.

$ find . -mtime +10 -print

will print out files that have not been modified in the past ten days, similar to our second example.

In the previous example we used the plus sign to signify “more than.” In addition to this, find also supports the minus sign to indicate less than.

$ find / -mtime -5 -print

will print out files that were accessed less than 5 days ago. The absence of these operators will cause find to choose exact matches. As mentioned before, the -daystart option will modify the search so that the dates are based on the most recent midnight instead of 24 hours before now.

To use minutes instead of days, use the -mmin criterion.

$ find . -mmin +10 -print

will output files that have been modified more than ten minutes ago.

The -newer criterion

$ find . -newer document -print

will output files that have been modified more recently than document.

The command sets both the access and modification times on files. If the file does not exist, it will be created. We can use it for an example.

$ touch foo

will create a file named “foo” in the current directory, if there isn't already one there. Now,

$ find -mmin 1 -print

should output foo, but

$ find -mmin 2 -print

should not.

For access time, which indicates the last time the files were opened, find has similar options. For days there is -atime, for minutes -amin and for comparisons -anewer.

Status time initially indicates creation time, and then follows any modifications to the file or its inode. It can be used with -ctime, -cmin, and -cnewer. These criteria match files based on the last time a file's ownership, access mode, or other characteristics have been changed.

Find also has a -used option. It will match files that have been accessed since their status was last changed:

find -used +2

will find files that have been used more than two days since their status was last changed.

I've mentioned file modes a few time throughout this article. File modes express which users may perform certain operations on a file, what type of file it is and also some other information about the file. find allows us to match files based on their mode.

Before I go over these options, I will explain file modes and how they are displayed and set.

Users most commonly come in contact with file modes when they concern file ownership and access. A file belongs to an owner and a group, therefore it follows that access is controlled with respect to three entities: owner, group and world. (“World” is made up of users that are not the owner and do not belong to the affiliated group.)

Access is controlled with respect to three actions: Reading, writing (which includes deletion) and execution. Let's look at the output of a long listing with ls.

$ ls -l foo
-rw-rw-r-- 1 eric staff  0 Sep  6 22:55 foo

(I've deleted some of the spaces ls normally creates in order to fit the entire output.) The leftmost column of the output has ten characters that show use foo's mode and file type. From the left, the first is used by ls to show us the type of file. For example, if it were a link or directory we would see an l or d there.

The remaining nine characters show us the access mode. In groups of three, the show us the rights for owner, group, and world, in that order. Each triplet has a field for read r, write w and execute x.

$ chmod 777 foo
$ ls -l foo
-rwxrwxrwx 1 eric staff  0 Sep  6 22:55 foo

We have turned on all permissions for all users on the file “foo”.

The chmod command can use two different kinds of notation, symbolic and octal. While symbolic notation is easier to remember for most people, I used octal notation, because it is the format for modes that find expects. With this notation each number represents the octal permissions for each user class.

The permissions are calculated by adding the following:

  • 4 Read

  • 2 Write

  • 1 Execute

So if you want to give the owner of a file full permissions and group and world only read and execute permissions, you want to “set” all bits for owner, and the read and execute bits for the others:

Owner = 4 + 2 + 1 = 7
Group = 4 + 1     = 5
World = 4 + 1     = 5

So the command would be:

$ chmod 755 program
$ ls -l program
-rwxr-xr-x 1 eric  staff 106410 Sep  6 22:55 program

The listing shows the mode we expected.

Back to find: the -perm criterion accepts this type of notation.

$ find . -perm 777 -print

would find all of the files in and under the current directory that have read, write and execute permissions set for all users.

The -perm option also supports the + and - operators.

$ find . -perm +600 -print

would output any files that are readable or writable by their owner.

$ find . -perm -600 -print

would output any files that are readable and writable by their owner.

Therefore the + acts as a boolean “or” and the - acts as a boolean “and”.

The ability to find files based on their permissions is an important security tool. Later, I will cover some important special file modes, and how find can help protect a system from attacks that use them.

File size is another option offered by find. File sizes may be specified in 512 byte blocks, two byte words, kilobytes or just bytes. Since size is a numeric option + and - are also supported.

$ find . -size +4096k -print

will print the names of any files larger than four megabytes.

$ find . -size -1c -print

will print the names of any files smaller than one byte. The -empty option also matches empty files.

For 512 byte blocks the number should be followed by a “b”, for 2 byte words a “w”.

There is one caveat when searching for files by size. Some files, such as /var/adm/lastlog, have more space allocated than they actually use. These files are known as “sparse” or “holey” files. Like ls, find will report these files by the space they have allocated, not the space they are actually using. If you have any doubt about how much space a file is using, use the du command.

$ ls -l /var/adm/lastlog

reports a size of 16032 (15k) on my system;

$ du -k /var/adm/lastlog

reports only 3k.

Our first example showed us how to find a file when we know the exact name. Find will also accept the * wildcard, but the file name must then be quoted in order to prevent the shell from expanding the file name before passing it to find.

$ find / -name "*gif" -print

will output all of the files ending in “gif” on the entire system.

In addition to simple wildcards, find also supports regular expressions with the -regex option.

$ find . -regex './[0-9].*' -print

will locate any files in the current directory that begin with a number. Note that the regular expression is applied to the entire path, which makes the expression rather difficult to write. For more information about regular expressions see the man pages for grep or the article in the October issue of Linux Journal.

Another search criterion is file type.

$ find / -type d -print

will list all of the directories. Here is a list of the file types and the appropriate letter to use to search for them.

  • b block special files such as a disk device.

  • c character special files such as a terminal device.

  • d directory

  • p named pipe

  • f regular file

  • l symbolic (soft) link

  • s socket

If you are unfamiliar with any of these file types, don't worry. You can learn as you go.

Files can also be matched by user of group id. As demonstrated earlier,

$ find . -uid 0 -print

will output all files belonging to root.

$ find . -uid 120 -print

will output all files belonging to the user with UID 120.

To make things easier,

$ find -user eric -print

will output all files belonging to eric.

Find also has similar options for groups: -gid and -group.

______________________

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState