Finding Files and More

All about the find command.
Pick and Choose

Criteria allow you to select files.

Each file has access, status, and modification times, and find provides three time-based criteria, one for each of these values. They can be checked in increments of days or minutes, and files can be compared based on these times.

The modification time is set every time the file's contents are changed.

$ find . -mtime +10 -print

will print out files that have not been modified in the past ten days, similar to our second example.

In the previous example we used the plus sign to signify “more than.” In addition to this, find also supports the minus sign to indicate less than.

$ find / -mtime -5 -print

will print out files that were modified less than five days ago. Without either operator, find looks for an exact match. As mentioned before, the -daystart option will modify the search so that the dates are based on the most recent midnight instead of 24 hours before now.

To use minutes instead of days, use the -mmin criterion.

$ find . -mmin +10 -print

will output files that have been modified more than ten minutes ago.
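The day and minute criteria can be tried out safely in a scratch directory. The following is a minimal sketch, assuming GNU touch (for its -d option) and file names invented for the demonstration; it back-dates a few files and then selects them by age.

```shell
# Scratch directory; all file names here are made up for the demonstration.
dir=$(mktemp -d)
touch -d '20 days ago' "$dir/old.txt"        # GNU touch: back-date the file
touch -d '2 days ago' "$dir/recent.txt"
touch -d '30 minutes ago' "$dir/fresh.txt"

# Modified more than ten days ago.
older=$(find "$dir" -type f -mtime +10 -print)
# Modified less than five days ago.
newer=$(find "$dir" -type f -mtime -5 -print | sort)
# Modified more than ten minutes ago (true of all three files).
minutes=$(find "$dir" -type f -mmin +10 -print | sort)

echo "$older"
echo "$newer"
echo "$minutes"
rm -rf "$dir"
```

Only old.txt passes the first test, while recent.txt and fresh.txt pass the second.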

The -newer criterion compares a file's modification time against that of another file.

$ find . -newer document -print

will output files that have been modified more recently than document.
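Here is a small sketch of that comparison, again assuming GNU touch -d and a scratch directory with invented file names. One file is older than the reference file and one is newer.

```shell
dir=$(mktemp -d)
touch -d '2 hours ago' "$dir/before.txt"
touch -d '1 hour ago' "$dir/document"
touch "$dir/after.txt"

# Only files modified more recently than "document" are listed;
# "document" is not newer than itself, so it is left out.
result=$(find "$dir" -type f -newer "$dir/document" -print)
echo "$result"
rm -rf "$dir"
```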

The touch command sets both the access and modification times on files. If the file does not exist, it will be created. We can use it for an example.

$ touch foo

will create a file named “foo” in the current directory, if there isn't already one there. Now,

$ find -mmin 1 -print

should output foo, but

$ find -mmin 2 -print

should not.

For access time, which indicates the last time the files were opened, find has similar options. For days there is -atime, for minutes -amin and for comparisons -anewer.

Status time initially indicates creation time, and then follows any modifications to the file or its inode. It can be used with -ctime, -cmin, and -cnewer. These criteria match files based on the last time a file's ownership, access mode, or other characteristics have been changed.
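The difference between modification time and status time shows up as soon as a file's mode changes. The sketch below assumes GNU touch -d and a made-up file name: the contents are back-dated by two hours, but the chmod (and the touch itself) updates the inode now.

```shell
dir=$(mktemp -d)
touch -d '2 hours ago' "$dir/report.txt"   # contents "last modified" two hours ago
chmod 600 "$dir/report.txt"                # changes the inode, not the contents

# Modified within the last five minutes? No.
by_mtime=$(find "$dir" -type f -mmin -5 -print)
# Status changed within the last five minutes? Yes.
by_ctime=$(find "$dir" -type f -cmin -5 -print)

echo "mtime match: ${by_mtime:-none}"
echo "ctime match: ${by_ctime:-none}"
rm -rf "$dir"
```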

Find also has a -used option. It matches files by how long after their last status change they were last accessed:

$ find -used +2

will find files that were last accessed more than two days after their status was last changed.

I've mentioned file modes a few times throughout this article. File modes express which users may perform certain operations on a file, what type of file it is and also some other information about the file. find allows us to match files based on their mode.

Before I go over these options, I will explain file modes and how they are displayed and set.

Users most commonly come in contact with file modes when they concern file ownership and access. A file belongs to an owner and a group, therefore it follows that access is controlled with respect to three entities: owner, group and world. (“World” is made up of users that are not the owner and do not belong to the affiliated group.)

Access is controlled with respect to three actions: Reading, writing (which includes deletion) and execution. Let's look at the output of a long listing with ls.

$ ls -l foo
-rw-rw-r-- 1 eric staff  0 Sep  6 22:55 foo

(I've deleted some of the spaces ls normally creates in order to fit the entire output.) The leftmost column of the output has ten characters that show us foo's mode and file type. From the left, the first is used by ls to show us the type of file. For example, if it were a link or directory we would see an l or d there.

The remaining nine characters show us the access mode. In groups of three, they show us the rights for owner, group, and world, in that order. Each triplet has a field for read (r), write (w) and execute (x); a dash in a field means that right is not granted.

$ chmod 777 foo
$ ls -l foo
-rwxrwxrwx 1 eric staff  0 Sep  6 22:55 foo

With chmod 777 we have turned on all permissions for all users on the file “foo”.

The chmod command can use two different kinds of notation, symbolic and octal. While symbolic notation is easier for most people to remember, I used octal notation because it is the format for modes that find expects. With this notation, each digit represents the permissions for one user class: owner, group and world, in that order.

The permissions are calculated by adding the following:

  • 4 Read

  • 2 Write

  • 1 Execute

So if you want to give the owner of a file full permissions and group and world only read and execute permissions, you want to “set” all bits for owner, and the read and execute bits for the others:

Owner = 4 + 2 + 1 = 7
Group = 4 + 1     = 5
World = 4 + 1     = 5

So the command would be:

$ chmod 755 program
$ ls -l program
-rwxr-xr-x 1 eric  staff 106410 Sep  6 22:55 program

The listing shows the mode we expected.
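The same arithmetic can be written out in the shell. This is only an illustrative sketch: the variable names are made up, and it works on a scratch file created for the purpose.

```shell
read_bit=4; write_bit=2; exec_bit=1

owner=$((read_bit + write_bit + exec_bit))   # 7
group=$((read_bit + exec_bit))               # 5
world=$((read_bit + exec_bit))               # 5
mode="$owner$group$world"
echo "$mode"                                 # prints 755

# Apply the mode to a scratch file and confirm it with find.
dir=$(mktemp -d)
touch "$dir/program"
chmod "$mode" "$dir/program"
match=$(find "$dir" -type f -perm "$mode" -print)
echo "$match"
rm -rf "$dir"
```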

Back to find: the -perm criterion accepts this type of notation.

$ find . -perm 777 -print

would find all of the files in and under the current directory that have read, write and execute permissions set for all users.

The -perm option also supports the + and - operators. (Newer releases of GNU find write the + form with a slash instead, for example -perm /600.)

$ find . -perm +600 -print

would output any files that are readable or writable by their owner.

$ find . -perm -600 -print

would output any files that are readable and writable by their owner.

Therefore the + acts as a boolean “or” and the - acts as a boolean “and”.
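The three forms can be compared side by side on a handful of scratch files. This sketch assumes a recent GNU find, which writes the “or” form as -perm /600 in place of the older +600; the file names are invented.

```shell
dir=$(mktemp -d)
touch "$dir/rw.txt" "$dir/ro.txt" "$dir/shared.txt" "$dir/other.txt"
chmod 600 "$dir/rw.txt"       # owner: read + write
chmod 400 "$dir/ro.txt"       # owner: read only
chmod 644 "$dir/shared.txt"   # owner: read + write, plus read for everyone else
chmod 044 "$dir/other.txt"    # no owner bits at all

exact=$(find "$dir" -type f -perm 600 -print | sort)      # mode is exactly 600
all_bits=$(find "$dir" -type f -perm -600 -print | sort)  # both owner bits set ("and")
any_bit=$(find "$dir" -type f -perm /600 -print | sort)   # either owner bit set ("or")

echo "exact:    $exact"
echo "all bits: $all_bits"
echo "any bit:  $any_bit"
rm -rf "$dir"
```

The exact form matches only rw.txt, the “and” form adds shared.txt, and the “or” form adds ro.txt as well.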

The ability to find files based on their permissions is an important security tool. Later, I will cover some important special file modes, and how find can help protect a system from attacks that use them.

File size is another criterion offered by find. File sizes may be specified in 512-byte blocks, two-byte words, kilobytes or just bytes. Since size is a numeric criterion, + and - are also supported.

$ find . -size +4096k -print

will print the names of any files larger than four megabytes.

$ find . -size -1c -print

will print the names of any files smaller than one byte. The -empty option also matches empty files.

The letter after the number gives the unit: “c” for bytes, “k” for kilobytes, “b” for 512-byte blocks and “w” for two-byte words.
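To see the size tests at work, the sketch below builds three files of known size in a scratch directory (the names are invented) and queries them in bytes, which avoids any question of how larger units are rounded.

```shell
dir=$(mktemp -d)
: > "$dir/empty.txt"                       # 0 bytes
printf 'hello' > "$dir/small.txt"          # 5 bytes
head -c 5000 /dev/zero > "$dir/big.bin"    # 5000 bytes

large=$(find "$dir" -type f -size +1000c -print)   # more than 1000 bytes
tiny=$(find "$dir" -type f -size -1c -print)       # fewer than 1 byte
empty=$(find "$dir" -type f -empty -print)         # the same file, via -empty
exact=$(find "$dir" -type f -size 5c -print)       # exactly 5 bytes

echo "$large"
echo "$tiny"
echo "$empty"
echo "$exact"
rm -rf "$dir"
```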

There is one caveat when searching for files by size. Some files, such as /var/adm/lastlog, have a nominal size larger than the disk space they actually occupy. These files are known as “sparse” or “holey” files. Like ls, find will report these files by their nominal size, not the space they are actually using. If you have any doubt about how much space a file is using, use the du command.

$ ls -l /var/adm/lastlog

reports a size of 16032 (15k) on my system;

$ du -k /var/adm/lastlog

reports only 3k.
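A sparse file is easy to create for experimentation. The sketch below assumes the truncate utility from GNU coreutils and a filesystem that supports holes; the file name is made up. It compares the nominal size with what du reports, and shows which of the two find goes by.

```shell
dir=$(mktemp -d)
truncate -s 1M "$dir/sparse.bin"   # 1 MiB nominal size, no data written

apparent=$(wc -c < "$dir/sparse.bin")            # bytes, as ls -l would show
on_disk=$(du -k "$dir/sparse.bin" | cut -f1)     # kilobytes actually allocated
found=$(find "$dir" -type f -size +512k -print)  # find goes by the nominal size

echo "nominal: $apparent bytes, on disk: ${on_disk}k"
echo "$found"
rm -rf "$dir"
```

On most filesystems du reports far less than 1024k here, yet find still treats the file as larger than 512k.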

Our first example showed us how to find a file when we know the exact name. Find will also accept the * wildcard, but the file name must then be quoted in order to prevent the shell from expanding the file name before passing it to find.

$ find / -name "*gif" -print

will output all of the files ending in “gif” on the entire system.
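A scratch directory is a safer place to try this than the whole system. In the sketch below (file names invented), the quoted pattern reaches find untouched and is matched against each file's base name at every depth.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/logo.gif" "$dir/notes.txt" "$dir/sub/photo.gif"

# Without the quotes, the shell would try to expand *gif against the
# current directory before find ever saw it.
gifs=$(find "$dir" -type f -name '*gif' -print | sort)
echo "$gifs"
rm -rf "$dir"
```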

In addition to simple wildcards, find also supports regular expressions with the -regex option.

$ find . -regex './[0-9].*' -print

will locate any files in the current directory whose names begin with a digit, along with anything beneath a directory whose name does. Note that the regular expression is applied to the entire path and must match all of it, which makes the expression rather difficult to write. For more information about regular expressions see the man pages for grep or the article in the October issue of Linux Journal.
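Because the expression has to match the whole path, a file whose name starts with a digit but sits in a subdirectory is missed by the pattern above. The sketch below assumes GNU find's -regex with its default syntax; the file names and the second pattern are illustrations, not taken from the article.

```shell
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/1file" "$dir/file2" "$dir/sub/3deep"

# Pattern from the text: the path must start with ./<digit>,
# so ./sub/3deep does not qualify.
top=$(cd "$dir" && find . -type f -regex './[0-9].*' -print)

# A variant: a digit right after the last slash, at any depth.
anywhere=$(cd "$dir" && find . -type f -regex '.*/[0-9][^/]*' -print | sort)

echo "$top"
echo "$anywhere"
rm -rf "$dir"
```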

Another search criterion is file type.

$ find / -type d -print

will list all of the directories. Here is a list of the file types and the appropriate letter to use to search for them.

  • b block special files such as a disk device.

  • c character special files such as a terminal device.

  • d directory

  • p named pipe

  • f regular file

  • l symbolic (soft) link

  • s socket

If you are unfamiliar with any of these file types, don't worry. You can learn as you go.
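A few of these types can be created by hand to watch -type tell them apart. The names in this sketch are invented, and only the directory, regular file and symbolic link cases are covered.

```shell
dir=$(mktemp -d)
mkdir "$dir/docs"
touch "$dir/plain.txt"
ln -s plain.txt "$dir/shortcut"

dirs=$(find "$dir" -type d -print | sort)   # the starting directory counts too
files=$(find "$dir" -type f -print)
links=$(find "$dir" -type l -print)

echo "$dirs"
echo "$files"
echo "$links"
rm -rf "$dir"
```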

Files can also be matched by user or group ID. As demonstrated earlier,

$ find . -uid 0 -print

will output all files belonging to root.

$ find . -uid 120 -print

will output all files belonging to the user with UID 120.

To make things easier,

$ find -user eric -print

will output all files belonging to eric.

Find also has similar options for groups: -gid and -group.
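Ownership tests are easiest to check against your own account. This sketch uses id -u to obtain the current numeric UID and a scratch file with a made-up name; the ! operator inverts the test.

```shell
dir=$(mktemp -d)
touch "$dir/mine.txt"

uid=$(id -u)   # numeric ID of whoever runs the script
mine=$(find "$dir" -type f -uid "$uid" -print)
not_mine=$(find "$dir" -type f ! -uid "$uid" -print)

echo "owned by UID $uid: $mine"
echo "owned by someone else: ${not_mine:-none}"
rm -rf "$dir"
```

Replacing -uid "$uid" with -user followed by a login name selects the same files by name instead of number.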
