find is admittedly tricky. Once you get a handle on its abilities, you'll learn to appreciate its trickiness. But before thinking about anything remotely tricky, let's look at a simple find command:
% find . -name "*.c" -print
The . tells find to start its search in the current directory (.) (1.21), and to search all subdirectories of the current directory. The -name "*.c" (17.4) tells find to find files whose names end in .c. The -print operator tells find how to handle what it finds: print the names on standard output.
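If you'd like to try this without touching real files, here is a minimal sketch that builds a scratch directory first. The file and directory names are invented for illustration, and it assumes your system has mktemp (most do, though it isn't guaranteed everywhere).

```shell
# Build a throwaway tree; all names here are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/main.c" "$demo/notes.txt" "$demo/sub/util.c"

# Start at the scratch directory, descend into its subdirectories,
# and print every pathname whose last component ends in .c.
found=$(cd "$demo" && find . -name "*.c" -print | LC_ALL=C sort)
printf '%s\n' "$found"
```

The sort is only there to make the order predictable; find itself prints names in whatever order it reads them from each directory.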
All find commands, no matter how complicated, are really just variations on the one above. You can specify many different names, look for old files, and so on; no matter how complex, you're really only specifying a starting point, some search parameters, and what to do with the files (or directories or links or...) you find.
The key to using find in a more sophisticated way is realizing that search parameters are really "logical expressions" that find evaluates. That is, find:
Looks at every file, one at a time.
Uses the information in the file's inode (1.22) to evaluate an expression given by the command-line operators.
Takes the specified action (e.g., printing the file's name) if the expression's value is "true."
So, something like -name "*.c" is really a logical expression that evaluates to true if the file's name ends in .c.
Once you've gotten used to thinking this way, it's easy to use the AND, OR, NOT, and grouping operators. So let's think about a more complicated find command. Let's look for files that end in .o or .tmp AND that were last accessed more than five days ago, AND print their pathnames. We want an expression that evaluates true for files whose names match either *.o OR *.tmp:
-name "*.o" -o -name "*.tmp"
If either condition is true, we want to check the access time. So we put the expression above within parentheses (quoted (8.14) with backslashes so the shell doesn't treat the parentheses as subshell operators (13.7)). We also add a -atime operator (17.5):
-atime +5 \( -name "*.o" -o -name "*.tmp" \)
The parentheses force find to evaluate what's inside as a unit. The expression is true if "the access time is more than 5 days ago and \( either the name ends with .o or the name ends with .tmp \)." If you didn't use parentheses, the expression would mean something different:
-atime +5 -name "*.o" -o -name "*.tmp" Wrong!
When find sees two operators next to each other with no -o between, that means AND. So the "wrong" expression is true if "either \( the access time is more than 5 days ago and the name ends with .o \) or the name ends with .tmp." This incorrect expression would be true for any name ending with .tmp, no matter how recently the file was accessed - the -atime doesn't apply. (There's nothing really "wrong" or illegal in this second expression - except that it's not what we want. find will accept the expression and do what we asked - it just won't do what we want.)
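To see the difference for yourself, here is a small sketch under some assumptions: the four file names are made up, mktemp is available, and touch accepts the POSIX -a and -t options for backdating an access time. The "wrong" expression is wrapped in one outer pair of parentheses so that -print applies to the whole thing; otherwise -print would be ANDed only with the last -name.

```shell
# Scratch files: two with an old access time, two freshly created.
demo=$(mktemp -d)
touch "$demo/old.o" "$demo/old.tmp" "$demo/new.o" "$demo/new.tmp"
# Set only the access time (-a) of the "old" files to 1 January 2000.
touch -a -t 200001010000 "$demo/old.o" "$demo/old.tmp"

# What we want: old AND (either name).
right=$(cd "$demo" &&
  find . -atime +5 \( -name "*.o" -o -name "*.tmp" \) -print | LC_ALL=C sort)

# The ungrouped version: (old AND *.o) OR *.tmp.
wrong=$(cd "$demo" &&
  find . \( -atime +5 -name "*.o" -o -name "*.tmp" \) -print | LC_ALL=C sort)

printf 'right:\n%s\nwrong:\n%s\n' "$right" "$wrong"
```

The first command should list only old.o and old.tmp; the second also lists new.tmp, because a name ending in .tmp satisfies the OR regardless of access time.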
The following command, which is what we want, lists files in the current directory and subdirectories that match our criteria:
% find . -atime +5 \( -name "*.o" -o -name "*.tmp" \) -print
What if we wanted to list all files that do not match these criteria? All we want is the logical inverse of this expression. The NOT operator is ! (exclamation point). The ! operator applies to the expression on its right. Since we want it to apply to the entire expression, and not just the -atime operator, we'll have to group everything from -atime to "*.tmp" within another set of parentheses.
% find . ! \( -atime +5 \( -name "*.o" -o -name "*.tmp" \) \) -print
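Here is a hedged sketch of the negated command run against a scratch directory. As before, the file names are hypothetical, and it assumes mktemp and a touch that understands -a and -t.

```shell
# Same kind of scratch tree: two backdated files, two fresh ones.
demo=$(mktemp -d)
touch "$demo/old.o" "$demo/old.tmp" "$demo/new.o" "$demo/new.tmp"
touch -a -t 200001010000 "$demo/old.o" "$demo/old.tmp"

# Everything that does NOT satisfy "old AND (*.o OR *.tmp)".
others=$(cd "$demo" &&
  find . ! \( -atime +5 \( -name "*.o" -o -name "*.tmp" \) \) -print | LC_ALL=C sort)
printf '%s\n' "$others"
```

Notice that the starting directory (.) shows up in the output too: it fails the original test, so it passes the inverted one. Negating an expression matches everything else find visits, directories included.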
For that matter, even -print is an expression; it always evaluates to true. So are -exec and -ok (17.10); they evaluate to true when the command they execute returns a zero status. (There are a few situations in which this can be used to good effect; see article 17.11 for some of those.) Article 17.12 has more about find expressions.
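As a rough illustration of -exec acting as a test, the sketch below uses grep -q as the command; grep exits with status 0 only when it finds a match, so -print is reached only for files that contain the pattern. The file names and contents are invented, and mktemp is assumed to be available.

```shell
# Two hypothetical source files; only one mentions "main".
demo=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$demo/prog.c"
printf '/* helpers only */\n' > "$demo/util.c"

# -exec is true only when grep returns zero status, so -print
# runs only for the files in which grep found the pattern.
hits=$(cd "$demo" &&
  find . -name "*.c" -exec grep -q main {} \; -print | LC_ALL=C sort)
printf '%s\n' "$hits"
```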
But before you try anything too complicated, you need to realize one thing. find isn't as sophisticated as you might like it to be. You can't squeeze all the spaces out of expressions, as if it were a real programming language. You need spaces before and after operators like !, \(, \), and {}, in addition to spaces before and after every other operator. Therefore, a command line like the following won't work:
% find . !\(-atime +5 \(-name "*.o" -o -name "*.tmp"\)\) -print
A true power user will realize that find is relying on the shell to separate the command line into meaningful chunks (8.5), or tokens. And the shell, in turn, is assuming that tokens are separated by spaces. When the shell gives find a chunk of characters like *.tmp)) (without the double quotes or backslashes - the shell took them away), find gets confused; it thinks you're talking about a weird filename pattern that includes a couple of parentheses.
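One way to see what find actually receives is to hand the same arguments to a little function that prints each one in brackets. This is only a sketch: show_args is a made-up helper, and it's meant to be run from a script (an interactive shell with history substitution may have its own ideas about the ! character).

```shell
# Hypothetical helper: print each argument on its own line, in brackets.
show_args() { printf '[%s]\n' "$@"; }

# The squeezed-together command line: the shell glues operators to their neighbors.
bad=$(show_args . !\(-atime +5 \(-name "*.o" -o -name "*.tmp"\)\) -print)

# The properly spaced command line: every operator is its own token.
good=$(show_args . ! \( -atime +5 \( -name "*.o" -o -name "*.tmp" \) \) -print)

printf 'squeezed:\n%s\n\nspaced:\n%s\n' "$bad" "$good"
```

In the squeezed version you should see tokens such as [!(-atime] and [*.tmp))], which find can't recognize as operators. In the spaced version, !, (, and ) each arrive as separate arguments.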
Once you start thinking about expressions, find's syntax ceases to be obscure - in some ways, it's even elegant. It certainly allows you to say what you need to say with reasonable efficiency.