Unix Power Tools

38.10. Using Wildcards with tar

When extracting files from a tar archive, it's handy to be able to use wildcards. You have to protect them (Section 27.12) from the shell, so that they are passed directly to tar.

38.10.1. Without GNU tar

In general, tar can't do wildcard matching on the filenames within an archive. There's a terribly ugly hack that you can use to select the files you want anyway. Try a command like this:

`...` Section 28.14

% tar xvf /dev/rst0 `tar tf /dev/rst0 | egrep 'lib/(foo|bar)'`

What you're doing here is using tar twice. The tar t inside the backquotes prints the names of all the files on the tape. The pattern supplied to egrep (Section 13.4) selects the pathnames containing lib/foo or lib/bar, and the resulting filenames are passed to the first tar command, which actually extracts the files from the archive. Note that these patterns are regular expressions, not wildcards (Section 32.2).
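If you'd like to try the two-pass trick without a tape drive, here's a minimal sketch that runs it against a scratch archive. The directory and file names (src/lib/foo and so on) are made up for illustration, $(...) is used as the modern equivalent of the backquotes, and grep -E stands in for egrep:

```shell
# Build a small throwaway archive in a temporary directory.
# All names below are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p src/lib/foo src/lib/bar src/doc
echo foo > src/lib/foo/a.c
echo bar > src/lib/bar/b.c
echo doc > src/doc/readme
tar cf demo.tar src
rm -rf src

# Pass 1 (inside the command substitution) lists the members and filters them;
# pass 2 extracts only the names that survived the filter.
# The substitution is left unquoted on purpose so the shell splits the list
# into separate arguments.
tar xf demo.tar $(tar tf demo.tar | grep -E 'lib/(foo|bar)')

# Show which regular files came back out of the archive.
extracted=$(find src -type f | sort)
echo "$extracted"
```

Because the shell splits the filtered list on whitespace, this approach misbehaves on member names that contain spaces, and a very long list can overflow the command line.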

Here's another subtle but important point. Because the regular expression patterns in the example above are not anchored with ^ or $ characters (Section 32.4), they can match anywhere in the file pathnames. So lib/(foo|bar) would match a pathname like lib/foo as well as a pathname like /usr/lib/glib/foo.h.
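To see the difference anchoring makes, here's a small sketch that runs an unanchored and an anchored pattern over the same list of names. The names are invented stand-ins for what tar t might print, and the anchored pattern is just one way you might tighten the match:

```shell
# Hypothetical member names, one per line, as "tar tf" would print them.
names='lib/foo
lib/bar/x.c
usr/lib/glib/foo.h
src/lib/food'

# Unanchored: matches "lib/foo" or "lib/bar" anywhere in the pathname.
loose=$(printf '%s\n' "$names" | grep -E 'lib/(foo|bar)')

# Anchored: the name must start with lib/, and foo or bar must be a whole
# path component (followed by a slash or by the end of the name).
strict=$(printf '%s\n' "$names" | grep -E '^lib/(foo|bar)(/|$)')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$loose" "$strict"
```

The unanchored pattern selects all four names; the anchored one keeps only lib/foo and lib/bar/x.c.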

38.10.2. With GNU tar

Go to http://examples.oreilly.com/upt3 for more information on GNU tar.

One of the many improvements in GNU tar is that it understands wildcards in the names of files within an archive. (Remember that because you want tar, not the shell, to see these wildcards, you have to quote (Section 27.12) the filename arguments that have wildcards.)

Unlike the examples in the previous section, GNU tar uses wildcards, not regular expressions (Section 32.2). And unlike shell wildcards, the wildcard expressions in GNU tar can match across slashes (/) in pathnames.

Here's a demonstration of using wildcards: we want to extract all subdirectories named editor. Command 1 shows how you'd do it in non-GNU tar: list the exact pathnames of the subdirectories. Notice that tar extracts the directory and any subdirectories too. Command 2 shows the easy way to do the job with GNU tar: make a wildcard expression that ends with a slash and the directory name. As before, tar extracts the directory and any subdirectories. What if you want to extract anything with the string editor in its name -- including individual files? Drop the slash and surround the name with stars, as in command 3. Finally, command 4 shows how a wildcard can match across the / characters in pathnames, which shell wildcards don't do. Command 4 extracts only directories named editor somewhere (possibly several layers) underneath a directory named skin:

> Section 28.12

1$ tar xvf mozilla.tar package/chrome/en-US/locale/en-US/editor \
> package/chrome/classic/skin/classic/content/editor \
> ...
package/chrome/en-US/locale/en-US/editor/
package/chrome/en-US/locale/en-US/editor/contents.rdf
package/chrome/en-US/locale/en-US/editor/editor.dtd
  ...
package/chrome/classic/skin/classic/editor/
package/chrome/classic/skin/classic/editor/EditModeTabs.css
  ...
package/chrome/classic/skin/classic/editor/images/
package/chrome/classic/skin/classic/editor/images/Map_Copy.gif
  ...
2$ tar xvf mozilla.tar '*/editor'
package/chrome/en-US/locale/en-US/editor/
package/chrome/en-US/locale/en-US/editor/contents.rdf
package/chrome/en-US/locale/en-US/editor/editor.dtd
  ...
package/chrome/classic/skin/classic/editor/
package/chrome/classic/skin/classic/editor/EditModeTabs.css
  ...
package/chrome/classic/skin/classic/editor/images/
package/chrome/classic/skin/classic/editor/images/Map_Copy.gif
  ...
3$ tar xvf mozilla.tar '*editor*'
package/defaults/pref/editor.js
package/components/editor.xpt
  ...
package/chrome/en-US/locale/en-US/editor/
package/chrome/en-US/locale/en-US/editor/contents.rdf
package/chrome/en-US/locale/en-US/editor/editor.dtd
  ...
package/chrome/comm/content/communicator/editorBindings.xul
package/chrome/comm/content/communicator/search/search-editor.js
  ...
4$ tar xvf mozilla.tar '*/skin/*/editor'
package/chrome/classic/skin/classic/editor/
package/chrome/classic/skin/classic/editor/EditModeTabs.css
  ...
package/chrome/classic/skin/classic/editor/images/
package/chrome/classic/skin/classic/editor/images/Map_Copy.gif
  ...
package/chrome/blue/skin/blue/editor/
package/chrome/blue/skin/blue/editor/contents.rdf
  ...
package/chrome/blue/skin/blue/editor/images/
package/chrome/blue/skin/blue/editor/images/Map_Copy.gif
  ...
package/chrome/modern/skin/modern/editor/
package/chrome/modern/skin/modern/editor/contents.rdf
  ...

There's more about wildcard matching in the GNU tar info page (but not its manpage).
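Here's a minimal sketch you can run to watch the patterns from commands 2 and 4 at work on a scratch archive. It assumes GNU tar, and the pkg/... names are invented, loosely modeled on the listing above. One more assumption to flag: newer GNU tar releases apply wildcards to member names during listing and extraction only when you ask with --wildcards, so the sketch passes that option explicitly. It uses tar t rather than tar x, so nothing is extracted:

```shell
# Assumes GNU tar. Build a throwaway tree with hypothetical names.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p pkg/chrome/classic/skin/classic/editor/images
mkdir -p pkg/chrome/en-US/locale/editor pkg/defaults
echo css > pkg/chrome/classic/skin/classic/editor/EditModeTabs.css
echo gif > pkg/chrome/classic/skin/classic/editor/images/Map_Copy.gif
echo dtd > pkg/chrome/en-US/locale/editor/editor.dtd
echo js  > pkg/defaults/editor.js
tar cf demo.tar pkg

# Single quotes keep the shell from expanding the stars; tar sees them as typed.
# Every directory named editor, at any depth, plus whatever is inside it:
all_editor_dirs=$(tar tf demo.tar --wildcards '*/editor' | sort)

# Only editor directories somewhere beneath a directory named skin.
# Each star is free to match across slashes:
under_skin=$(tar tf demo.tar --wildcards '*/skin/*/editor' | sort)

printf '%s\n\n%s\n' "$all_editor_dirs" "$under_skin"
```

Neither pattern should pick up pkg/defaults/editor.js, because both end in a slash followed by the bare directory name.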

38.10.3. Wildcard Gotchas in GNU tar

One subtle (but sometimes important!) difference between GNU tar and other versions comes when you're trying to extract a file whose name contains a wildcard character. You'll probably have to type a backslash (\) before that character. Because the shell may try to interpret the wildcard character -- or at least the backslash -- you may also need to quote the backslashed expression! (Section 27.18 shows a situation like this one, with multiple layers of backslash interpretation.)

Here's an example. We're archiving the system binary file named [ (or test) (Section 35.26):

5$ which [
/usr/bin/[
6$ cd /tmp
7$ tar cvf tartest.tar /usr/bin/[
tar: Removing leading `/' from member names
usr/bin/[
8$ tar xvf tartest.tar usr/bin/[
tar: usr/bin/[: Not found in archive
tar: Error exit delayed from previous errors
9$ tar xvf tartest.tar usr/bin/\[
tar: usr/bin/[: Not found in archive
tar: Error exit delayed from previous errors
10$ tar xvf tartest.tar 'usr/bin/\['
usr/bin/[

Storing the file in the archive, in command 7, is simple. The shell doesn't have anything to expand; tar simply stores the name as-is because wildcards make sense only when tar is extracting from an archive. In command 8, though, when we try to extract the file into the current directory, tar says it isn't there -- because it's now treating [ as a wildcard pattern. Adding a single backslash in command 9 doesn't help because the shell strips it off before tar sees it. We need to put quotes around the backslash, as in command 10, to make tar match the actual name.
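If you don't want to experiment on /usr/bin, here's a minimal sketch of the same gotcha in a scratch directory. It assumes GNU tar; the bin directory is made up for the demonstration. The --wildcards option is given explicitly because newer GNU tar releases don't pattern-match member names by default, and --no-wildcards, which isn't covered above, is shown as another way out that tells tar to take the name verbatim:

```shell
# Assumes GNU tar. Archive a file literally named "[" (hypothetical directory).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir bin
echo demo > 'bin/['
tar cf tartest.tar bin
rm -rf bin

# With pattern matching in force, a bare "[" would start a bracket expression,
# so it is escaped for tar. The single quotes stop the shell from removing
# the backslash before tar sees it.
escaped=$(tar tf tartest.tar --wildcards 'bin/\[')

# Alternative: switch pattern matching off and give the name as-is.
verbatim=$(tar tf tartest.tar --no-wildcards 'bin/[')

printf '%s\n%s\n' "$escaped" "$verbatim"
```

Both commands should list the single member bin/[.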

--JP and TOR




Copyright © 2003 O'Reilly & Associates. All rights reserved.