
UNIX Power Tools

Chapter 9: Saving Time on the Command Line

9.18 Process Substitution

Do you find yourself making temporary files, then giving those files to some commands to read? For example, maybe you want to compare two files with comm (28.12), but comm needs sorted files, and these files aren't sorted. So you have to type:

bash$ sort file1 > /tmp/file1.sort
bash$ sort file2 > /tmp/file2.sort
bash$ comm /tmp/file1.sort /tmp/file2.sort

There are easier ways to do that.

9.18.1 bash Process Substitution

bash has the operator <(process). It runs a process and connects that process's output to a named pipe. The filename of the named pipe then becomes a command-line argument. (On systems that have /dev/fd, bash may use a /dev/fd filename instead; the idea is the same.) Here's an example that shows two unsorted files and the result:

bash$ cat file1
rcsdiff.log
runsed
runsed.new
echo.where
foo
bash$ cat file2
newprogram
runsed
echo.where
foo

bash$ comm <(sort file1) <(sort file2)
                echo.where
                foo
        newprogram
rcsdiff.log
                runsed
runsed.new

(In the first column, comm shows lines only in file1. The second column shows lines only in file2. The third column shows lines that were in both files.)

Let's take a closer look at how that works. When you set the -x option (8.17), the shell displays the processes it runs, with a + before each top-level process and ++ before second-level processes:

bash$ set -x
bash$ comm <(sort file1) <(sort file2)
+ comm /tmp/sh-np-a11167 /tmp/sh-np-b11167
++ sort file1
++ sort file2
                echo.where
                foo
        newprogram
rcsdiff.log
                runsed
runsed.new

The shell made its named pipes in /tmp. bash ran each sort command, sent its output to a named pipe, and put the pipe's filename on the command line. When the comm program finished, the named pipes were deleted.
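
To make the mechanism concrete, here is a rough sketch of doing by hand what bash did above: make two named pipes, run each sort in the background with its output going into a pipe, and hand the pipe names to comm. The scratch directory, the pipe names pipe1 and pipe2, and the use of mktemp -d (widely available, but not on every system) are my own choices for illustration; bash picks its own names.

```sh
# Use one collating order for both sort and comm.
LC_ALL=C; export LC_ALL

# Work in a scratch directory (assumes mktemp -d is available).
dir=`mktemp -d` && cd "$dir" || exit 1

# Sample data: the two unsorted files from the example.
printf '%s\n' rcsdiff.log runsed runsed.new echo.where foo > file1
printf '%s\n' newprogram runsed echo.where foo > file2

# Make two named pipes (FIFOs); the names are arbitrary.
mkfifo pipe1 pipe2

# Start each sort in the background, writing into its pipe.
# Each one waits until some program opens the pipe for reading.
sort file1 > pipe1 &
sort file2 > pipe2 &

# comm reads the pipes as if they were ordinary sorted files.
result=`comm pipe1 pipe2`
printf '%s\n' "$result"

# Wait for the sorts, then remove the pipes and the scratch directory.
wait
cd / && rm -rf "$dir"
```

The difference with <(process) is that bash does the bookkeeping (making the pipes, starting the processes, cleaning up) for you.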

I've run into problems with this operator in some cases: the process reading from a named pipe would "hang" and never produce any output. For example, that happened when I replaced comm with diff: I got no output from diff. I worked around the problem by closing the standard output of each process with the >&- operator (45.21), like this:

bash$ diff <(sort file1; exec >&-) <(sort file2; exec >&-)

That made diff happy; it showed me the differences between the two sorted files.

bash also has a similar operator, >(process), which runs a process that takes its input from a named pipe. Whatever a command writes to that filename becomes the standard input of the process.
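
As a small sketch of >( ) in use: tee copies its input both to its standard output and to each file named on its command line, and with >( ) one of those "files" can really be another process. The filenames copy.txt and count.txt and the scratch directory are just examples. The wait $! line assumes a fairly recent bash (4.4 or later, as far as I know), where $! holds the process ID of the last process substitution; bash does not otherwise wait for the >( ) process to finish.

```bash
#!/bin/bash
# Work in a scratch directory (assumes mktemp -d is available).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' rcsdiff.log runsed runsed.new echo.where foo > file1

# tee writes one copy of file1 to copy.txt (its standard output)
# and another copy to the pipe that feeds wc -l.
tee >(wc -l > count.txt) < file1 > copy.txt

# Assumption: bash 4.4 or later. Wait for wc to finish writing.
wait $!

count=$(cat count.txt)
echo "lines counted: $count"

# Check that the copy is identical, then clean up.
cmp -s file1 copy.txt && same=yes || same=no
cd / && rm -rf "$dir"
```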

9.18.2 Automatic Temporary Files with !

If you don't have bash, you can use the shell script named ! (an exclamation point) [2] that runs a command, stores its output in a temporary file, then puts the temporary filename on its standard output. You use it with backquotes (9.16) (``). Here's how to write the example from the previous section:

[2] The C shell also uses an exclamation point as its history character (11.1, 11.15), but not if there's a space after the exclamation point, so this script doesn't conflict with csh history. bash uses the exclamation point to reverse the exit status of a command; but then, if you're using bash, you don't need our ! script.

% comm `! sort file1` `! sort file2`
                echo.where
                foo
        newprogram
rcsdiff.log
                runsed
runsed.new

Why didn't I use the command line below, without the ! script?

% comm `sort file1` `sort file2`

That's because the comm program (like most UNIX programs) needs filename arguments. Backquotes by themselves would put the sorted contents of file1 and file2, word by word, on the comm command line, where comm expects the names of two files.
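
One way to see this for yourself is to let the shell count the arguments that the backquotes produce. This sketch uses set -- to load the result into the positional parameters; the scratch directory and the variable names nargs and first are only there to keep the example self-contained.

```sh
# Work in a scratch directory (assumes mktemp -d is available).
dir=`mktemp -d` && cd "$dir" || exit 1
printf '%s\n' rcsdiff.log runsed runsed.new echo.where foo > file1

# Backquotes replace the command with its output, split into words.
# set -- stores those words as $1, $2, and so on.
set -- `sort file1`
nargs=$#
first=$1
echo "$nargs arguments; the first is $first"

cd / && rm -rf "$dir"
```

Each line of the sorted file has become a separate argument, not a filename that holds the sorted text.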

To see what's happening, you can use a Bourne shell and set its -x option (8.17); the shell will display the commands it runs with a + before each one:

$ set -x
$ comm `! sort file1` `! sort file2`
+ ! sort file1 
+ ! sort file2 
+ comm /tmp/bang3969 /tmp/bang3971 
                echo.where
                foo
        newprogram
rcsdiff.log
                runsed
runsed.new

The script made its temporary files (21.3) in /tmp. You should probably remove them. If you're the only one using this script, you might be able to get away with a command like:

% rm /tmp/bang[1-9]*

If your system has more than one user, it's safer to use find (17.1):

% find /tmp -name 'bang*' -user myname -exec rm {} \;

If you use this script much, you might make that cleanup command into an alias (10.2) or a shell script, or start it in the background (1.26) from your .logout file (3.1, 3.2).
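
For instance, the cleanup might be wrapped up as a small Bourne shell function along these lines. The name bangclean and the optional directory argument are made up for this sketch. It uses id -u to supply your numeric user ID so you don't have to type your login name; find should accept a number after -user when no login name matches it.

```sh
# bangclean: remove leftover bang* files that belong to me.
# Usage: bangclean [directory]    (the default directory is /tmp)
bangclean() {
    # 2>/dev/null hides complaints about directories we can't read.
    find "${1:-/tmp}" -name 'bang*' -user "`id -u`" -type f \
        -exec rm -f {} \; 2>/dev/null
}
```

Like the find command above, this removes any file of yours whose name starts with bang, so pick a more distinctive prefix if that worries you.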

Here's the ! script. Of course, you can change the name to something besides ! if you want.

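#! /bin/sh
# !: run a command, save its output in a temporary file,
# and print the temporary file's name on standard output.

temp=/tmp/bang$$    # $$ is this script's process ID

case $# in
0)  # No command given: complain, but still print the filename
    echo "Usage: `basename $0` command [args]" 1>&2
    echo "$temp"
    exit 1
    ;;

*)  # Run the command with its arguments; save its output
    "$@" > "$temp"
    echo "$temp"
    ;;
esac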
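
If you'd rather not depend on a predictable name like /tmp/bang$$, one possible variation (not the original script) is a Bourne shell function built on mktemp. That assumes your system has mktemp, which is common but not universal. The function name bang is also just a choice for this sketch, since ! is awkward as a function name.

```sh
# bang: run a command, save its output in a fresh temporary file,
# and print that file's name on standard output.
bang() {
    # mktemp creates the file and prints its (unpredictable) name.
    temp=`mktemp /tmp/bang.XXXXXX` || return 1
    "$@" > "$temp"
    echo "$temp"
}
```

You'd use it the same way as the script, inside backquotes, with bang sort file1 in place of ! sort file1. A find cleanup that looks for names matching 'bang*' still catches these files.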

- JP

