open FILEHANDLE, EXPR
open FILEHANDLE
This function opens the file whose filename is given by EXPR, and associates it with FILEHANDLE. If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE must contain the filename. (And you must also be careful to use "or die" after the statement rather than "|| die", because the precedence of || is higher than list operators like open.)
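The precedence point can be checked with a small sketch. The path below is a hypothetical one that is assumed not to exist, and the handle name MISSING is purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this system.
my $missing = "/nonexistent-dir-for-demo/file.txt";

# "or" binds more loosely than the list operator, so die sees open's result.
my $or_ok  = eval { open MISSING, $missing or die "or-die fired\n"; 1 };
my $or_err = $@;

# "||" binds more tightly: this parses as open(MISSING, ($missing || die ...)),
# and because $missing is a true string, die is never reached.
my $pipe_ok  = eval { open MISSING, $missing || die "||-die fired\n"; 1 };
my $pipe_err = $@;

print "or form died: ", ($or_ok   ? "no" : "yes"), "\n";
print "|| form died: ", ($pipe_ok ? "no" : "yes"), "\n";
```

With "||" the failed open goes unnoticed, which is exactly the trap the parenthetical warns about.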
FILEHANDLE may be a directly specified filehandle name, or an expression whose value will be used for the filehandle. The latter is called an indirect filehandle. If you supply an undefined variable for the indirect filehandle, Perl will not automatically fill it in for you; you have to make sure the expression returns something unique, either a string specifying the actual filehandle name, or a filehandle object from one of the object-oriented I/O packages. (A filehandle object is unique because you call a constructor to generate the object. See the example later in this section.)
After the filehandle is determined, the filename string is processed. First, any leading and trailing whitespace is removed from the string. Then the string is examined on both ends for characters specifying how the file is to be opened. (By an amazing coincidence, these characters look just like the characters you'd use to indicate I/O redirection to the Bourne shell.) If the filename begins with < or nothing, the file is opened for input. If the filename begins with >, the file is truncated and opened for output. If the filename begins with >>, the file is opened for appending. (You can also put a + in front of the > or < to indicate that you want both read and write access to the file.) If the filename begins with |, the filename is interpreted as a command to which output is to be piped, and if the filename ends with a |, the filename is interpreted as a command which pipes input to us.
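The file modes described above can be exercised with a short sketch. It assumes the File::Temp module is available for a scratch directory and that the scratch path contains no leading or trailing whitespace; the handle name DEMO and the file name are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory for the demonstration; removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/modes.txt";

# ">" truncates (or creates) the file and opens it for output.
open DEMO, ">$path" or die "Can't write $path: $!";
print DEMO "first\n";
close DEMO or die "Can't close $path: $!";

# ">>" opens for appending, so the existing line is kept.
open DEMO, ">>$path" or die "Can't append to $path: $!";
print DEMO "second\n";
close DEMO or die "Can't close $path: $!";

# "<" opens for input.
open DEMO, "<$path" or die "Can't read $path: $!";
my @lines = <DEMO>;
close DEMO;

# "+<" gives read and write access without truncating: overwrite one byte.
open DEMO, "+<$path" or die "Can't update $path: $!";
print DEMO "F";
close DEMO or die "Can't close $path: $!";

# No prefix at all also means input.
open DEMO, $path or die "Can't read $path: $!";
my $first = <DEMO>;
close DEMO;

print @lines;   # the two lines as written
print $first;   # first line after the in-place update
```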
You may not have an open command that pipes both in and out, although the IPC::Open2 and IPC::Open3 library routines give you a close equivalent. See the section "Bidirectional Communication" in Chapter 6.
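As a rough sketch of that close equivalent, the following uses open2 from IPC::Open2 to talk to a child in both directions. The child here is simply the running perl (found through $^X) upper-casing its input; that choice, and the handle names FROM_CHILD and TO_CHILD, are assumptions made for the illustration.

```perl
use strict;
use warnings;
use IPC::Open2;

# Start a child that upper-cases each input line. open2 takes the handle
# that reads from the child first, then the handle that writes to it.
my $pid = open2(\*FROM_CHILD, \*TO_CHILD, $^X, '-ne', 'print uc');

print TO_CHILD "hello\n";
close TO_CHILD;              # send end-of-file so the child can finish

my $reply = <FROM_CHILD>;
close FROM_CHILD;
waitpid $pid, 0;             # reap the child

print $reply;
```

Closing the write side before reading avoids the deadlock that arises when both processes wait on each other.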
Any pipe command containing shell metacharacters is passed to /bin/sh for execution; otherwise it is executed directly by Perl. The filename "-" refers to STDIN, and ">-" refers to STDOUT.
open returns non-zero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the process ID of the subprocess.
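A small sketch of both return values follows. It assumes a UNIX-like system with an echo command on the path; the handle name ECHO and the nonexistent path are illustrative.

```perl
use strict;
use warnings;

# "echo hello" contains no shell metacharacters, so Perl runs echo directly.
# For a piped open the true return value is the child's process ID.
my $pid = open ECHO, "echo hello |";
die "Can't start echo: $!" unless $pid;

my $line = <ECHO>;
close ECHO or warn "echo exited with status $?\n";

# A failed open of an ordinary file returns the undefined value.
my $failed = open ECHO, "</nonexistent-dir-for-demo/file.txt";

print "child $pid said: $line";
print "failed open returned: ", (defined $failed ? "a defined value" : "undef"), "\n";
```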
If you're unfortunate enough to be running Perl on a system that distinguishes between text files and binary files (modern operating systems don't care), then you should check out binmode for tips on dealing with this. The key distinction between systems that need binmode and those that don't is their text file formats. Systems like UNIX and Plan9 that delimit lines with a single character, and that encode that character in C as '\n', do not need binmode. The rest need it.
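A minimal sketch of the usual pattern: call binmode right after open, before any I/O. On systems that do not need it the call is harmless. The scratch file and the handle name RAW are illustrative, and File::Temp is assumed to be available.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir   = tempdir(CLEANUP => 1);
my $path  = "$dir/raw.bin";
my $bytes = "a\r\nb\n";           # five bytes that must not be translated

open RAW, ">$path" or die "Can't write $path: $!";
binmode RAW;                      # no line-ending translation on output
print RAW $bytes;
close RAW or die "Can't close $path: $!";

open RAW, "<$path" or die "Can't read $path: $!";
binmode RAW;                      # nor on input
my $copy = do { local $/; <RAW> };   # slurp the whole file
close RAW;

my $size = -s $path;
printf "%d bytes on disk, %d bytes read back\n", $size, length $copy;
```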
Here is some code that shows the relatedness of a filehandle and a variable of the same name:
    $ARTICLE = "/usr/spool/news/comp/lang/perl/misc/38245";
    open ARTICLE or die "Can't find article $ARTICLE: $!\n";
    while (<ARTICLE>) {...
Append to a file like this:
open LOG, '>>/usr/spool/news/twitlog'; # (`log' is reserved)
Pipe your data from a process:
open ARTICLE, "caesar <$article |"; # decrypt article with rot13
Here < does not indicate that Perl should open the file for input, because < is not the first character of EXPR. Rather, the concluding | indicates that input is to be piped from caesar <$article (from the program caesar, which takes $article as its standard input). The < is interpreted by the subshell that Perl uses to start the pipe, because < is a shell metacharacter.
Or pipe your data to a process:
open EXTRACT, "|sort >/tmp/Tmp$$"; # $$ is our process number
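For a version that can be run end to end, here is a sketch that pipes a few lines to sort and then reads the result back. It assumes a UNIX-like system with sort and /bin/sh; the scratch file, the sample words, and the handle names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $sorted = "$dir/sorted.txt";

# The leading | makes this an output pipe; the > inside the command is
# handled by the shell, not by Perl.
open SORTER, "|sort >$sorted" or die "Can't start sort: $!";
print SORTER "$_\n" for qw(pear apple mango);
close SORTER or die "sort failed with status $?";   # close waits for sort

open RESULT, "<$sorted" or die "Can't read $sorted: $!";
my @lines = <RESULT>;
close RESULT;

print @lines;
```

Checking the return value of close matters here, because that is where a failure of the command itself shows up.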
In this next example we show one way to do recursive opens, via indirect filehandles. The files will be opened on filehandles fh01, fh02, fh03, and so on. Because $input is a local variable, it is preserved through recursion, allowing us to close the correct file before we return.
    # Process argument list of files along with any includes.

    foreach $file (@ARGV) {
        process($file, 'fh00');
    }

    sub process {
        local($filename, $input) = @_;
        $input++;               # this is a string increment
        unless (open $input, $filename) {
            print STDERR "Can't open $filename: $!\n";
            return;
        }
        while (<$input>) {      # note the use of indirection
            if (/^#include "(.*)"/) {
                process($1, $input);
                next;
            }
            ...                 # whatever
        }
        close $input;
    }
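The string increment that process relies on can be tried in isolation. This sketch assumes a scratch file created with File::Temp; the starting name 'fh09' is chosen only to show the carry into the next digit. Under "use strict" a string used as a filehandle name is a symbolic reference, so the sketch relaxes strict refs for that block.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/note.txt";
open NOTE, ">$path" or die "Can't write $path: $!";
print NOTE "included text\n";
close NOTE or die "Can't close $path: $!";

my $name = 'fh09';
$name++;                     # magical string increment: 'fh09' becomes 'fh10'

my $line;
{
    no strict 'refs';        # allow the string to name the filehandle
    open $name, "<$path" or die "Can't open $path: $!";
    $line = <$name>;         # indirect filehandle in the angle operator
    close $name;
}

print "$name read: $line";
```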
You may also, in the Bourne shell tradition, specify an EXPR beginning with >&, in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) which is to be duped and opened.[6] You may use & after >, >>, <, +>, +>>, and +<. The mode you specify should match the mode of the original filehandle. Here is a script that saves, redirects, and restores STDOUT and STDERR:
[6] The word "dup" is UNIX-speak for "duplicate". We're not really trying to dupe you. Trust us.
    #!/usr/bin/perl
    open SAVEOUT, ">&STDOUT";
    open SAVEERR, ">&STDERR";

    open STDOUT, ">foo.out" or die "Can't redirect stdout";
    open STDERR, ">&STDOUT" or die "Can't dup stdout";

    select STDERR; $| = 1;         # make unbuffered
    select STDOUT; $| = 1;         # make unbuffered

    print STDOUT "stdout 1\n";     # this propagates to
    print STDERR "stderr 1\n";     # subprocesses too

    close STDOUT;
    close STDERR;

    open STDOUT, ">&SAVEOUT";
    open STDERR, ">&SAVEERR";

    print STDOUT "stdout 2\n";
    print STDERR "stderr 2\n";
If you specify <&=N, where N is a number, then Perl will do an equivalent of C's fdopen(3) of that file descriptor; this is more parsimonious with file descriptors than the dup form described earlier. (On the other hand, it's more dangerous, since two filehandles may now be sharing the same file descriptor, and a close on one filehandle may prematurely close the other.) For example:
    open FILEHANDLE, "<&=$fd";
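The difference between the dup form and the fdopen form shows up in the descriptor numbers. The following sketch assumes a scratch file made with File::Temp; the handle names ORIG, DUPED, and ALIAS are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/shared.txt";
open ORIG, ">$path" or die "Can't write $path: $!";
print ORIG "some data\n";
close ORIG or die "Can't close $path: $!";

open ORIG, "<$path" or die "Can't read $path: $!";
my $fd = fileno ORIG;

# "<&" dups the descriptor: DUPED gets a new descriptor of its own.
open DUPED, "<&ORIG" or die "Can't dup: $!";

# "<&=" does the fdopen equivalent: ALIAS shares ORIG's descriptor.
open ALIAS, "<&=$fd" or die "Can't fdopen: $!";

my ($dup_fd, $alias_fd) = (fileno DUPED, fileno ALIAS);
print "original $fd, dup $dup_fd, alias $alias_fd\n";

close DUPED;   # safe: DUPED owns its descriptor
```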
If you open a pipe to or from the command "-" (that is, either |- or -|), then an implicit fork is done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) in either the parent or child to determine whether the open was successful.) The filehandle behaves normally for the parent, but input and output to that filehandle is piped from or to the STDOUT or STDIN of the child process. In the child process the filehandle isn't opened; I/O happens from or to the new STDIN or STDOUT. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when you are running setuid, and don't want to have to scan shell commands for metacharacters. The following pairs are equivalent:
    open FOO, "|tr '[a-z]' '[A-Z]'";
    open FOO, "|-" or exec 'tr', '[a-z]', '[A-Z]';

    open FOO, "cat -n file|";
    open FOO, "-|" or exec 'cat', '-n', 'file';
Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. On any operation which may do a fork, unflushed buffers remain unflushed in both processes, which means you may need to set $| on one or more filehandles to avoid duplicate output (and then do output to flush them).
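Here is a sketch of the forking open with the child written in Perl rather than exec'ing another program. It assumes a system where fork is available; the handle name KID and the message text are illustrative.

```perl
use strict;
use warnings;

$| = 1;   # unbuffer STDOUT so nothing pending is duplicated by the fork

my $pid = open KID, "-|";            # fork, with the child's STDOUT piped to KID
die "Can't fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: KID is not open here; ordinary STDOUT goes down the pipe.
    print "hello from the child\n";
    exit 0;
}

# Parent: read what the child printed.
my $line = <KID>;
close KID;                           # waits for the child
my $status = $?;                     # child's status, as set by close

print "parent read: $line";
print "child status: $status\n";
```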
Filehandles STDIN, STDOUT, and STDERR remain open following an exec. Other filehandles do not. (However, on systems supporting the fcntl function, you may modify the close-on-exec flag for a filehandle. See fcntl earlier in this chapter. See also the special $^F variable.)
Using the constructor from the FileHandle module, described in Chapter 7, you can generate anonymous filehandles which have the scope of whatever variables hold references to them, and automatically close whenever and however you leave that scope:
    use FileHandle;
    ...
    sub read_myfile_munged {
        my $ALL = shift;
        my $handle = new FileHandle;
        open $handle, "myfile" or die "myfile: $!";
        $first = <$handle> or return ();      # Automatically closed here.
        mung $first or die "mung failed";     # Or here.
        return $first, <$handle> if $ALL;     # Or here.
        $first;                               # Or here.
    }
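A self-contained variation on the same idea, for trying out: the routine below returns the first line of a file and lets the handle close itself when the lexical goes out of scope. The routine name first_line, the handle name SEED, and the scratch file are hypothetical, and File::Temp is assumed to be available.

```perl
use strict;
use warnings;
use FileHandle;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/myfile";
open SEED, ">$path" or die "Can't write $path: $!";
print SEED "line one\nline two\n";
close SEED or die "Can't close $path: $!";

sub first_line {
    my $file   = shift;
    my $handle = FileHandle->new;        # anonymous filehandle object
    open $handle, "<$file" or return;    # nothing to clean up on failure
    my $first = <$handle>;
    return $first;                       # $handle goes out of scope and closes
}

my $first = first_line($path);
print $first;
```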
In order to open a file with arbitrary weird characters in it, it's necessary to protect any leading and trailing whitespace, like this:
    $file =~ s#^(\s)#./$1#;
    open (FOO, "< $file\0");
But we've never actually seen anyone use that in a script...
If you want a real C open(2), then you should use the sysopen function. This is another way to protect your filenames from interpretation. For example:
    use FileHandle;
    sysopen HANDLE, $path, O_RDWR|O_CREAT|O_EXCL, 0700
        or die "sysopen $path: $!";
    HANDLE->autoflush(1);
    HANDLE->print("stuff $$\n");
    seek HANDLE, 0, 0;
    print "File contains: ", <HANDLE>;
See seek for some details about mixing reading and writing.
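To see sysopen leaving a filename uninterpreted, the sketch below creates a file whose name starts with a space and a > character, then shows O_EXCL refusing to create it a second time. It assumes a UNIX-like filesystem that permits such names, takes the O_ constants from the Fcntl module, and uses an illustrative handle name and permissions.

```perl
use strict;
use warnings;
use Fcntl;                         # O_WRONLY, O_CREAT, O_EXCL
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/ >odd name ";     # a plain open would misread this name

# sysopen takes the name literally; O_EXCL makes creation fail if it exists.
sysopen LOCK, $path, O_WRONLY|O_CREAT|O_EXCL, 0600
    or die "sysopen $path: $!";
print LOCK "pid $$\n";
close LOCK or die "Can't close $path: $!";

# A second exclusive create of the same name must fail with EEXIST.
my $second     = sysopen LOCK, $path, O_WRONLY|O_CREAT|O_EXCL, 0600;
my $was_eexist = $!{EEXIST} ? 1 : 0;

my $exists = -e $path ? 1 : 0;
print "file exists: ", ($exists ? "yes" : "no"), "\n";
print "second create: ", ($second ? "succeeded" : "refused"), "\n";
```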