    open(FILE, "< myfile") || die "Can't open `myfile'\n";
    while (<FILE>) {
        ...
    }

However, you may not have caught on to everything that can go inside
those angle brackets. They're not just for file handles anymore.

One special case is the file handle ARGV. When used in a loop like

    while (<ARGV>) {
        ...
    }

each element of the argument list, @ARGV, will be treated as a
filename. Perl will attempt to open each file in the list, read the
entire contents, and move on to the next file. You will get an error
message if a file cannot be opened, but the loop will continue until
all filenames are exhausted. If there are no arguments to the program
(i.e., if @ARGV is empty), then the loop above will get lines from the
standard input, as any good UNIX program should. By the way, since the
while (<ARGV>) idiom is so common, there is a shorthand notation, <>,
which means the same thing.
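
To make that behavior concrete, here is a rough approximation of what the <> loop does, written out by hand. It is only a sketch: the name each_input_line and its callback interface are made up for illustration, it assumes Perl 5 (lexical variables and three-argument open), and the real <> has extra conveniences that are not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "while (<>)": calls $callback->($line, $name)
# for every line of every named file, in order.
sub each_input_line {
    my ($callback, @files) = @_;
    @files = ('-') unless @files;        # no arguments: use standard input
    for my $name (@files) {
        my $fh;
        if ($name eq '-') {
            $fh = \*STDIN;
        } elsif (!open($fh, '<', $name)) {
            warn "Can't open $name: $!\n";   # complain, then keep going
            next;
        }
        while (my $line = <$fh>) {
            $callback->($line, $name);
        }
        close($fh) unless $name eq '-';
    }
}
```

A call such as each_input_line(sub { print $_[0] }, @ARGV) would then behave much like a bare-bones cat.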

Associated with the file handle ARGV is the scalar $ARGV, which
contains the name of the file currently open. We can use this to write
a simple-minded grep program:

    $pat = shift @ARGV;
    $many = @ARGV > 1;
    while (<>) {
        next unless /$pat/;
        print "$ARGV:" if $many;
        print;
    }

The program uses shift() to remove the first element, the pattern to
search for, from the argument list. All other arguments are treated as
file names. The name of the current file is printed before the
matching lines if more than one file name is given on the command line
(just like the UNIX grep program does).

The only rub with this whole business is that as each file is opened,
the special variable $., which gives the current line number that we
are on, is not reset. If this is a problem, employ the following
trickery:

    $oldname = '';
    while (<>) {
        if ($ARGV ne $oldname) {
            $lineno = 0;
            $oldname = $ARGV;
        }
        $lineno++;
        ...
    }

and use $lineno instead of the $. variable. You may ask yourself why
we are messing around with $lineno and not just doing the assignment
to $. instead. The reason is that it simply does not work: $. is only
reset on an explicit close().
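
Since an explicit close() does reset $., another way to get per-file line numbers is to close ARGV yourself at the end of each file. The following is a sketch of that approach rather than a drop-in replacement for the code above; the helper name numbered_lines is invented for the example, and it assumes Perl 5.

```perl
use strict;
use warnings;

# Hypothetical helper: read the named files with <> and return each line
# tagged as "file:line-number:text", numbering every file from 1.
sub numbered_lines {
    my @files = @_;
    local @ARGV = @files;       # <> takes its file list from @ARGV
    my @out;
    while (my $line = <>) {
        push @out, "$ARGV:$.:$line";
        # eof without parentheses tests the file just read; closing ARGV
        # at that point resets $. before the next file is opened.
        close(ARGV) if eof;
    }
    return @out;
}

print numbered_lines(@ARGV) if @ARGV;
```

Note that it is eof and not eof(): with the empty parentheses the test only becomes true at the end of the very last file.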

A scalar variable inside the angle brackets, as in <$file>, indicates
an indirect file handle. Perl attempts to read the next line from the
file handle whose name is the string value of $file. For example:

    open(FILE, "/etc/motd") || die "Can't open /etc/motd\n";
    $file = 'FILE';
    $line1 = <FILE>;     # first line of the file
    $line2 = <$file>;    # second line, read through the same handle

Why is this at all useful? First, it allows us to pass file handles to
subroutines in a reasonable fashion:

    open(FILE, "/etc/motd") || die "Can't open /etc/motd\n";
    &mysub('FILE');      # pass the handle's name as a string

    sub mysub {
        local($file) = @_;
        while (<$file>) {
            ...
        }
    }

Second, the string contained in the variable need not be a valid
identifier. It could even, for example, be the name of the file that
we are opening:

    for (0..8) {
        $file = "/var/adm/messages.$_";
        open($file, "$file") || die "Can't open $file\n";
    }

and then we could later do something like:

    &do_something_with("/var/adm/messages.0");

    sub do_something_with {
        local($file) = @_;
        while (<$file>) {
            ...
        }
    }

This can be a big win as far as readability goes, e.g., if you have
lots of open file handles running around in your program.

Third, we can build up lists and arrays of indirect file handles:

    @myfiles = 0..7;
    for (@myfiles) {
        open($myfiles[$_], "syslog.$_") || die "Can't open syslog.$_\n";
    }

Note that though we are able to use an array element in the open()
call in the above example, we can only use a scalar variable inside
angle brackets to denote an indirect file handle. Thus, we must first
copy a value out of @myfiles into a scalar before using it:

    $file = $myfiles[3];
    $line = <$file>;
    print "$line";
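
The same restriction applies when the array holds handles opened into lexical variables, a Perl 5 feature that this article's examples do not use. The sketch below assumes that style; first_lines is a hypothetical helper name. It copies each array element into a plain scalar before reading, as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: open every named file, keep the handles in an array,
# and return the first line of each file.
sub first_lines {
    my @names = @_;
    my @handles;
    for my $name (@names) {
        open(my $fh, '<', $name) or die "Can't open $name: $!\n";
        push @handles, $fh;
    }
    my @lines;
    for my $i (0 .. $#handles) {
        # <$handles[$i]> would not read from the handle, so copy the
        # element into a simple scalar first.
        my $fh = $handles[$i];
        my $line = <$fh>;
        push @lines, $line;
        close($fh);
    }
    return @lines;
}
```

In Perl 5 the readline() function is another option, since it accepts an arbitrary expression: readline($handles[$i]).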

Anything else inside the angle brackets is handed to the shell as a
filename glob. You can step through the matching files one at a time:

    while (<*.c>) {
        print "Checking out $_...\n";
        system("co -l $_");
    }

or you can slurp all the files into a list:

    chmod 0644, <*.c>;

However, don't think from the two examples above that the glob behaves
just like a file handle, because it doesn't. This example

    $file1 = <*.c>;
    print "$file1\n";
    $file2 = <*.c>;
    print "$file2\n";

prints the same filename twice (and spawns a subshell twice as well),
rather than printing the first and second matching filenames. Read on,
though, if you care to see some real tragedy.

One layer of variable interpolation will be done before the glob, but
you can't say <$glob> because that's an indirect file handle. You have
to throw curly braces around your variable name to force
interpolation:

    $glob = "*.c";
    @c_files = <${glob}>;

For those of you who weren't paying attention, we have just
illuminated one part of the seamy underbelly of Perl: a place where
$glob and ${glob} do not mean the same thing.
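
One way to keep that wart in a single place is to wrap the braces trick in a small subroutine. This is just a sketch assuming Perl 5; the name expand_pattern is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a shell-style pattern held in a variable.
sub expand_pattern {
    my ($pattern) = @_;
    # <$pattern> would mean "read a line from the handle named by $pattern";
    # the braces force a glob of the interpolated string instead.
    my @names = <${pattern}>;
    return @names;
}
```

Perl 5 also offers a glob() function, so glob($pattern) would be the less surprising spelling where it is available.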

Since Perl does an exec() to let the shell glob the files, rather than
relying on some built-in globbing function, it is almost always more
efficient (in terms of run-time, but perhaps not in terms of
readability or amount of code) to use the built-in directory
operators:

    opendir(DIR, ".") || die "Can't open directory `.'\n";
    @c_files = grep(/\.c$/, readdir(DIR));
    closedir(DIR);

Note that the glob will always return the file names in alphabetical
order while the above code won't (although you're always free to
sort() the list of files you get from the above method).
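
For completeness, here is roughly what the directory-operator version looks like once the sort() is added. It is a sketch assuming Perl 5, with a made-up helper name (c_files_in). It also skips names beginning with a dot, which the shell pattern *.c would not match but the regular expression alone would.

```perl
use strict;
use warnings;

# Hypothetical helper: list the .c files in a directory, sorted by name,
# without asking the shell to expand a glob.
sub c_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    # Keep names ending in .c, but drop dot files as the shell would.
    my @files = grep { /\.c$/ && !/^\./ } readdir($dh);
    closedir($dh);
    @files = sort @files;
    return @files;
}
```

Calling c_files_in(".") gives bare file names, just as readdir() does, so prepend the directory yourself if you look somewhere other than the current directory.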

The <> idiom is a useful one and should be part of every
Perl programmer's toolkit. Indirect file handles and shell globs are
used less frequently but often to good effect in improving your code's
clarity and readability. Indirect file handles in particular can also
be used to needlessly obfuscate your code. So remember, as you try to
cloud the minds of lesser mortals, that in this life one sometimes
needs to maintain one's own code.
Reproduced from ;login: Vol. 18 No. 5, October 1993.