After a brief hiatus, welcome back to the second year of Perl Practicum. This month, Rob asked me to write a little piece about writing readable Perl. I started out with every intention of doing just that, but the article evolved instead into a piece on writing maintainable Perl. This dismayed me for a bit, until I realized that maintainability was the primary driving force for clarity. If, having been written, one's code never had to be looked at again, the motivation for writing clear code would be greatly reduced (note: not eradicated). That bit of philosophy done with, let us proceed with what amounts to little more than a collection of useful tips I have collected through the years.
    grep($array{$_}++, @list);    # BAD!

    for (@list) {                 # OK
        $array{$_} = 1;
    }

I claim the second form is to be preferred since it makes clear what is going on: we are iterating over @list and setting values in %array to be non-zero (the assignment, as opposed to the auto-increment operator, is significant). There are several other, more verbose, ways to rewrite the above "for" loop: decide which form you are most comfortable with, but avoid grep() as a loop operator. I generally prefer to optimize for readability over performance.
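A minimal sketch of the distinction, written for a current Perl 5 with strict and warnings enabled. The list contents and the names %seen and @long are made up for illustration. The loop builds the lookup table explicitly, while grep() is kept for selecting elements:

```perl
use strict;
use warnings;

# Hypothetical input list, with one duplicate.
my @list = qw(apple pear apple fig);

# Explicit loop: mark every element of @list as seen.
my %seen;
for my $item (@list) {
    $seen{$item} = 1;
}

# grep() used as a filter: keep only the names longer than three characters.
my @long = grep { length($_) > 3 } @list;

print join(' ', sort keys %seen), "\n";   # apple fig pear
print join(' ', @long), "\n";             # apple pear apple
```

Here the return value of grep() is actually used, so the reader can tell at a glance that a selection is taking place and not a disguised loop.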
As another example, consider the following two infinite loop constructs:
    while (1) {      # BAD!
        ...
    }

    for ( ;; ) {     # OK
        ...
    }

The second form tends to be more visually arresting; it alerts the reader that something important is happening.
As long as we are on the subject of loops, let us examine another rule for clear communication, "say it succinctly." The goto statement and multi-level break commands are both to be abhorred because they hamper the reader's ability to conceptualize the program flow at a glance. I went looking for some "before and after" examples of these constructs and did not find any that would easily fit the space boundaries imposed upon this series. Enough said, I think.
Consider using until instead of while and unless instead of if:

    &usage() unless (@ARGV);

    until ($value > $LIMIT) {
        ...
    }

Avoiding extra negation in conditional expressions can be a great aid to clarity. Perl can read like clear prose if you are careful and use informative symbolic names.
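To make the point about negation concrete, here is a small sketch that writes the same loop both ways. The threshold $LIMIT and the doubling step are assumptions chosen only so that the loop terminates quickly:

```perl
use strict;
use warnings;

my $LIMIT = 100;   # hypothetical threshold
my $value = 1;
my $steps = 0;

# Reads as "keep doubling until the value exceeds the limit".
until ($value > $LIMIT) {
    $value *= 2;
    $steps++;
}

# The equivalent while form needs an explicit negation.
my $check = 1;
while (!($check > $LIMIT)) {
    $check *= 2;
}

print "$value after $steps steps\n";   # 128 after 7 steps
```

Both loops compute the same result; the until form simply states the exit condition directly.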
With the postfix conditional operators, be careful to put the most important part of the statement up front. This is why we write:
    open(...) || die ... ;       # recommended

rather than

    die ... unless open(...);    # EVIL!

The purpose of the statement is to associate a file handle with a file or process. The die() operation is merely a case of exception handling.
Similarly, avoid overloading conditional expressions with operations which actually manipulate program data or have other side effects. Evaluate an expression to take a logical branch in the program flow and then perform your operations.
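One way to read this advice is sketched below. The work queue and the length test are hypothetical; the point is only that the data manipulation (the shift) happens on its own line, and the conditional contains nothing but a test:

```perl
use strict;
use warnings;

my @queue = qw(alpha beta gamma);   # hypothetical work queue
my @done;

# Overloaded form, shown only as a comment: the test also consumes data.
#   if (($job = shift @queue) && length($job) > 4) { ... }

# Separated form: manipulate the data first, then branch on a plain test.
while (@queue) {
    my $job = shift @queue;
    if (length($job) > 4) {
        push @done, $job;
    }
}

print join(',', @done), "\n";   # alpha,gamma
```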
    print (1+2)*3, "\n";     # INCORRECT!

This prints the value of the expression in parentheses, i.e., the number three without a newline. The statement is syntactically correct (points to you if you figure out exactly what happens in the rest of the line) and the Perl interpreter will not complain, but the output is wildly different from:

    print((1+2)*3, "\n");    # CORRECT

which is probably what the author of the code intended.
If you only need a few scattered values out of a list value returned by a function, please avoid assignment to dummy variables. In other words, do:
    ($login, $name, $home) = (getpwent)[0,6,7];    # GOOD

rather than:

    ($login, $dummy, $dummy, $dummy, $dummy, $dummy, $name, $home) = getpwent;    # BAD

Aside from wasted typing, the second form obscures precisely which information you are interested in manipulating.
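The same list-slice technique works with any function that returns a list. As a small sketch, gmtime() returns nine values, of which only three are wanted here. The epoch value 0 is used purely so that the result is predictable:

```perl
use strict;
use warnings;

# gmtime() returns (sec, min, hour, mday, mon, year, wday, yday, isdst).
# Slice out only the day, month, and year; no dummy variables needed.
my ($mday, $mon, $year) = (gmtime(0))[3, 4, 5];

# mon is zero-based and year counts from 1900.
printf "%04d-%02d-%02d\n", $year + 1900, $mon + 1, $mday;   # 1970-01-01
```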
Function defaults are a trickier issue. You can pretty well assume that any Perl function will operate on $_ or @_ when given no arguments. This is a nice feature and I use it all the time (too convenient to give up, I suppose). It does, however, make Perl code less than clear to the uninitiated reader, and I have had occasions where something unexpected has cropped up because $_ did not contain what I thought it did. On a more trivial issue, I would like to make a plea for explicitly using the "<" character when opening a file for reading, even though this is the default behavior for open().
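A sketch of both suggestions together follows. It assumes a current Perl 5, where the read mode is passed to open() as a separate "<" argument. The scratch file created with File::Temp and the names $in and $line are there only to make the example self-contained:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the example can run anywhere.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "first\nsecond\n";
close($tmp) || die "cannot close $path: $!";

# Explicit "<" mode, and an explicit loop variable instead of $_.
open(my $in, '<', $path) || die "cannot open $path: $!";
my @lines;
while (my $line = <$in>) {
    chomp($line);
    push @lines, $line;
}
close($in);

print join(' ', @lines), "\n";   # first second
```

Naming the loop variable costs a few keystrokes, but nothing called inside the loop can quietly change the line being processed.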
Never hard-code pathnames or other constants into your program. Assign
these values to variables AT THE TOP of your program. For example,
here are the first few lines of an application I wrote to manipulate a
remote optical jukebox:
    #!/usr/bin/perl

    $jukehost     = 'gator';
    $nfsjukedir   = '/rd/juke';
    $realjukedir  = '/export/jb/jb0';
    $localjukedir = '/jukebox';
    $remotecmd    = '/usr/local/etc/jbadm';

When the code is written in this fashion, maintenance becomes a breeze.
Always explicitly close file and directory handles as soon as you finish processing the data. You avoid potential resource shortages (such as running out of file descriptors), protect your code from interesting side effects caused by later modification, and make your code clearer to the hypothetical external viewer.
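As a minimal sketch of closing a handle as soon as the data has been read, the example below lists the current directory. Reading "." and filtering out dot-files are assumptions made only for illustration:

```perl
use strict;
use warnings;

# Read the whole listing in one step, then release the handle at once.
opendir(my $dh, '.') || die "cannot open current directory: $!";
my @entries = readdir($dh);
closedir($dh);

# All further processing works on the list, not on the handle.
my @visible = grep { !/^\./ } @entries;
print scalar(@visible), " visible entries\n";
```

Because the handle is closed before any processing begins, code added later to the processing step cannot accidentally read from it or leave it open.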
To avoid this morass (for example, everybody I know hates my bracing style), I suggest only one simple rule. Pick a site standard that everybody can live with and stick to it. Even a bad standard is better than no standard at all. If you are forced to maintain code that is developed and used externally to your organization, then maintain whatever conventions pertain to the code as you received it.
For a good starting point, there is a document available on the Internet entitled Recommended C Style and Coding Standards (originally from a document prepared by committee at Bell Labs, but modified by Henry Spencer, David Keppel, and Mark Brader). Obtain /pub/cstyle.tar.Z from ftp.cs.washington.edu.
Reproduced from ;login: Vol. 19 No. 6, December 1994.
11/27/96ah