Perl Design Patterns, Part 2
by Phil CrowAugust 07, 2003
This is the second in a series of articles which form one Perl programmer's response to the book, Design Patterns (also known as the Gang of Four book or simply as GoF, because four authors wrote it).
As I showed in
the first
article, Perl provides the best patterns in its core and many
others are modules which ship with Perl or are available from CPAN.
There I considered Iterator (foreach), Decorator (pipes and list filters),
Flyweight (Memoize.pm), and Singleton (bless an object in a BEGIN block).
People into patterns often talk about how knowing patterns makes describing designs easier. The parenthetical comments in the last sentence show how Perl takes this to new heights by including the patterns internally.
This article continues my treatment by considering patterns which rely on data containers and/or code references (which are also called callbacks). Before showing the patterns let me explain these terms.
|
Related Reading Learning Perl Objects, References, and Modules |
Data Containers
I use data containers to mean any reference that holds a data structure. Arrays and hashes are common data containers, but hashes of lists of hashes storing things are more interesting. Careful use of these structure containers can often eliminate the need for objects.
Here's a concrete example. Suppose I want a phone list. I might use a container like this:
my $phone_list = {
'phil' => [
{ type => 'home', number => '555-0001' },
{ type => 'pager', number => '555-1000' },
],
'frank' => [
{ type => 'cell', number => '555-9012' },
{ type => 'pager', number => '555-5678' },
{ type => 'home', number => '555-1234' },
],
};
This container is housed in a hash. Its keys are names; its values are phone numbers. The numbers are listed in the order the person would like them used. (Call Frank on his cell phone first, then try his pager. If all else fails, use his home phone.)
To use this structure I might do the following:
#!/usr/bin/perl
use strict; use warnings;
my $phone_list = {
'Phil' => [
{ type => 'home', number => '555-0001' },
{ type => 'pager', number => '555-1000' },
],
'Frank' => [
{ type => 'cell', number => '555-9012' },
{ type => 'pager', number => '555-5678' },
{ type => 'home', number => '555-1234' },
],
};
my $person = shift or die "usage: $0 person\n";
foreach my $number (@{$phone_list->{$person}}) {
print "$number->{type} $number->{number}\n";
}
Here, the user supplies the name of the person he or she wants to reach
as a command line argument which I store as $person. I then loop through
all the phone numbers for that person, printing the type and number.
Of course, in practice your data would live outside your script. The example just shows what one data container can hold.
If you need to use a structure made of data nodes, you can often avoid the need for a node object by using a data container instead. Object Oriented programming proponents would probably want me to make an object for each person. In that object they might even want me to store an object for each phone type in some cumbersome list container. My advice: don't give in to pedants. Even in Java, I can build a structure like the one above (though not as easily). Doing so is often wise. Objects work better in more complex situations.
What's a Code Reference?
A code reference is like any reference in Perl, but what it points to is a subroutine you can call. For instance, I could write:
my $doubler = sub { return 2 * $_[0]; };
Then later in my program I would call that routine as:
my $doubled = &$doubler(5); # $doubled is now 10
This example is contrived. But it lets you see the basic syntax of code
references. If you assign a sub to a variable, you receive a code reference
by the grace of Perl. To call the sub stored in the reference put an
& in front of the variable which stores it. This is like we do for other
references, as in this standard hash walker:
foreach my $key (keys %$hash_reference) { ... }
The & is the sigil (or funny character) for subroutines, just like @
and % are the sigils for arrays and hashes.
Many patterns in GoF and outside it can be implemented well in Perl with code references. Languages which don't provide code references are missing an important type.
Having explained these tools, I'm ready to show you some patterns which use them.
Strategy
When you want to select from a series of choices for how something should be done, you need a strategy scheme. For example, you might want to sort based on a comparison function. Each time you sort, you should be able to specify the order strategy.
Since Perl has code references, we can easily implement the strategy pattern without bloating our code base with a proliferation of classes whose sole purpose is to provide one function.
Here's an example with the built-in sort:
sort { lc($a) cmp lc($b) } @items
This sorts without regard to case. Notice how sort is receiving the
function directly in the call. Though we could do this for our own
functions, it is more common to take a reference to the function as
a required positional parameter.
Suppose, for example, that we want to list all files in the current directory, or any of its subdirectories, with some property. There are two pieces to this task: (1) Scan down the directory tree for all the entries, and (2) Test each file to see if it meets the criterion. Ideally we would like to separate these tasks so we can reuse them independently (for instance scanning a directory tree is more common than any particular criterion). We will make the criterion a strategy executed by the directory scanner.
#!/usr/bin/perl
use strict; use warnings;
my @files = find_files(\&is_hidden, ".");
local $" = "\n";
print "@files\n";
sub is_hidden {
my $file = shift;
$file =~ s!.*/!!;
return 0 if ($file =~ /^\./);
return 1;
}
sub find_files {
my $callback = shift;
my $path = shift;
my @retval;
push @retval, $path if &$callback($path);
return @retval unless (-d $path);
# identify children
opendir DIR, $path or return;
my @files = readdir DIR;
closedir DIR;
# visit each child
foreach my $file (@files) {
next if ($file =~ /^\.\.?$/); # skip . and ..
push @retval, find_files("$path/$file", $callback);
}
return @retval;
}
To understand this example, start with the initial call to find_files.
It passes two arguments. The first is a code reference. Note the syntax.
As I pointed out in the introduction, to let Perl know I mean a subroutine,
I put the & sigil in front of is_hidden. To make a reference to that
routine (instead of calling it immediately), I put a backslash in
front, just as I would to take any other kind of reference.
When I use the callback in find_files, $callback has the reference to
the code. To dereference it I put the & sigil in front of it.
The find_files subroutine takes a path where the search begins and a
code reference called $callback. At each invocation, it stores the path
in the return list, if callback returns true for that path. This allows
you to reuse find_files for many applications, changing only the callback
subroutine to change the outcome. This is the strategy pattern, but
without the hassle of subclassing the find_files abstract base class
and overriding the criterion method.
In find_files, I use recursion to descend the directory tree and its
subtrees. First, I call the callback to see if the current path should
go into the output. Then the real routine begins. What the callback
does makes no difference to this routine. Any true or false value is OK
with find_files.
The recursion stops if the file is not a directory. At that point the
list is immediately returned. (It could be empty or have the current path
in it, depending on the callback's return value.) Otherwise, all the files and
subdirectories in the current path are read into @files. Each of those
entries is scanned by the recursive call to find_files (unless the file
is . or .., which would create endless recursion). Whatever the recursive
call to find_files returns, it is pushed onto the end of the final output.
When all children have been visited, @result is returned to the caller.
The CPAN module File::Find robustly solves the problem approached quickly
in my example above. It relies on exactly this kind of function callback.
The Strategy Pattern uses a callback to perform a single task that varies from use to use. The next pattern uses a series of callbacks to implement the steps of an algorithm.


