Perl Design Patterns, Part 2
by Phil Crow | Pages: 1, 2, 3, 4

So, hashes are superior structures for simple to moderately complex data. To see how to build a hash structure, consider an example: visualizing an outline. For simplicity, I'll represent the outline purely through indentation (not with Roman or other numerals). Here's an example outline:


    Grocery Store
        Milk
        Juice
        Butcher
            Thin sliced ham
            Chuck roast
        Cheese
    Cleaners
    Home Center
        Door
        Lock
        Shims

This outline describes a theoretical shopping trip. I want to represent it internally in my program so I can play with it. (One of my favorite games is turning outlines into pictures; see below.)

Instead of a full-blown object, I'll use a little hash-based data container for each node in the tree. Each node will keep track of three things:

  1. Name
  2. Level
  3. Children (a list of other nodes)

To keep track of who is a child of whom, I'll use a stack of these nodes. The node on the top of the stack is usually the parent of the next line of input. To show my method, I'll intersperse comments with the script. At the bottom of this section the script appears in one piece.


    #!/usr/bin/perl
    use strict; use warnings;

These lines are always a good idea.


    my $root = {
        name     => "ROOT",
        level    => -1,
        children => [],
    };

This is the root node. It's a hash reference containing the three keys mentioned earlier. The root node is special. Since it isn't in the file, I give it an artificial name and a level that is lower than anyone else's. (In a moment, we will see that levels in the input will be zero or positive.) Initially the list of children is empty.


    my @stack;
    push @stack, $root;

The stack will keep track of the ancestry of each new node. For starters it needs the root node, which won't ever be popped, because it is an ancestor of all the nodes.


    while (<>) {
        /^(\s*)(.*)/;
        my $indentation = length $1;
        my $name        = $2;

To read the file, I chose a magic while. For each line there will be two parts: the indentation (the leading spaces) and the name (the rest of the line). The regular expression captures any leading whitespace into $1 and everything else (except the newline) into $2. The length of the indentation is the important part: the bigger it is, the more ancestors the node has. Lines starting at the margin have an indentation of 0 (which is why the ROOT has a level of -1).
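As an aside, here is a minimal sketch of what the regular expression captures. The three sample lines are made up for illustration and stand in for lines read by the magic while.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Made-up sample lines; each ends in a newline, just as lines read by <> would.
my @lines = ("Grocery Store\n", "    Milk\n", "        Thin sliced ham\n");

my @parsed;
for my $line (@lines) {
    # Same pattern as in the script: $1 is the leading whitespace,
    # $2 is the rest of the line up to (but not including) the newline.
    $line =~ /^(\s*)(.*)/;
    push @parsed, [ length($1), $2 ];
    printf "%2d '%s'\n", length($1), $2;
}
```

Because `.` does not match a newline by default, the name comes out without the line ending, and no chomp is needed.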


        while ($indentation <= $stack[-1]{level}) {
            pop @stack;
        }

This loop handles ancestry. It pops the stack until the node on top of the stack is the parent of the new node. Think of an example. When Home Center comes along, Cleaners and ROOT are on the stack. Home Center's level is 0 (it's at the margin), and so is Cleaners'. Thus, Cleaners is popped (since 0 <= 0). Then only ROOT remains, so popping stops (0 is not <= -1).
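To watch the popping in action, here is a small stand-alone sketch. It assumes the input has already been reduced to (name, level) pairs, uses a trimmed-down version of the outline with four spaces per level, and records each parent in a hypothetical %parent_of hash instead of building real nodes.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical pre-parsed input: (name, level) pairs from part of the outline.
my @input = (
    [ "Grocery Store", 0 ],
    [ "Milk",          4 ],
    [ "Cleaners",      0 ],
    [ "Home Center",   0 ],
    [ "Door",          4 ],
);

my @stack = ( { name => "ROOT", level => -1 } );
my %parent_of;    # name => name of its parent (for illustration only)

for my $item (@input) {
    my ($name, $level) = @$item;

    # Pop until the node on top of the stack is shallower than the new line.
    while ($level <= $stack[-1]{level}) {
        pop @stack;
    }

    $parent_of{$name} = $stack[-1]{name};
    print "$name: parent is $stack[-1]{name}\n";

    push @stack, { name => $name, level => $level };
}
```

ROOT is never popped, because no input line can have a level as low as -1.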


        my $node = {
            name     => $name,
            level    => $indentation,
            children => [],
        };

This builds a new node for the current line. Its name and level are set. We haven't seen any children yet, but I make room for them in an empty list.


        push @{$stack[-1]{children}}, $node;

This line adds the new node to its parent's list of children. Remember that the parent is sitting on top of the stack. The top of the stack is $stack[-1] or the last element in the array.


        push @stack, $node;
    }

This pushes the new node onto the stack, in case it has children. The closing brace ends the magic while loop. For simplicity, I chose to display the output with Data::Dumper:


    use Data::Dumper; print Dumper($root);

Running this shows the tree (sideways) on standard out.
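If the default dump is hard to read, Data::Dumper has a couple of package variables that tidy it up. The following is only a sketch on a tiny hand-built tree (the data is made up); the settings are optional and not part of the script above.

```perl
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper;

# Optional settings: a fixed indentation step and sorted hash keys
# make the dump easier to read and to compare between runs.
$Data::Dumper::Indent   = 1;
$Data::Dumper::Sortkeys = 1;

# A tiny hand-built tree (made-up data) in the script's node format.
my $root = {
    name     => "ROOT",
    level    => -1,
    children => [
        { name => "Milk", level => 0, children => [] },
    ],
};

my $dump = Dumper($root);
print $dump;
```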

Here's the whole code without interruption:


    #!/usr/bin/perl
    use strict; use warnings;

    my $root = {
        name     => "ROOT",
        level    => -1,
        children => [],
    };

    my @stack;
    push @stack, $root;

    while (<>) {
        /^(\s*)(.*)/;
        my $indentation = length $1;
        my $name        = $2;
        while ($indentation <= $stack[-1]{level}) {
            pop @stack;
        }
        my $node = {
            name     => $name,
            level    => $indentation,
            children => [],
        };
        push @{$stack[-1]{children}}, $node;
        push @stack, $node;
    }

    use Data::Dumper; print Dumper($root);
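Once the tree exists, playing with it usually means walking it recursively. As a minimal sketch, the hypothetical outline_lines helper below turns a tree back into an indented outline. The tree here is built by hand (a made-up subset of the shopping trip) so that the example stands alone.

```perl
#!/usr/bin/perl
use strict; use warnings;

# A small hand-built tree in the same shape the script produces.
my $root = {
    name => "ROOT", level => -1,
    children => [
        { name => "Grocery Store", level => 0, children => [
            { name => "Milk", level => 4, children => [] },
        ] },
        { name => "Cleaners", level => 0, children => [] },
    ],
};

# Hypothetical helper: returns the outline as a list of lines,
# visiting each node before its children (depth first).
sub outline_lines {
    my ($node, $depth) = @_;
    my @lines;
    for my $child (@{ $node->{children} }) {
        push @lines, ("    " x $depth) . $child->{name};
        push @lines, outline_lines($child, $depth + 1);
    }
    return @lines;
}

my @lines = outline_lines($root, 0);
print "$_\n" for @lines;
```

The walk starts with ROOT's children, so the artificial root never shows up in the output.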

I promised to explain how structures like the one above can be turned into pictures. The CPAN module UML::Sequence builds a structure similar to the one shown here. It then uses that to generate a UML Sequence diagram of the steps in SVG (Scalable Vector Graphics) format. That format can be converted with standard tools like Batik to PNG or JPEG. In practice the outlines which I turn into pictures represent call sequences for programs. Perl can even generate the outline by running the program. See UML::Sequence for more details.

When you have some interesting structured input, a builder might help make a good internal structure. One high-value builder is XML::DOM. Another with a slightly different approach is XML::Twig. It is not coincidental that XML parsers are really builders, as XML files are non-binary trees.

Interpreter

If you haven't looked in GoF yet, start with the interpreter pattern. Laughter is good for the soul. The person who taught me patterns in Java did not even know why this pattern would not work in practice. He had heard it was somewhat slow, but he wasn't sure. Well, I'm sure.

Luckily for us, Perl has alternatives. These range from quick and dirty to full-blown. Here's the litany, covered with examples below:

  • split
  • eval'ing Perl code
  • Config::Auto
  • Parse::RecDescent

Since we already have a language we like (that's Perl for those who haven't been paying attention), interpreting is limited to small languages that do something for us. Usually these turn out to be configuration files, so I will focus on those. (See the builder section above if a tree can represent your data file.)

Splitting

The easiest route involves split. Suppose I have a config file which uses variable=value settings. Comments and blank lines should be ignored; every other line should hold a variable/value pair. That's easy:


    sub parse_config {
        my $file = shift;
        my %answer;

        # Three-argument open with a lexical filehandle
        open my $fh, '<', $file
            or die "Couldn't read config file $file: $!\n";
        while (<$fh>) {
            next if (/^#|^\s*$/);  # skip blanks and comments
            chomp;                 # keep the newline out of the value
            my ($variable, $value) = split /=/, $_, 2;  # first '=' only
            $answer{$variable} = $value;
        }
        close $fh;

        return %answer;
    }

This subroutine expects a config file name. It opens and reads that file. Inside the magic while loop the regex rejects lines which start with '#' and those which contain only whitespace. All other lines are split on '='. The variables become keys in the %answer hash. When all the lines are read, the caller gets the hash back.

You could go much further along these lines, but see below for those who've gone before you (see especially Config::Auto).
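To give a taste of going a little further, here is a hedged sketch that also trims whitespace around the variable and the value and tolerates indented comments. The parse_config_lines name and the sample settings are hypothetical. It takes lines rather than a file name so it can be tried without a file on disk.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical variant of the parser that works on a list of lines.
sub parse_config_lines {
    my @lines = @_;
    my %answer;

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*#|^\s*$/;         # skip comments and blanks
        # A limit of 2 keeps any '=' inside the value intact.
        my ($variable, $value) = split /=/, $line, 2;
        next unless defined $value;             # ignore lines with no '='
        s/^\s+|\s+$//g for $variable, $value;   # trim surrounding whitespace
        $answer{$variable} = $value;
    }
    return %answer;
}

my %config = parse_config_lines(
    "# database settings\n",
    "\n",
    "db_name = projectdb\n",
    "query=a=b\n",
);
print "$_ => $config{$_}\n" for sort keys %config;
```

Every rule added this way is one more thing to document and test, which is a good reason to reach for an existing module once the format grows.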

Evaluating Perl Code

My current favorite way to bring configuration information into a Perl program is to specify the config file in Perl. So, I might have a config file like this:


    our $db_name = "projectdb";
    our $db_pass = "my_special_password_no_one_will_think_of";
    our %personal = (
        name    => "Phil Crow",
        address => "philcrow2000@yahoo.com",
    );

To use this in a Perl program all I have to do is eval it:


    ...
    open my $fh, '<', "config.txt" or die "couldn't...\n";
    my $config = join "", <$fh>;
    close $fh;

    eval $config;
    die "Couldn't eval your config: $@\n" if $@;
    ...

To read the file, I open it, then use join to put the angle read operator in list context. This lets me bring the whole file into a scalar. Once it's in (and the file is closed for tidiness), I just eval the string I read. I need to check $@ to make sure the file was good Perl. After that, I'm ready to use the values just as if they appeared in the program originally.
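One detail worth spelling out: under use strict, the program must be allowed to name the variables that the config sets. The sketch below assumes the simplest way of doing that, declaring them with our in the program. A here-document with made-up values stands in for the config file so the example runs on its own.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Declare the package variables the config is expected to set, so that
# 'use strict' lets the program refer to them by their short names.
our ($db_name, %personal);

# Stand-in for the contents of the config file (values are made up).
my $config = <<'END_CONFIG';
our $db_name = "projectdb";
our %personal = (
    name => "Phil Crow",
);
END_CONFIG

eval $config;
die "Couldn't eval your config: $@\n" if $@;

print "database: $db_name\n";
print "owner:    $personal{name}\n";
```

Keep in mind that eval runs whatever Perl the file contains, so this approach only suits config files you trust as much as the program itself.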
