Exegesis 6
by Damian Conway

Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.

A Parting of the ... err ... Parts

That works okay, but it's not perfect. In fact, as presented above, the &part subroutine is now both an ugly solution and an inefficient one.

It's ugly because &part is now twice as long as it was before. The two branches of control-flow within it are similar in form but quite different in function. One partitions the data according to the contents of a datum; the other, according to a datum's position in @data.

It's inefficient because it effectively tests the type of the selector argument twice: once (implicitly) when it's first bound to the $is_sheep parameter, and then again (explicitly) in the call to .isa.

It would be cleaner and more maintainable to break these two nearly unrelated behaviours out into separate subroutines. And it would be more efficient if we could select between those two subroutines by testing the type of the selector only once.

Of course, in Perl 6 we can do just that — with a multisub.

What's a multisub? It's a collection of related subroutines (known as "variants"), all of which have the same name but different parameter lists. When the multisub is called and passed a list of arguments, Perl 6 examines the types of the arguments, finds the variant with the same name and the most compatible parameter list, and calls that variant.

By the way, you might be more familiar with the term multimethod. A multisub is a multiply dispatched subroutine, in the same way that a multimethod is a multiply dispatched method. There'll be much more about those in Exegesis 12.

Multisubs provide facilities somewhat akin to function overloading in C++. We set up several subroutines with the same logical name (because they implement the same logical action). But each takes a distinct set of argument types and does the appropriate things with those particular arguments.

However, multisubs are more "intelligent" than mere overloaded subroutines. With overloaded subroutines, the compiler examines the compile-time types of the subroutine's arguments and hard-codes a call to the appropriate variant based on that information. With multisubs, the compiler takes no part in the variant selection process. Instead, the interpreter decides which variant to invoke at the time the call is actually made. It does that by examining the run-time type of each argument, making use of its inheritance relationships to resolve any ambiguities.

To see why a run-time decision is better, consider the following code:

class Lion is Cat {...}    # Lion inherits from Cat

multi sub feed(Cat  $c) { pat $c; my $glop = open 'Can'; spoon_out($glop); }
multi sub feed(Lion $l) { $l.stalk($prey) and kill; }

my Cat $fluffy = Lion.new;

feed($fluffy);

In Perl 6, the call to feed will correctly invoke the second variant because the interpreter knows that $fluffy actually contains a reference to a Lion object at the time the call is made (even though the nominal type of the variable is Cat).

If Perl 6 multisubs worked like C++'s function overloading, the call to feed($fluffy) would invoke the first version of feed, because all that the compiler knows for sure at compile-time is that $fluffy is declared to store Cat objects. That's precisely why Perl 6 doesn't do it that way. We prefer to leave the hand-feeding of lions to other languages.
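For readers who want to try the idea in a language that runs today, here is a rough Python analogy (not Perl 6, and not how Perl 6 implements multisubs). It assumes the standard library's functools.singledispatch, which picks a variant from the run-time class of the first argument only. The Cat and Lion classes and the returned strings are illustrative stand-ins for the example above.

```python
from functools import singledispatch


class Cat:
    pass


class Lion(Cat):  # Lion inherits from Cat
    pass


@singledispatch
def feed(animal):
    # Fallback when no registered variant matches the argument's class.
    raise TypeError(f"don't know how to feed a {type(animal).__name__}")


@feed.register
def _(animal: Cat):
    return "spoon out a can of glop"


@feed.register
def _(animal: Lion):
    return "let it stalk its own prey"


# The annotation says Cat, but the object really is a Lion.
fluffy: Cat = Lion()
meal = feed(fluffy)  # the Lion variant is chosen at call time
```

The annotation on fluffy plays the role of the nominal type of the variable. The dispatcher ignores it and looks only at the class of the object it is handed, which is the behaviour the lion example relies on.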

Many Parts

As the above example shows, in Perl 6, multisub variants are defined by prefixing the sub keyword with another keyword: multi. The parameters that the interpreter is going to consider when deciding which variant to call are specified to the left of a colon (:), with any other parameters specified to the right. If there is no colon in the parameter list (as above), all the parameters are considered when deciding which variant to invoke.

We could re-factor the most recent version of &part like so:

type Selector ::= Code | Class | Rule | Hash;

multi sub part (Selector $is_sheep:
                Str +@labels is dim(2) = <<sheep goats>>,
                *@data
               ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}

multi sub part (Int @sheep_indices:
                Str +@labels is dim(2) = <<sheep goats>>,
                *@data
               ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data.kv -> $index, $value {
        if $index == any(@sheep_indices) { push %herd{$sheep}, $value }
        else                             { push %herd{$goats}, $value }
    }
    return *%herd;
}

Here we create two variants of a single multisub named &part. The first variant will be invoked whenever &part is called with a Selector object as its first argument (that is, when it is passed a Code or Class or Rule or Hash object as its selector).

The second variant will be invoked only if the first argument is an Array of Int. If the first argument is neither a Selector nor an Array of Int, no variant matches and an exception will be thrown.

Notice how similar the body of the first variant is to the earlier subroutine versions. Likewise, the body of the second variant is almost identical to the if branch of the previous (subroutine) version.

Notice too how the body of each variant only has to deal with the particular type of selector that its first parameter specifies. That's because the interpreter has already determined what type of thing the first argument was when deciding which variant to call. A particular variant will only ever be called if the first argument is compatible with that variant's first parameter.
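The same split into two single-purpose variants can be sketched in Python. This is an analogy under stated assumptions, not a translation: functools.singledispatch only ever considers the first argument (much like putting only the selector to the left of the colon), a plain callable stands in for the Selector type, a list stands in for the Array of Int, and a dict is returned instead of a List of Pair.

```python
from collections.abc import Callable
from functools import singledispatch


@singledispatch
def part(selector, *data, labels=("sheep", "goats")):
    # No variant matched the selector's type.
    raise TypeError(f"no variant of part() accepts a {type(selector).__name__}")


@part.register(Callable)
def _(is_sheep, *data, labels=("sheep", "goats")):
    # Variant 1: partition by the contents of each datum.
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return herd


@part.register(list)
def _(sheep_indices, *data, labels=("sheep", "goats")):
    # Variant 2: partition by each datum's position in the data.
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for index, value in enumerate(data):
        herd[sheep if index in sheep_indices else goats].append(value)
    return herd


by_content = part(lambda n: n % 2 == 0, 1, 2, 3, 4)
by_position = part([0, 2], "a", "b", "c", labels=("keep", "drop"))
```

As in the Perl 6 version, neither body has to ask what kind of selector it was given. By the time a body runs, that question has already been answered once, by the dispatcher.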

Call Me Early

Suppose we wanted more control over the default labels that &part uses for its return values. For example, suppose we wanted to be able to prompt the user for the appropriate defaults — before the program runs.

The default value for an optional parameter can be any valid Perl expression whose result is compatible with the type of the parameter. We could simply write:

my Str @def_labels;

BEGIN {
    print "Enter 2 default labels: ";
    @def_labels = split(/\s+/, <>, 3).[0..1];
}

sub part (Selector $is_sheep,
          Str +@labels is dim(2) = @def_labels,
          *@data
         ) returns List of Pair
{
    # body as before
}

We first define an array variable:

my Str @def_labels;

This will ultimately serve as the expression that the @labels parameter uses as its default:

Str +@labels is dim(2) = @def_labels

Then we merely need a BEGIN block (so that it runs before the program starts) in which we prompt for the required information:

print "Enter 2 default labels: ";

read it in:

<>

split the input line into three pieces using whitespace as a separator:

split(/\s+/, <>, 3)

grab the first two of those pieces:

split(/\s+/, <>, 3).[0..1]

and assign them to @def_labels:

@def_labels = split(/\s+/, <>, 3).[0..1];

We're now guaranteed that @def_labels has the necessary default labels before &part is ever called.
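To see why the line is split into three pieces when only two are wanted, here is a small Python sketch of the same steps. The function name first_two_labels and the sample input are made up for illustration, and Python's str.split treats leading and trailing whitespace in its own way, so take this as an approximation of the Perl expression rather than an exact equivalent.

```python
def first_two_labels(line: str) -> list[str]:
    # Split on runs of whitespace into at most three pieces
    # (maxsplit=2 means "make at most two cuts").
    pieces = line.split(maxsplit=2)
    # Keep the first two pieces; a third piece, if any, holds the rest
    # of the line and is discarded.
    return pieces[:2]


labels = first_two_labels("wool mohair and anything else\n")
```

Had the line been cut into only two pieces, the second label would have swallowed everything after the first word. Asking for a third piece gives the leftovers somewhere to go.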

Core Breach

Built-ins like &split can also be given named arguments in Perl 6, so, alternatively, we could write the BEGIN block like so:

BEGIN {
    print "Enter 2 default labels: ";
    @def_labels = split(str=><>, max=>3).[0..1];
}

Here we're leaving out the split pattern entirely and making use of &split's default split-on-whitespace behaviour.

Incidentally, an important goal of Perl 6 is to make the language powerful enough to natively implement all its own built-ins. We won't actually implement it that way, since screamingly fast performance is another goal, but we do want to make it easy for anyone to create their own versions of any Perl built-in or control structure.

So, for example, &split would be declared like this:

sub split( Rule|Str ?$sep = /\s+/,
           Str ?$str = $CALLER::_,
           Int ?$max = Inf
          )
{
    # implementation here
}

Note first that every one of &split's parameters is optional, and that the defaults are the same as in Perl 5. If we omit the separator pattern, the default separator is whitespace; if we omit the string to be split, &split splits the caller's $_ variable; if we omit the "maximum number of pieces to return" argument, there is no upper limit on the number of splits that may be made.
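The "every parameter is optional, and any of them can be passed by name" design is easy to mimic in Python. The sketch below is hypothetical in several ways: it wraps re.split, it calls the parameters sep, string and max_pieces, it uses an empty string as a stand-in default because Python has no counterpart to the caller's $_ here, and it treats any limit of one or less as "don't split at all", which is a simplification of this sketch rather than a claim about Perl.

```python
import math
import re


def split(sep=r"\s+", string="", max_pieces=math.inf):
    # Simplification for this sketch: a limit of one piece (or fewer)
    # returns the string unsplit.
    if max_pieces <= 1:
        return [string]
    # re.split counts cuts, not pieces, and uses 0 to mean "no limit".
    maxsplit = 0 if math.isinf(max_pieces) else int(max_pieces) - 1
    return re.split(sep, string, maxsplit=maxsplit)


everything = split(string="a b  c")            # separator and limit defaulted
limited = split(string="a b  c", max_pieces=2)  # named arguments, no separator
explicit = split(",", "a,b,c")                  # positional arguments
```

The second call is the shape used in the BEGIN block above: the separator is left out, and the remaining arguments are identified by name rather than by position.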

Note that we can't just declare the second parameter like so:

Str ?$str = $_,

That's because, in Perl 6, the $_ variable is lexical (not global), so a subroutine doesn't have direct access to the $_ of its caller. That means that Perl 6 needs a special way to access a caller's $_.

That special way is via the CALLER:: namespace. Writing $CALLER::_ gives us access to the $_ of whatever scope called the current subroutine. This works for other variables too ($CALLER::foo, @CALLER::bar, etc.) but is rarely useful, since we're only allowed to use CALLER:: to access variables that already exist, and $_ is about the only variable that a subroutine can rely upon to be present in any scope it might be called from.
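Python has no CALLER:: namespace, but the flavour of it can be imitated by peeking at the calling frame. The sketch below is only an analogy: it relies on inspect.currentframe (which may return None on interpreters without frame support), and the names shout, greet, no_topic and topic are invented for the example.

```python
import inspect


def shout():
    # Look at the local variables of whoever called us.
    frame = inspect.currentframe()
    if frame is None or frame.f_back is None:
        raise RuntimeError("no calling frame available")
    caller_locals = frame.f_back.f_locals
    # Like CALLER::, this can only reach a variable that already exists.
    if "topic" not in caller_locals:
        raise NameError("caller has no variable named 'topic'")
    return caller_locals["topic"].upper()


def greet():
    topic = "hello"
    return shout()  # shout() reads greet()'s local `topic`


def no_topic():
    return shout()  # fails: this scope has no `topic`


greeting = greet()
```

The failure in no_topic shows why this kind of access is rarely useful: a subroutine can only depend on a caller's variable if it can count on every caller having one.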
