Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
A Parting of the ... err ... Parts
That works okay, but it's not perfect. In fact, as it's presented above, the
&part subroutine is now both an ugly solution and an
inefficient one.
It's ugly because &part is now twice as long as it was
before. The two branches of control-flow within it are similar in form but
quite different in function. One partitions the data according to the
contents of a datum; the other, according to a datum's
position in @data.
It's inefficient because it effectively tests the type of the selector
argument twice: once (implicitly) when it's first bound to the
$is_sheep parameter, and then again (explicitly) in the call to
.isa.
It would be cleaner and more maintainable to break these two nearly unrelated behaviours out into separate subroutines. And it would be more efficient if we could select between those two subroutines by testing the type of the selector only once.
Of course, in Perl 6 we can do just that — with a multisub.
What's a multisub? It's a collection of related subroutines (known as "variants"), all of which have the same name but different parameter lists. When the multisub is called and passed a list of arguments, Perl 6 examines the types of the arguments, finds the variant with the same name and the most compatible parameter list, and calls that variant.
By the way, you might be more familiar with the term multimethod. A multisub is a multiply dispatched subroutine, in the same way that a multimethod is a multiply dispatched method. There'll be much more about those in Exegesis 12.
Multisubs provide facilities somewhat akin to function overloading in C++. We set up several subroutines with the same logical name (because they implement the same logical action). But each takes a distinct set of argument types and does the appropriate things with those particular arguments.
However, multisubs are more "intelligent" than mere overloaded subroutines. With overloaded subroutines, the compiler examines the compile-time types of the subroutine's arguments and hard-codes a call to the appropriate variant based on that information. With multisubs, the compiler takes no part in the variant selection process. Instead, the interpreter decides which variant to invoke at the time the call is actually made. It does that by examining the run-time type of each argument, making use of its inheritance relationships to resolve any ambiguities.
To see why a run-time decision is better, consider the following code:
class Lion is Cat {...} # Lion inherits from Cat
multi sub feed(Cat $c) { pat $c; my $glop = open 'Can'; spoon_out($glop); }
multi sub feed(Lion $l) { $l.stalk($prey) and kill; }
my Cat $fluffy = Lion.new;
feed($fluffy);
In Perl 6, the call to feed will correctly invoke the second
variant because the interpreter knows that $fluffy actually
contains a reference to a Lion object at the time the call is made
(even though the nominal type of the variable is Cat).
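For readers who want something runnable today, the same run-time behaviour can be approximated in Python. This is only an analogy, not Perl 6: functools.singledispatch selects a variant from the run-time class of the first argument alone, and the returned strings are hypothetical stand-ins for the bodies of the two feed variants.

```python
from functools import singledispatch


class Cat:
    pass


class Lion(Cat):  # Lion inherits from Cat
    pass


@singledispatch
def feed(animal):
    # Fallback when no registered variant matches the argument's class.
    raise TypeError(f"don't know how to feed a {type(animal).__name__}")


@feed.register
def _(animal: Cat):
    return "spoon out a can"


@feed.register
def _(animal: Lion):
    return "stalk the prey"


# The annotation says Cat, but the object stored is a Lion.
fluffy: Cat = Lion()
meal = feed(fluffy)  # the Lion variant is chosen at call time
```

The variable's declared type plays no part in the choice; only the class of the object actually passed does.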
If Perl 6 multisubs worked like C++'s function overloading, the call to
feed($fluffy) would invoke the first version of
feed, because all that the compiler knows for sure at compile-time
is that $fluffy is declared to store Cat objects.
That's precisely why Perl 6 doesn't do it that way. We prefer to leave the
hand-feeding of lions to other languages.
Many Parts
As the above example shows, in Perl 6, multisub variants are defined by
prefixing the sub keyword with another keyword:
multi. The parameters that the interpreter is going to consider
when deciding which variant to call are specified to the left of a colon
(:), with any other parameters specified to the right. If there is
no colon in the parameter list (as above), all the parameters are
considered when deciding which variant to invoke.
We could re-factor the most recent version of &part like
so:
type Selector ::= Code | Class | Rule | Hash;
multi sub part (Selector $is_sheep:
                Str +@labels is dim(2) = <<sheep goats>>,
                *@data
               ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}
multi sub part (Int @sheep_indices:
                Str +@labels is dim(2) = <<sheep goats>>,
                *@data
               ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data.kv -> $index, $value {
        if $index == any(@sheep_indices) { push %herd{$sheep}, $value }
        else                             { push %herd{$goats}, $value }
    }
    return *%herd;
}
Here we create two variants of a single multisub named
&part. The first variant will be invoked whenever
&part is called with a Selector object as its
first argument (that is, when it is passed a Code or
Class or Rule or Hash object as its
selector).
The second variant will be invoked only if the first argument is an
Array of Int. If the first argument is anything else, an exception
will be thrown.
Notice how similar the body of the first variant is to the earlier
subroutine versions. Likewise, the body of the second variant is almost
identical to the if branch of the previous (subroutine)
version.
Notice too how the body of each variant only has to deal with the particular type of selector that its first parameter specifies. That's because the interpreter has already determined what type of thing the first argument was when deciding which variant to call. A particular variant will only ever be called if the first argument is compatible with that variant's first parameter.
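The split into two variants can be sketched in runnable form with Python's functools.singledispatch. This is an analogy under stated assumptions rather than a translation: only callables stand in for the Selector type, a plain list stands in for the array of Int (the element types are not checked), and a dictionary is returned instead of a list of pairs.

```python
from collections.abc import Callable
from functools import singledispatch


@singledispatch
def part(selector, *data, labels=("sheep", "goats")):
    # No variant matches the selector's type, so raise an exception.
    raise TypeError(f"no variant of part() accepts {type(selector).__name__}")


@part.register(Callable)
def _(is_sheep, *data, labels=("sheep", "goats")):
    # Variant 1: partition according to the contents of each datum.
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for value in data:
        herd[sheep if is_sheep(value) else goats].append(value)
    return herd


@part.register(list)
def _(sheep_indices, *data, labels=("sheep", "goats")):
    # Variant 2: partition according to each datum's position in data.
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for index, value in enumerate(data):
        herd[sheep if index in sheep_indices else goats].append(value)
    return herd


by_content = part(lambda n: n % 2 == 0, 1, 2, 3, 4)
by_position = part([0, 2], "a", "b", "c", labels=("wool", "hair"))
```

As in the multisub, neither body has to test what kind of selector it was given; that decision has already been made by the time the body runs.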
Call Me Early
Suppose we wanted more control over the default labels that
&part uses for its return values. For example, suppose we
wanted to be able to prompt the user for the appropriate defaults —
before the program runs.
The default value for an optional parameter can be any valid Perl expression whose result is compatible with the type of the parameter. We could simply write:
my Str @def_labels;
BEGIN {
    print "Enter 2 default labels: ";
    @def_labels = split(/\s+/, <>, 3).[0..1];
}
sub part (Selector $is_sheep,
          Str +@labels is dim(2) = @def_labels,
          *@data
         ) returns List of Pair
{
    # body as before
}
We first define an array variable:
my Str @def_labels;
This will ultimately serve as the expression that the @labels
parameter uses as its default:
Str +@labels is dim(2) = @def_labels
Then we merely need a BEGIN block (so that it runs before the
program starts) in which we prompt for the required information:
print "Enter 2 default labels: ";
read it in:
<>
split the input line into three pieces using whitespace as a separator:
split(/\s+/, <>, 3)
grab the first two of those pieces:
split(/\s+/, <>, 3).[0..1]
and assign them to @def_labels:
@def_labels = split(/\s+/, <>, 3).[0..1];
We're now guaranteed that @def_labels has the necessary default
labels before &part is ever called.
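The read, split, and slice steps can be tried out in Python as a rough analogy. The helper name read_default_labels is hypothetical, and an in-memory stream stands in for standard input so the sketch runs without prompting. Note that Python counts splits rather than pieces, so a limit of three pieces becomes maxsplit=2.

```python
import io


def read_default_labels(stream):
    """Read one line and return its first two whitespace-separated words."""
    line = stream.readline()
    # maxsplit=2 performs at most two splits, giving at most three pieces.
    pieces = line.split(None, 2)
    # Keep only the first two pieces, like the [0..1] slice.
    return pieces[:2]


# A fake input line; in a real program the stream would be sys.stdin.
def_labels = read_default_labels(io.StringIO("wool hair and the rest\n"))
```

Capping the number of pieces means that anything after the second label is left unsplit in the third piece, which is then discarded.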
Core Breach
Built-ins like &split can also be given named arguments in
Perl 6, so, alternatively, we could write the BEGIN block like
so:
BEGIN {
    print "Enter 2 default labels: ";
    @def_labels = split(str=><>, max=>3).[0..1];
}
Here we're leaving out the split pattern entirely and making use of
&split's default split-on-whitespace behaviour.
Incidentally, an important goal of Perl 6 is to make the language powerful enough to natively implement all its own built-ins. We won't actually implement them that way, since screamingly fast performance is another goal, but we do want to make it easy for anyone to create their own versions of any Perl built-in or control structure.
So, for example, &split would be declared like this:
sub split( Rule|Str ?$sep = /\s+/,
           Str ?$str = $CALLER::_,
           Int ?$max = Inf
         )
{
    # implementation here
}
Note first that every one of &split's parameters is
optional, and that the defaults are the same as in Perl 5. If we omit the
separator pattern, the default separator is whitespace; if we omit the string
to be split, &split splits the caller's $_
variable; if we omit the "maximum number of pieces to return" argument, there
is no upper limit on the number of splits that may be made.
Note that we can't just declare the second parameter like so:
Str ?$str = $_,
That's because, in Perl 6, the $_ variable is lexical (not
global), so a subroutine doesn't have direct access to the $_ of
its caller. That means that Perl 6 needs a special way to access a caller's
$_.
That special way is via the CALLER:: namespace. Writing
$CALLER::_ gives us access to the $_ of whatever
scope called the current subroutine. This works for other variables too
($CALLER::foo, @CALLER::bar, etc.) but is rarely
useful, since we're only allowed to use CALLER:: to access
variables that already exist, and $_ is about the only variable
that a subroutine can rely upon to be present in any scope it might be called
from.
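The idea of reaching into the caller's scope can be illustrated in Python by inspecting the call stack. This is a loose analogy, not how CALLER:: is implemented: the function names and the variable name topic are hypothetical, and looking up caller locals this way relies on inspect.currentframe, which is not guaranteed to be available on every Python implementation.

```python
import inspect


def shout():
    # Step one frame up the call stack to reach the caller's local variables.
    caller_locals = inspect.currentframe().f_back.f_locals
    # Only variables that already exist in the caller can be accessed.
    if "topic" not in caller_locals:
        raise NameError("caller has no variable named 'topic'")
    return caller_locals["topic"].upper()


def greet():
    topic = "hello"  # never passed explicitly; shout() finds it in this scope
    return shout()


greeting = greet()
```

A caller that has not defined topic makes shout fail, which is why this style of access is only dependable for a variable that every calling scope is known to have.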

