Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
Refactoring Parameter Lists
By this stage, you might be justified in feeling that
&part's parameter list is getting just a leeeeettle too
sophisticated for its own good. Moreover, if we were using the multisub
version, that complexity would have to be repeated in every variant.
Philosophically though, that's okay. The later versions of
&part are doing some fairly sophisticated things, and the
complexity required to achieve that has to go somewhere. Putting that extra
complexity in the parameter list means that the body of &part
stays much simpler, as do any calls to &part.
That's the whole point: Complexify locally to simplify globally. Or maybe: Complexify declaratively to simplify procedurally.
But there's precious little room for the consolations of philosophy when you're swamped in code and up to your assembler in allomorphism. So, rather than having to maintain those complex and repetitive parameter lists, we might prefer to factor out the common infrastructure. With, of course, yet another macro:
    macro PART_PARAMS {
        my ($sheep,$goats) = request(2 default labels);
        return "Str +\@labels is dim(2) = <<$sheep $goats>>, *\@data";
    }

    multi sub part (Selector $is_sheep, PART_PARAMS) {
        # body as before
    }

    multi sub part (Int @is_sheep, PART_PARAMS) {
        # body as before
    }
Here we create a macro named &PART_PARAMS that requests and
extracts the default labels and then interpolates them into a string, which it
returns. That string then replaces the original macro call.
Note that we reused the &request macro within the
&PART_PARAMS macro. That's important, because it means that,
as the body of &PART_PARAMS is itself being parsed, the
default names are requested and interpolated into
&PART_PARAMS's code. That ensures that the user-supplied
default labels are hardwired into &PART_PARAMS even before
it's compiled. So every subsequent call to PART_PARAMS will return
the same default labels.
On the other hand, if we'd written &PART_PARAMS like
this:
    macro PART_PARAMS {
        print "Enter 2 default labels: ";
        my ($sheep,$goats) = split(/\s+/, <>, 3);
        return "*\@data, Str +\@labels is dim(2) = <<$sheep $goats>>";
    }
then each time we used the &PART_PARAMS macro in our code,
it would re-prompt for the labels. So we could give each variant of
&part its own default labels. Either approach is fine,
depending on the effect we want to achieve. It's really just a question of how
much work we're willing to put in in order to be Lazy.
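The difference between the two macros is purely one of timing: the labels are either obtained once, when the macro itself is compiled, or obtained afresh at every use. As a rough analogy only (Python has no macros, and the names `scripted_prompt`, `params_asked_once`, and `params_asked_each_time` are hypothetical stand-ins, with a scripted list of answers replacing the interactive prompt), the same contrast can be sketched with closures:

```python
from typing import Callable, Iterator, Tuple

Labels = Tuple[str, str]

def scripted_prompt(answers: Iterator[str]) -> Callable[[], Labels]:
    """Stand-in for the interactive prompt: consumes one scripted line per call."""
    def ask() -> Labels:
        sheep, goats = next(answers).split()[:2]
        return sheep, goats
    return ask

def params_asked_once(ask: Callable[[], Labels]) -> Callable[[], Labels]:
    """Like the first PART_PARAMS: ask immediately, then reuse that answer."""
    labels = ask()                 # evaluated once, when the "macro" is defined
    return lambda: labels

def params_asked_each_time(ask: Callable[[], Labels]) -> Callable[[], Labels]:
    """Like the second PART_PARAMS: ask again at every use."""
    return lambda: ask()           # evaluated at each call

once = params_asked_once(scripted_prompt(iter(["sheep goats", "cats dogs"])))
each = params_asked_each_time(scripted_prompt(iter(["sheep goats", "cats dogs"])))

first_once, second_once = once(), once()   # same labels both times
first_each, second_each = each(), each()   # a new answer is consumed each time
```

In the first case every "use" sees the labels that were captured up front; in the second, each use can end up with different labels.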
Smooth Operators
By now it's entirely possible that your head is spinning with the sheer
number of ways Perl 6 lets us implement the &part subroutine.
Each of those ways represents a different tradeoff in power, flexibility, and
maintainability of the resulting code. It's important to remember that, however
we choose to implement &part, it's always invoked in basically
the same way:
    %parts = part $selector, @data;
Sure, some of the above techniques let us modify the return labels, or control the use of named vs positional arguments. But with all of them, the call itself starts with the name of the subroutine, after which we specify the arguments.
Let's change that too!
Suppose we preferred to have a partitioning operator, rather than a subroutine. If we ignore those optional labels, and restrict our list to be an actual array, we can see that the core partitioning operation is binary ("apply this selector to that array").
If &part is to become an operator, we need it to be a
binary operator. In Perl 6 we can make up completely new operators, so let's
take our partitioning inspiration from Moses and call our new operator:
    ~|_|~
We'll assume that this "Red Sea" operator is to be used like this:
    %parts = @animals ~|_|~ Animal::Cat;
The left operand is the array to be partitioned and the right operand is the selector. To implement it, we'd write:
    multi sub infix:~|_|~ (@data, Selector $is_sheep)
        is looser(&infix:+)
        is assoc('non')
    {
        return part $is_sheep, @data;
    }
Operators are often overloaded with multiple variants (as we'll soon see), so we typically implement them as multisubs. However, it's also perfectly possible to implement them as regular subroutines, or even as macros.
To distinguish a binary operator from a regular multisub, we give it a
special compound name, composed of the keyword infix: followed by
the characters that make up the operator's symbol. These characters can be any
sequence of non-whitespace Unicode characters (except left parenthesis, which
can only appear if it's the first character of the symbol). So instead of
~|_|~ we could equally well have named our partitioning operator
any of:
    infix:¥
    infix:¦
    infix:^%#$!
    infix:<->
    infix:∇
The infix: keyword tells the compiler that the operator is
placed between its operands (as binary operators always are). If we're
declaring a unary operator, there are three other keywords that can be used
instead: prefix:, postfix:, or
circumfix:. For example:
    sub prefix:± (Num $n) is equiv(&infix:+) { return +$n|-$n }
    sub postfix:² (Num $n) is tighter(&infix:**) { return $n**2 }
    sub circumfix:⌊...⌋ (Num $n) { return POSIX::floor($n) }

    # and later...
    $error = ±⌊$x²⌋;
The is tighter, is looser, and is
equiv traits tell the parser what the precedence of the new operator
will be, relative to existing operators: namely, whether the operator binds
more tightly than, less tightly than, or with the same precedence as the
operator named in the trait. Every operator has to have a precedence and
associativity, so every operator definition has to include one of these three
traits.
The is assoc trait is only required on infix operators and
specifies whether they chain to the left (like +), to the right
(like =), or not at all (like ..). If the trait is
not specified, the operator takes its associativity from the operator that's
specified in the is tighter, is looser, or is
equiv trait.
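To see why associativity matters, it helps to take an operator where the grouping changes the answer. The following is only an illustrative Python sketch (the helpers `fold_left` and `fold_right` are hypothetical, and Python itself offers no way to declare an operator's associativity); it evaluates the same chain of subtractions as a left-associative and as a right-associative operator would:

```python
from functools import reduce

def fold_left(op, operands):
    """Group as ((a op b) op c): how a left-associative operator chains."""
    return reduce(op, operands)

def fold_right(op, operands):
    """Group as (a op (b op c)): how a right-associative operator chains."""
    *rest, last = operands
    return reduce(lambda acc, x: op(x, acc), reversed(rest), last)

def subtract(a, b):
    return a - b

left = fold_left(subtract, [10, 4, 3])     # (10 - 4) - 3
right = fold_right(subtract, [10, 4, 3])   # 10 - (4 - 3)
```

A non-associative operator, such as the `~|_|~` declared above with `is assoc('non')`, simply refuses to be chained this way, so the question of grouping never arises.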
Arguments Both Ways
On the other hand, we might prefer that the selector come first (as it does
in &part):
    %parts = Animal::Cat ~|_|~ @animals;
in which case we could just add:
    multi sub infix:~|_|~ (Selector $is_sheep, @data)
        is equiv( &infix:~|_|~(Array,Selector) )
    {
        return part $is_sheep, @data;
    }
so now we can specify the selector and the data in either order.
Because the two variants of the &infix:~|_|~ multisubs have
different parameter lists (one is (Array,Selector), the other is
(Selector,Array)), Perl 6 always knows which one to call. If the
left operand is a Selector, the
&infix:~|_|~(Selector,Array) variant is called. If the left
operand is an array, the &infix:~|_|~(Array,Selector) variant
is invoked.
Note that, for this second variant, we specified is equiv
instead of is tighter or is looser. This ensures that
the precedence and associativity of the second variant are the same as those of
the first. That's also why we didn't need to specify an is
assoc.
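For readers more familiar with other languages, the idea of one operator accepting its operands in either order can be approximated in Python. This is a loose sketch under several assumptions: Python cannot declare new operator symbols or precedences, so the existing `|` is borrowed; the `Selector` class is a hypothetical wrapper around a predicate; and `part` here is a much-reduced version that handles only callable selectors with fixed labels.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

def part(is_sheep: Callable[[Any], bool], data: List[Any]) -> Dict[str, List[Any]]:
    """Reduced part: callable selectors only, fixed "sheep"/"goats" labels."""
    herd: Dict[str, List[Any]] = defaultdict(list)
    for datum in data:
        herd["sheep" if is_sheep(datum) else "goats"].append(datum)
    return dict(herd)

class Selector:
    """Hypothetical wrapper so that `|` can dispatch on operand position."""

    def __init__(self, predicate: Callable[[Any], bool]) -> None:
        self.predicate = predicate

    def __or__(self, data: List[Any]) -> Dict[str, List[Any]]:
        # Called for: selector | data
        return part(self.predicate, data)

    def __ror__(self, data: List[Any]) -> Dict[str, List[Any]]:
        # Called for: data | selector (lists define no `|` of their own)
        return part(self.predicate, data)

is_even = Selector(lambda n: n % 2 == 0)
numbers = [1, 2, 3, 4, 5]

selector_first = is_even | numbers
data_first = numbers | is_even
```

Both spellings delegate to the same `part` call, just as both `~|_|~` variants do. Python picks the method from the position of the `Selector` operand, whereas the Perl 6 multisubs are selected by the types in their parameter lists.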
Parting Is Such Sweet Sorrow
Phew. Talk about "more than one way to do it"!
But don't be put off by these myriad new features and alternatives. The vast majority of them are special-purpose, power-user techniques that you may well never need to use or even know about.
For most of us it will be enough to know that we can now add a proper parameter list, with sensibly named parameters, to any subroutine. What we used to write as:
    sub feed {
        my ($who, $how_much, @what) = @_;
        ...
    }
we now write as:
    sub feed ($who, $how_much, *@what) {
        ...
    }
or, when we're feeling particularly cautious:
    sub feed (Str $who, Num $how_much, Food *@what) {
        ...
    }
Just being able to do that is a huge win for Perl 6.
Parting Shot
By the way, here's (most of) that same partitioning functionality implemented in Perl 5:
    # Perl 5 code...
    use Carp;    # provides croak

    sub part {
        my ($is_sheep, $maybe_flag_or_labels, $maybe_labels, @data) = @_;
        my ($sheep, $goats);
        if ($maybe_flag_or_labels eq "labels" && ref $maybe_labels eq 'ARRAY') {
            ($sheep, $goats) = @$maybe_labels;
        }
        elsif (ref $maybe_flag_or_labels eq 'ARRAY') {
            unshift @data, $maybe_labels;
            ($sheep, $goats) = @$maybe_flag_or_labels;
        }
        else {
            unshift @data, $maybe_flag_or_labels, $maybe_labels;
            ($sheep, $goats) = qw(sheep goats);
        }
        my $arg1_type = ref($is_sheep) || 'CLASS';
        my %herd;
        if ($arg1_type eq 'ARRAY') {
            for my $index (0..$#data) {
                my $datum = $data[$index];
                my $label = grep({$index==$_} @$is_sheep) ? $sheep : $goats;
                push @{$herd{$label}}, $datum;
            }
        }
        else {
            croak "Invalid first argument to &part"
                unless $arg1_type =~ /^(Regexp|CODE|HASH|CLASS)$/;
            for (@data) {
                if (  $arg1_type eq 'Regexp' && /$is_sheep/
                   || $arg1_type eq 'CODE' && $is_sheep->($_)
                   || $arg1_type eq 'HASH' && $is_sheep->{$_}
                   || UNIVERSAL::isa($_,$is_sheep)
                   ) {
                    push @{$herd{$sheep}}, $_;
                }
                else {
                    push @{$herd{$goats}}, $_;
                }
            }
        }
        return map {bless {key=>$_,value=>$herd{$_}},'Pair'} keys %herd;
    }
... which is precisely why we're developing Perl 6.

