Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
An Argument in Name Only
It's pretty cool that Perl 6 automatically lets us specify positional arguments — and even return values — by name rather than position.
But what if we'd prefer that some of our arguments could only be
specified by name? After all, the @labels parameter isn't really
in the same league as the $is_sheep parameter: it's merely an option,
and one that most people probably won't use. It shouldn't really be
a positional parameter at all.
We can specify that the labels argument is
only to be passed by name...by changing the previous declaration of the
@labels parameter very slightly:
sub part (Selector $is_sheep,
          Str +@labels is dim(2) = <<sheep goats>>,
          *@data
         ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}
In fact, there's only a single character's worth of difference in the whole
definition. Whereas before we declared the @labels parameter like
this:
Str ?@labels is dim(2) = <<sheep goats>>
now we declare it like this:
Str +@labels is dim(2) = <<sheep goats>>
Changing that ? prefix to a + changes
@labels from an optional positional-or-named parameter to an
optional named-only parameter. Now if we want to pass in a labels
argument, we can only pass it by name. Attempting to pass it positionally will
result in some extreme prejudice from the compiler.
Named-only parameters are still optional parameters, however, so legacy code that omits the labels:
%parts = part Animal::Cat <== @animals;
still works fine (and still causes the @labels parameter to
default to <<sheep goats>>).
Better yet, converting @labels from a positional to a
named-only parameter also solves the problem of legacy code of the form:
%parts = part Animal::Cat, @animals;
@animals can't possibly be intended for the
@labels parameter now. We explicitly specified that labels can
only be passed by name, and the @animals argument isn't named.
So named-only parameters give us a clean way of upgrading a subroutine and still supporting legacy code. Indeed, in many cases the only reasonable way to add a new parameter to an existing, widely used, Perl 6 subroutine will be to add it as a named-only parameter.
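As a loose analogy in a different language (this is Python, not Perl 6), keyword-only parameters behave much like the named-only @labels parameter: existing positional calls keep working, and the new option can only be supplied by name. The sketch below is illustrative only; the predicate, the sample data, and the variable names are assumptions made for the example.

```python
def part(is_sheep, *data, labels=("sheep", "goats")):
    """Partition data into two labelled lists using the predicate is_sheep.

    Because labels is declared after *data, it can only be passed by name.
    """
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return herd


# Hypothetical sample data and selector, for illustration only.
animals = ["cat", "dog", "cat"]
is_cat = lambda animal: animal == "cat"

# A legacy-style call that omits the labels still works and gets the defaults.
old_style = part(is_cat, *animals)

# The newer option has to be spelled out by name.
new_style = part(is_cat, *animals, labels=("cat", "chattel"))
```

Every positional argument after the selector lands in the data, so no existing call can accidentally bind to labels.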
Careful with that Arg, Eugene!
Of course, there's no free lunch here. The cost of solving the legacy code problem is that we changed the meaning of any more recent code like this:
%parts = part Animal::Cat, <<cat chattel>>, @animals; # Oops!
When @labels was positional-or-named, the
<<cat chattel>> argument could only be
interpreted as being intended for @labels. But now, there's no way
it can be for @labels (because it isn't named), so Perl 6 assumes
that the list is just part of the slurped data. The two-element list will now
be flattened (along with @animals), resulting in a single list
that is then bound to the @data parameter, as if we'd written:
%parts = part Animal::Cat <== 'cat', 'chattel', @animals;
This is yet another reason why named-only should probably be the first choice for optional parameters.
Temporal Life Insurance
Being able to add named-only parameters to existing subroutines is an
important way of future-proofing any calls to the subroutine. So long as we
continue to add only named-only parameters to &part, the order
in which the subroutine expects its positional and slurpy arguments will be
unchanged, so every existing call to part will continue to work
correctly.
Curiously, the reverse is also true. Named-only parameters also provide us with a way to "history-proof" subroutine calls. That is, we can allow a subroutine to accept named arguments that it doesn't (yet) know how to handle! Like so:
sub part (Selector $is_sheep,
          Str +@labels is dim(2) = <<sheep goats>>,
          *%extras,         # <-- NEW PARAMETER ADDED HERE
          *@data
         ) returns List of Pair
{
    # Handle extras...
    carp "Ignoring unknown named parameter '$_'" for keys %extras;
    # Remainder of subroutine as before...
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}
# and later...
%parts = part Animal::Cat, labels=><<Good Bad>>, max=>3, @data;
# warns: "Ignoring unknown named parameter 'max' at future.pl, line 19"
The *%extras parameter is a "slurpy hash". Just as the slurpy
array parameter (*@data) sucks up any additional positional
arguments for which there's no explicit parameter, a slurpy hash sucks up any
named arguments that are unaccounted for. In the above example, for instance,
&part has no $max parameter, so passing the named
argument max=>3 would normally produce a (compile-time)
exception:
Invalid named parameter ('max') in call to &part
However, because &part now has a slurpy hash, that
extraneous named argument is simply bound to the appropriate entry of
%extras and (in this example) used to generate a warning.
The more common use of such slurpy hashes is to capture the named arguments that are passed to an object constructor and have them automatically forwarded to the constructors of the appropriate ancestral classes. We'll explore that technique in Exegesis 12.
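To illustrate the slurpy-hash idea outside Perl 6, here is a rough Python analogy (not Perl 6 code): a `**extras` parameter collects any named arguments the function does not declare, and the function warns about each one instead of failing. The function body, the even-number selector, and the sample values are assumptions made for the example.

```python
import warnings


def part(is_sheep, *data, labels=("sheep", "goats"), **extras):
    """Partition data, warning about (rather than rejecting) unknown named arguments."""
    # Handle extras: every undeclared named argument ends up in this dict.
    for name in extras:
        warnings.warn(f"Ignoring unknown named parameter '{name}'")

    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return herd


# Record the warnings so they can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    parts = part(lambda n: n % 2 == 0, 1, 2, 3, 4,
                 labels=("Good", "Bad"), max=3)

messages = [str(w.message) for w in caught]
```

Without the `**extras` parameter, the same call would raise a `TypeError` for the unexpected `max` argument, which parallels the exception described above.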
The Greatest Thing Since Sliced Arrays
So far we've progressively extended &part from the first
simple version that only accepted subroutines as selectors, to the most recent
versions that can now also use classes, rules, or hashes to partition their
data.
Suppose we also wanted to allow the user to specify a list of integer
indices as the selector, and thereby allow &part to separate a
slice of data from its "anti-slice". In other words, instead of:
%data{2357} = [ @data[2,3,5,7] ];
%data{other} = [ @data[0,1,4,6,8..@data-1] ];
we could write:
%data = part [2,3,5,7], labels=>["2357","other"], @data;
We could certainly extend &part to do that:
type Selector ::= Code | Class | Rule | Hash | (Array of Int);
sub part (Selector $is_sheep,
          Str +@labels is dim(2) = <<sheep goats>>,
          *@data
         ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    if $is_sheep.isa(Array of Int) {
        for @data.kv -> $index, $value {
            if $index == any(@$is_sheep) { push %herd{$sheep}, $value }
            else                         { push %herd{$goats}, $value }
        }
    }
    else {
        for @data {
            when $is_sheep { push %herd{$sheep}, $_ }
            default        { push %herd{$goats}, $_ }
        }
    }
    return *%herd;
}
# and later, if there's a prize for finishing 1st, 2nd, 3rd, or last...
%prize = part [0, 1, 2, @horses-1],
              labels => << placed also_ran >>,
              @horses;
Note that this is the first time we couldn't just add another class to the
Selector type and rely on the smart-match inside the
when to work out how to tell "sheep" from "goats". The problem
here is that when the selector is an array of integers, the value of
each data element no longer determines its sheepishness/goatility. It's now the
element's position (i.e. its index) that decides its fate. Since our
existing smart-match compares values, not positions, the when
can't pick out the right elements for us. Instead, we have to consider both
the index and the value of each data element.
To do that we use the @data array's .kv method.
Just as calling the .kv method on a hash returns key,
value, key, value, key, value,
etc., so too calling the .kv method on an array returns
index, value, index, value, index,
value, etc. Then we just use a parameterized block as our
for block, specifying that it has two arguments. That causes the
for to grab two elements of the list it's iterating (i.e. one index
and one value) on each iteration.
Then we simply test to see if the current index is any of those specified in
$is_sheep's array and, if so, we push the corresponding value:
for @data.kv -> $index, $value {
    if $index == any(@$is_sheep) { push %herd{$sheep}, $value }
    else                         { push %herd{$goats}, $value }
}
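The index-based selection can be sketched in Python as a rough analogy (again, not Perl 6): `enumerate` plays the role of .kv by yielding index and value pairs, and a set membership test stands in for the `any(...)` comparison. The function name `part_by_index` and the horse names are hypothetical, chosen only for this example.

```python
def part_by_index(indices, data, labels=("sheep", "goats")):
    """Split data into the elements at the given indices and all the rest."""
    sheep, goats = labels
    wanted = set(indices)
    herd = {sheep: [], goats: []}
    # enumerate yields (index, value) pairs, much as .kv does for an array.
    for index, value in enumerate(data):
        herd[sheep if index in wanted else goats].append(value)
    return herd


# Hypothetical finishing order: prizes for 1st, 2nd, 3rd, and last place.
horses = ["Ajax", "Bolt", "Comet", "Dash", "Echo", "Flint"]
prize = part_by_index([0, 1, 2, len(horses) - 1], horses,
                      labels=("placed", "also_ran"))
```

Here each element's fate is decided by its position rather than its value, which is the distinction the smart-match version cannot express.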

