Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
Defining the Parameters
Meanwhile, back at the &part, we have:
sub part (Code $is_sheep, *@data) {...}
which means that &part expects its first argument to be a
scalar value of type Code (or Code reference). Within
the subroutine that first argument will thereafter be accessed via the name
$is_sheep.
The second parameter (*@data) is what's known as a "slurpy
array". That is, it's an array parameter with the special marker
(*) in front of it, indicating to the compiler that
@data is supposed to grab all the remaining arguments passed to
&part and make each element of @data an alias to
one of those arguments.
In other words, the *@data parameter does just what
@_ does in Perl 5: it grabs all the available arguments and makes
its elements aliases for those arguments. The only differences are that in Perl
6 we're allowed to give that slurpy array a sensible name, and we're allowed to
specify other individual parameters before it — to give separate sensible
names to one or more of the preliminary arguments to the call.
But why (you're probably wondering) do we need an asterisk for that? Surely
if we had defined &part like this:
sub part (Code $is_sheep, @data) {...} # note: no asterisk on @data
the array in the second parameter slot would have slurped up all the remaining arguments anyway.
Well, no. Declaring a parameter to be a regular (non-slurpy) array tells the
subroutine to expect the corresponding argument to be an actual array (or an
array reference). So if &part had been defined with its second
parameter just @data (rather than *@data), then we
could call it like this:
part \&selector, @animal_sounds;
or this:
part \&selector, ["woof","meow","ook!"];
but not like this:
part \&selector, "woof", "meow", "ook!";
In each case, the compiler would compare the type of the second argument
with the type required by the second parameter (i.e. an Array). In
the first two cases, the types match and everything is copacetic. In the third
case, the second argument is a string, not an array or array reference, so we
get a compile-time error message:
Type mismatch in call to &part: @data expects Array but got Str instead
Another way of thinking about the difference between slurpy and regular parameters is to realize that a slurpy parameter imposes a list (i.e. flattening) context on the corresponding arguments, whereas a regular, non-slurpy parameter doesn't flatten or listify. Instead, it insists on a single argument of the correct type.
So, if we want &part to handle raw lists as data, we need
to tell the @data parameter to take whatever it finds —
array or list — and flatten everything down to a list. That's what the
asterisk on *@data does.
Because of that all-you-can-eat behaviour, slurpy arrays like this are generally placed at the very end of the parameter list and used to collect data for the subroutine. The preceding non-slurpy arguments generally tell the subroutine what to do; the slurpy array generally tells it what to do it to.
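For readers who find it easier to poke at running code, Python's `*args` parameters behave much like a slurpy array, so the contrast can be sketched there. This is only an analogy, not Perl 6 semantics: the names `part_regular`, `part_slurpy` and `starts_with_m` are invented for the illustration, and Python reports the mismatch at run time rather than at compile time.

```python
def part_regular(is_sheep, data):
    """Non-slurpy analogue: expects exactly one collection argument."""
    return [item for item in data if is_sheep(item)]


def part_slurpy(is_sheep, *data):
    """Slurpy analogue: *data gathers every remaining positional argument."""
    return [item for item in data if is_sheep(item)]


def starts_with_m(sound):
    # Hypothetical selector, standing in for &selector in the text.
    return sound.startswith("m")


animal_sounds = ["woof", "meow", "ook!"]

# A single list satisfies the regular parameter.
from_list = part_regular(starts_with_m, animal_sounds)

# A raw run of arguments satisfies the slurpy parameter.
from_args = part_slurpy(starts_with_m, "woof", "meow", "ook!")

# A raw run of arguments does not fit the regular parameter.
try:
    part_regular(starts_with_m, "woof", "meow", "ook!")
    mismatch = None
except TypeError as error:
    mismatch = error
```

The analogy breaks down in one place: Python's `*data` does not flatten, so a list passed to `part_slurpy` would arrive as a single element unless the caller unpacks it, whereas the Perl 6 slurpy array flattens it for us.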
Splats and Slurps
Another aspect of Perl 6's distinction between slurpy and non-slurpy parameters can be seen when we write a subroutine that takes multiple scalar parameters, then try to pass an array to that subroutine.
For example, suppose we wrote:
sub log($message, $date, $time) {...}
If we happen to have the date and time in a handy array, we might expect
that we could just call log like so:
log("Starting up...", @date_and_time);
We might then be surprised when this fails even to compile.
The problem is that each of &log's three scalar parameters
imposes a scalar context on the corresponding argument in any call to
log. So "Starting up..." is first evaluated in the
scalar context imposed by the $message parameter and the resulting
string is bound to $message. Then @date_and_time is
evaluated in the scalar context imposed by $date, and the
resulting array reference is bound to $date. Then the compiler
discovers that there is no third argument to bind to the $time
parameter and kills your program.
Of course, it has to work that way, or we don't get the
ever-so-useful "array parameter takes an unflattened array argument" behaviour
described earlier. Unfortunately, that otherwise admirable behaviour is
actually getting in the way here and preventing @date_and_time
from flattening as we want.
So Perl 6 also provides a simple way of explicitly flattening an array (or a
hash for that matter): the unary prefix * operator:
log("Starting up...", *@date_and_time);
This operator (known as "splat") simply flattens its argument into a list. Since it's a unary operator, it does that flattening before the arguments are bound to their respective parameters.
The syntactic similarity of a "slurpy" * in a parameter list,
and a "splatty" * in an argument list is quite deliberate. It
reflects a behavioral similarity: just as a slurpy asterisk implicitly
flattens any argument to which its parameter is bound, so too a splatty
asterisk explicitly flattens any argument to which it is applied.
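The same failure and the same fix can be tried out in Python, whose call-site `*` unpacking is a close cousin of the splat. Treat this as a rough sketch by analogy: the sample date and time values are made up, and Python raises its error when the call is executed rather than when the program is compiled.

```python
def log(message, date, time):
    """Three separate parameters, like &log in the text."""
    return f"[{date} {time}] {message}"


date_and_time = ["2003-07-29", "09:30"]  # hypothetical sample values

# Passing the list as-is supplies only two arguments, so the call fails.
try:
    log("Starting up...", date_and_time)
    failure = None
except TypeError as error:
    failure = error

# Unpacking with * spreads the list across the remaining parameters.
entry = log("Starting up...", *date_and_time)
```

As in the Perl 6 version, the unpacking happens before the arguments are matched against the parameters, which is why the second call succeeds.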
I Do Declare
By the way, take another look at those examples above — the ones with
the {...} where their subroutine bodies should be. Those dots
aren't just metasyntactic; they're real executable Perl 6 code. A subroutine
definition with a {...} for its body isn't actually a
definition at all. It's a declaration.
In the same way that the Perl 5 declaration:
# Perl 5 code...
sub part;
states that there exists a subroutine &part, without
actually saying how it's implemented, so too:
# Perl 6 code...
sub part (Code $is_sheep, *@data) {...}
states that there exists a subroutine &part that takes a
Code object and a list of data, without saying how it's
implemented. In fact, the old sub part; syntax is no longer
allowed; in Perl 6 you have to yada-yada-yada when you're making a
declaration.
Body Parts
With the parameter list taking care of getting the right arguments into the
right parameters in the right way, the body of the &part
subroutine is then quite straightforward:
{
    my (@sheep, @goats);
    for @data {
        if $is_sheep($_) { push @sheep, $_ }
        else             { push @goats, $_ }
    }
    return (\@sheep, \@goats);
}
According to the original specification, we need to return references to two
arrays. So we first create those arrays. Then we iterate through each element
of the data (which the for aliases to $_, just as in
Perl 5). For each element, we take the Code object that was
passed as $is_sheep (let's just call it the selector from
now on) and we call it, passing the current data element. If the selector
returns true, we push the data element onto the array of "sheep", otherwise it
is appended to the list of "goats". Once all the data has been divvied up, we
return references to the two arrays.
Note that, if this were Perl 5, we'd have to unpack the @_
array into a list of lexical variables and then explicitly check that
$is_sheep is a valid Code object. In the Perl 6
version there's no @_, the parameters are already lexicals, and
the type-checking is handled automatically.
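To get a feel for the manual work that the parameter list saves us, here is a sketch of the unpack-and-check style in Python, where `*args` plays the part of Perl 5's `@_`. It is an illustration under assumptions rather than a translation: the error messages and the even/odd sample data are invented, and `callable` is only an approximation of a Code type check.

```python
def part(*args):
    """Partition data using a selector, unpacking and checking by hand."""
    # Manual unpacking: the first argument is the selector, the rest is data.
    if not args:
        raise TypeError("part() needs a selector as its first argument")
    is_sheep, *data = args

    # Manual type check: the selector has to be callable.
    if not callable(is_sheep):
        raise TypeError("first argument to part() must be callable")

    sheep, goats = [], []
    for item in data:
        if is_sheep(item):
            sheep.append(item)
        else:
            goats.append(item)
    return sheep, goats


# Hypothetical data: even numbers play the role of the sheep.
evens, odds = part(lambda n: n % 2 == 0, 1, 2, 3, 4, 5)
```

Everything above the two empty lists is bookkeeping that the Perl 6 signature `(Code $is_sheep, *@data)` expresses in a single line.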
Call of the Wild
With the explicit parameter list in place, we can use &part
in a variety of ways. If we already have a subroutine that is a suitable
test:
sub is_feline ($animal) {
    return $animal.isa(Animal::Cat);
}
then we can just pass that to &part, along with the data to
be partitioned, then grab the two array references that come back:
($cats, $chattels) = part &is_feline, @animals;
This works fine, because the first parameter of &part
expects a Code object, and that's exactly what
&is_feline is. Note that we couldn't just put
is_feline there (i.e. without the ampersand), since that would
indicate a call to &is_feline, rather than a
reference to it.
In Perl 5 we'd have had to write \&is_feline to get a
reference to the subroutine. However, since the $is_sheep
parameter specifies that the first argument must be a scalar (i.e. it imposes a
scalar context on the first argument slot), in Perl 6 we don't have to create a
subroutine reference explicitly. Putting a code object in the scalar context
auto-magically enreferences it (just as an array or hash is automatically
converted to a reference in scalar context). Of course, an explicit
Code reference is perfectly acceptable there too:
($cats, $chattels) = part \&is_feline, @animals;
Alternatively, rather than going to the trouble of declaring a separate subroutine to sort our sheep from our goats, we might prefer to conjure up a suitable (anonymous) subroutine on the spot:
($cats, $chattels) = part sub ($animal) { $animal.isa(Animal::Cat) }, @animals;
In a Bind
So far we've always captured the two array references returned from the
part call by assigning the result of the call to a list of
scalars. But we might instead prefer to bind them to actual arrays:
(@cats, @chattels) := part sub($animal) { $animal.isa(Animal::Cat) }, @animals;
Using binding (:=) instead of assignment (=)
causes @cats and @chattels to become aliases for the
two anonymous arrays returned by &part.
In fact, this aliasing of the two return values to @cats and
@chattels uses exactly the same mechanism that is used to
alias subroutine parameters to their corresponding arguments. We could almost
think of the lefthand side of the := as a parameter list (in this
case, consisting of two non-slurpy array parameters), and the righthand side
of the := as being the corresponding argument list. The only
difference is that the variables on the lefthand side of a := are
not implicitly treated as constant.
One consequence of the similarities between binding and parameter passing is that we can put a slurpy array on the left of a binding:
(@Good, $Bad, *@Ugly) := (@Adams, @Vin, @Chico, @OReilly, @Lee, @Luck, @Britt);
The first pseudo-parameter (@Good) on the left expects an
array, so it binds to @Adams from the list on the right.
The second pseudo-parameter ($Bad) expects a scalar. That means
it imposes a scalar context on the second element of the righthand list. So
@Vin evaluates to a reference to the original array and
$Bad becomes an alias for \@Vin.
The final pseudo-parameter (*@Ugly) is slurpy, so it expects
the rest of the righthand side to be a list it can slurp up. In order to ensure
that, the slurpy asterisk causes the remaining pseudo-arguments on the right to
be flattened into a list, whose elements are then aliased to successive
elements of @Ugly.
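Python's starred assignment gives a loose feel for this kind of binding, with two caveats that make it an analogy only: Python's star collects the leftover items without flattening them, and the flattened list is a new list rather than a set of aliases to the original elements. The short lists and the names in them are hypothetical stand-ins for the arrays above.

```python
from itertools import chain

# Hypothetical stand-ins for the arrays in the binding example.
adams, vin, chico = ["Chris"], ["Vin"], ["Chico"]
oreilly, lee = ["Bernardo"], ["Lee"]

# The first two names alias the first two lists; the star collects the rest.
good, bad, *rest = adams, vin, chico, oreilly, lee

# Python's star does not flatten, so that step is spelled out explicitly.
ugly = list(chain.from_iterable(rest))

# Aliasing: a change made through `good` is visible through `adams`.
good.append("Adams")
```

Here `good` and `adams` are two names for one list, much as `@Good` becomes an alias for `@Adams`, while `bad` holds the whole of `vin` as a single unflattened value, much as `$Bad` holds a reference to `@Vin`.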

