Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
TMTOWTDeclareI
While we're talking about type declarations, it's worth noting that we could
also have put &part's new return type out in front (just as
we've been doing with variable and parameter types). However, this is only
allowed for subroutines when the subroutine is explicitly scoped:
# lexical subroutine
my List of Pair sub part (Selector $is_sheep, *@data) {...}
or:
# package subroutine
our List of Pair sub part (Selector $is_sheep, *@data) {...}
The return type goes between the scoping keyword (my or
our) and the sub keyword. And, of course, the
returns keyword is not used.
Contrariwise, we can also put variable/parameter type information
after the variable name. To do that, we use the of
keyword:
my sub part ($is_sheep of Selector, *@data) returns List of Pair {...}
This makes sense, when you think about it. As we saw above, of
tells the preceding container what type of value it's supposed to store, so
$is_sheep of Selector tells $is_sheep it's supposed
to store a Selector.
You Are What You Eat -- Not!
Careful though: we have to remember to use of there, not
is. It would be a mistake to write:
my sub part ($is_sheep is Selector, *@data) returns List of Pair {...}
That's because Perl 6 variables and parameters can be more precisely typed than variables in most other languages. Specifically, Perl 6 allows us to specify both the storage type of a variable (i.e. what kinds of values it can contain) and the implementation class of the variable (i.e. how the variable itself is actually implemented).
The is keyword indicates what a particular container (variable,
parameter, etc.) is — namely, how it's implemented and how it
operates. Saying:
sub bark(@dogs is Pack) {...}
specifies that, although the @dogs parameter looks like an
Array, it's actually implemented by the Pack class
instead.
That declaration is not specifying that the
@dogs variable stores Pack objects. In fact,
it's not saying anything at all about what @dogs stores. Since its
storage type has been left unspecified, @dogs inherits the default
storage type — Any — which allows its elements to
store any kind of scalar value.
If we'd wanted to specify that @dogs was a normal array, but
that it can only store Dog objects, we'd need to write:
sub bark(@dogs of Dog) {...}
and if we'd wanted it to store Dogs but be
implemented by the Pack class, we'd have to write:
sub bark(@dogs is Pack of Dog) {...}
Appending is SomeType to a variable or parameter is the Perl 6
equivalent of Perl 5's tie mechanism, except that the tying is
part of the declaration. For example:
my $Elvis is King of Rock&Roll;
rather than a run-time function call like:
# Perl 5 code...
my $Elvis;
tie $Elvis, 'King', stores=>all('Rock','Roll');
In any case, the simple rule for of vs is is:
to say what a variable stores, use of; to say how the variable
itself works, use is.
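The same split between "what it stores" and "how it works" can be imitated in other languages. What follows is only a rough Python analogy, not Perl 6 semantics: the hypothetical Pack class stands in for the implementation class (is Pack), and its element_type argument stands in for the storage type (of Dog). Every name in the sketch is an illustrative assumption.

```python
class Dog:
    """Hypothetical element type: what the container stores ("of Dog")."""

    def __init__(self, name):
        self.name = name


class Pack:
    """Hypothetical container class: how the container works ("is Pack")."""

    def __init__(self, element_type=object):
        # element_type=object mirrors an unspecified storage type: anything goes.
        self.element_type = element_type
        self._members = []

    def push(self, item):
        # Enforce the storage type on every write.
        if not isinstance(item, self.element_type):
            raise TypeError(
                f"expected {self.element_type.__name__}, got {type(item).__name__}"
            )
        self._members.append(item)

    def __len__(self):
        return len(self._members)


untyped_pack = Pack()          # like: @dogs is Pack
untyped_pack.push("a stick")   # accepted: no storage type was given

dog_pack = Pack(Dog)           # like: @dogs is Pack of Dog
dog_pack.push(Dog("Rex"))      # accepted: it is a Dog
```

Pushing "a stick" onto dog_pack would raise a TypeError, because the storage type is checked separately from the class that implements the container.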
Many Happy Returns
Meanwhile, we're still attempting to create a version of
&part that returns a list of pairs. The easiest way to create
and return a suitable list of pairs is to flatten a hash in a list context.
This is precisely what the return statement does:
return *%herd;
using the splatty star to flatten the hash. In this case, though, we could simply have written:
return %herd;
since the declared return type (List of Pair) automatically
imposes list context (and hence list flattening) on any return
statement within &part.
Of course, it will only make sense to return a flattened hash if we've
already partitioned the original data into that hash. So the bodies of the
when and default statements inside
&part have to be changed accordingly. Now, instead of pushing
each element onto one of two separate arrays, we push each element onto one of
the two arrays stored inside %herd:
for @data {
    when $is_sheep { push %herd{"sheep"}, $_ }
    default        { push %herd{"goats"}, $_ }
}
It Lives!!!!!
Assuming that each of the hash entries (%herd{"sheep"} and
%herd{"goats"}) will be storing a reference to one of the two
arrays, we can simply push each data element onto the appropriate array.
In Perl 5 we'd have to dereference each of the array references inside our hash before we could push a new element onto it:
# Perl 5 code...
push @{$herd{"sheep"}}, $_;
But in Perl 6, the first parameter of push expects an array, so
if we give it an array reference, the interpreter can work out that it needs to
dereference that first argument. So we can just write:
# Perl 6 code...
push %herd{"sheep"}, $_;
(Remember that, in Perl 6, hashes keep their % sigil, even when
being indexed).
Initially, of course, the entries of %herd don't contain
references to arrays at all; like all uninitialized hash entries, they contain
undef. But, because push itself is defined like
so:
sub push (@array is rw, *@data) {...}
an actual read-writable array is expected as the first argument. If a
scalar variable containing undef is passed to such a parameter,
Perl 6 detects the fact and autovivifies the necessary array, placing a
reference to it into the previously undefined scalar argument. That behaviour
makes it trivially easy to create subroutines that autovivify read/write
arguments, in the same way that Perl 5's open does.
It's also possible to declare a read/write parameter that doesn't
autovivify in this way, by using the is ref trait instead of is
rw:
sub push_only_if_real_array (@array is ref, *@data) {...}
is ref still allows the parameter to be read from and written
to, but throws an exception if the corresponding argument isn't already a real
referent of some kind.
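The difference between the two behaviours can be modelled outside Perl 6. The sketch below is a Python approximation under stated assumptions: Python has no by-reference scalar slots, so each function takes the enclosing dictionary and a key rather than the slot itself, and both function bodies are hypothetical stand-ins rather than the real Perl 6 implementations.

```python
def push(container, key, *data):
    """Append data to container[key], creating the list if the entry is missing.

    Loosely mirrors the autovivifying 'is rw' behaviour described above.
    """
    container.setdefault(key, []).extend(data)


def push_only_if_real_array(container, key, *data):
    """Append data only when container[key] already holds a list.

    Loosely mirrors 'is ref': no autovivification, an exception instead.
    """
    target = container.get(key)
    if not isinstance(target, list):
        raise TypeError(f"entry {key!r} is not an existing list")
    target.extend(data)


herd = {}
push(herd, "sheep", "Dolly")                      # entry is created on demand
push_only_if_real_array(herd, "sheep", "Shaun")   # entry already exists
```

Calling push_only_if_real_array(herd, "goats", "Billy") at this point would raise a TypeError and leave herd without a "goats" entry.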
A Label by Any Other Name
Hard-wiring the labels of the two returned arrays seems a little inflexible, so we could add another, optional, parameter through which the caller can pass key names of their own:
sub part (Selector $is_sheep,
          Str ?@labels is dim(2) = <<sheep goats>>,
          *@data
         ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}
Optional parameters in Perl 6 are prefixed with a ? marker
(just as slurpy parameters are prefixed with *). Like required
parameters, optional parameters are passed positionally, so the above example
means that the second argument is expected to be an array of strings. This has
important consequences for backwards compatibility — as we'll see
shortly.
As well as declaring it to be optional (using a leading ?), we
also declare the @labels parameter to have exactly two elements,
by specifying the is dim(2) trait. The is dim trait
takes one or more integer values. The number of values it's given specifies the
number of dimensions the array has; the values themselves specify how many
elements long the array is in each dimension. For example, to create a
four-dimensional array of 7x24x60x60 elements, we'd declare it:
my @seconds is dim(7,24,60,60);
In the latest version of &part, the @labels is
dim(2) declaration means that @labels is a normal
one-dimensional array, but that it has only two elements in that one
dimension.
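The relationship between the values given to is dim and the resulting shape is simple arithmetic. As a quick check, here it is worked through in Python (used purely as a calculator; the variable names are assumptions of this sketch):

```python
import math

dims = (7, 24, 60, 60)             # days, hours, minutes, seconds

num_dimensions = len(dims)         # how many values were given: 4
total_elements = math.prod(dims)   # product of the extents: 604800

label_dims = (2,)                  # like: @labels is dim(2)
```

So @seconds has four dimensions and room for 604,800 elements (the number of seconds in a week), whereas a single value such as dim(2) describes one dimension holding two elements.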
The final component of the declaration of @labels is the
specification of its default value. Any optional parameter may be given a
default value, to which it will be bound if no corresponding argument is
provided. The default value can be any expression that yields a value
compatible with the type of the optional parameter.
In the above version of &part, for the sake of backwards
compatibility we make the optional @labels default to the list of
two strings <<sheep goats>> (using the new
Perl 6 list-of-strings syntax).
Thus if we provide an array of two strings explicitly, the two strings we
provide will be used as keys for the two pairs returned. If we don't specify
the labels ourselves, "sheep" and "goats" will be
used.
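To see the defaulting behaviour in action, here is a minimal Python sketch of the same logic. It is an approximation, not a translation: is_sheep is assumed to be a plain predicate function (a Perl 6 Selector accepts more than that), the data is passed as a single sequence placed before the labels, and the length check merely stands in for is dim(2).

```python
def part(is_sheep, data, labels=("sheep", "goats")):
    """Partition data into two labelled lists, returned as (label, list) pairs."""
    if len(labels) != 2:                       # stands in for 'is dim(2)'
        raise ValueError("exactly two labels are required")
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return list(herd.items())                  # flatten the dict into pairs


def is_even(n):
    return n % 2 == 0


default_parts = part(is_even, [1, 2, 3, 4])
custom_parts = part(is_even, [1, 2, 3, 4], labels=("even", "odd"))
```

With no labels supplied, the pairs come back keyed "sheep" and "goats"; with labels=("even", "odd") the same two lists come back under the caller's names.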
Name Your Poison
With the latest version of &part defined to return named
pairs, we can now write:
@parts = part Animal::Cat, <<cat chattel>>, @animals;
# returns: (cat=>[...], chattel=>[...])
# instead of: (sheep=>[...], goats=>[...])
The first argument (Animal::Cat) is bound to
&part's $is_sheep parameter (as before). The
second argument (<<cat chattel>>) is now bound to the
optional @labels parameter, leaving the @animals
argument to be flattened into a list and slurped up by the @data
parameter.
We could also pass some or all of the arguments as named arguments. A named argument is simply a Perl 6 pair, where the key is the name of the intended parameter, and the value is the actual argument to be bound to that parameter. That makes sense: every parameter we ever declare has to have a name, so there's no good reason why we shouldn't be allowed to pass it an argument using that name to single it out.
An important restriction on named arguments is that they cannot come before positional arguments, or after any arguments that are bound to a slurpy array. Otherwise, there would be no efficient, single-pass way of working out which unnamed arguments belong to which parameters. Apart from that one overarching restriction (which Larry likes to think of as a zoning law), we're free to pass named arguments in any order we like. That's a huge advantage in any subroutine that takes a large number of parameters, because it means we no longer have to remember their order, just their names.
For example, using named arguments we could rewrite the above
part call as any of the following:
# Use named argument to pass optional @labels argument...
@parts = part Animal::Cat, labels => <<cat chattel>>, @animals;
# Use named argument to pass both @labels and @data arguments...
@parts = part Animal::Cat, labels => <<cat chattel>>, data => @animals;
# The order in which named arguments are passed doesn't matter...
@parts = part Animal::Cat, data => @animals, labels => <<cat chattel>>;
# Can pass *all* arguments by name...
@parts = part is_sheep => Animal::Cat,
              labels   => <<cat chattel>>,
              data     => @animals;
# And the order still doesn't matter...
@parts = part data     => @animals,
              labels   => <<cat chattel>>,
              is_sheep => Animal::Cat;
# etc.
As long as we never put a named argument before a positional argument, or after any unnamed data for the slurpy array, the named arguments can appear in any convenient order. They can even be pulled out of a flattened hash:
@parts = part *%args;
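Other languages with named arguments impose zoning laws of their own. As a point of comparison only, here is a hedged Python sketch of the same calling styles; Python likewise requires positional arguments to precede keyword arguments. The part function below is a cut-down, hypothetical stand-in for &part (with data as an ordinary parameter rather than a slurpy one), and is_cat and the animal names are invented for the example.

```python
def part(is_sheep, labels=("sheep", "goats"), data=()):
    """Minimal stand-in for &part, used only to show argument passing."""
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return list(herd.items())


def is_cat(name):
    return name in ("tabby", "siamese")


animals = ["tabby", "ox", "siamese"]

# Positional argument first, then named arguments: allowed.
by_position = part(is_cat, labels=("cat", "chattel"), data=animals)

# Every argument passed by name, in an arbitrary order.
by_name = part(data=animals, labels=("cat", "chattel"), is_sheep=is_cat)

# Named arguments pulled out of a dictionary, much like: part *%args
args = {"is_sheep": is_cat, "labels": ("cat", "chattel"), "data": animals}
from_hash = part(**args)
```

All three calls bind the same arguments to the same parameters, so they return the same list of pairs.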

