Exegesis 6
by Damian Conway

Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.

TMTOWTDeclareI

While we're talking about type declarations, it's worth noting that we could also have put &part's new return type out in front (just as we've been doing with variable and parameter types). However, this is only allowed for subroutines when the subroutine is explicitly scoped:

# lexical subroutine
my List of Pair sub part (Selector $is_sheep, *@data) {...}

or:

# package subroutine
our List of Pair sub part (Selector $is_sheep, *@data) {...}

The return type goes between the scoping keyword (my or our) and the sub keyword. And, of course, the returns keyword is not used.

Contrariwise, we can also put variable/parameter type information after the variable name. To do that, we use the of keyword:

my sub part ($is_sheep of Selector, *@data) returns List of Pair {...}

This makes sense, when you think about it. As we saw above, of tells the preceding container what type of value it's supposed to store, so $is_sheep of Selector tells $is_sheep it's supposed to store a Selector.

You Are What You Eat -- Not!

Careful though: we have to remember to use of there, not is. It would be a mistake to write:

my sub part ($is_sheep is Selector, *@data) returns List of Pair {...}

That's because Perl 6 variables and parameters can be more precisely typed than variables in most other languages. Specifically, Perl 6 allows us to specify both the storage type of a variable (i.e. what kinds of values it can contain) and the implementation class of the variable (i.e. how the variable itself is actually implemented).

The is keyword indicates what a particular container (variable, parameter, etc.) is — namely, how it's implemented and how it operates. Saying:

sub bark(@dogs is Pack) {...}

specifies that, although the @dogs parameter looks like an Array, it's actually implemented by the Pack class instead.

That declaration is not specifying that the @dogs variable stores Pack objects. In fact, it's not saying anything at all about what @dogs stores. Since its storage type has been left unspecified, @dogs inherits the default storage type — Any — which allows its elements to store any kind of scalar value.

If we'd wanted to specify that @dogs was a normal array, but that it can only store Dog objects, we'd need to write:

sub bark(@dogs of Dog) {...}

and if we'd wanted it to store Dogs but be implemented by the Pack class, we'd have to write:

sub bark(@dogs is Pack of Dog) {...}

Appending is SomeType to a variable or parameter is the Perl 6 equivalent of Perl 5's tie mechanism, except that the tying is part of the declaration. For example:

my $Elvis is King of Rock&Roll;

rather than a run-time function call like:

# Perl 5 code...
my $Elvis;
tie $Elvis, 'King', stores=>all('Rock','Roll');

In any case, the simple rule for of vs is is: to say what a variable stores, use of; to say how the variable itself works, use is.

Many Happy Returns

Meanwhile, we're still attempting to create a version of &part that returns a list of pairs. The easiest way to create and return a suitable list of pairs is to flatten a hash in a list context. This is precisely what the return statement does:

return *%herd;

using the splatty star. Although, in this case, we could have simply written:

return %herd;

since the declared return type (List of Pair) automatically imposes list context (and hence list flattening) on any return statement within &part.

Of course, it will only make sense to return a flattened hash if we've already partitioned the original data into that hash. So the bodies of the when and default statements inside &part have to be changed accordingly. Now, instead of pushing each element onto one of two separate arrays, we push each element onto one of the two arrays stored inside %herd:

for @data {
    when $is_sheep { push %herd{"sheep"}, $_ }
    default        { push %herd{"goats"}, $_ }
}

It Lives!!!!!

Assuming that each of the hash entries (%herd{"sheep"} and %herd{"goats"}) will be storing a reference to one of the two arrays, we can simply push each data element onto the appropriate array.

In Perl 5 we'd have to dereference each of the array references inside our hash before we could push a new element onto it:

# Perl 5 code...
push @{$herd{"sheep"}}, $_;

But in Perl 6, the first parameter of push expects an array, so if we give it an array reference, the interpreter can work out that it needs to dereference that first argument. So we can just write:

# Perl 6 code...
push %herd{"sheep"}, $_;

(Remember that, in Perl 6, hashes keep their % sigil, even when being indexed.)

Initially, of course, the entries of %herd don't contain references to arrays at all; like all uninitialized hash entries, they contain undef. But, because push itself is defined like so:

sub push (@array is rw, *@data) {...}

an actual read-writable array is expected as the first argument. If a scalar variable containing undef is passed to such a parameter, Perl 6 detects the fact and autovivifies the necessary array, placing a reference to it into the previously undefined scalar argument. That behaviour makes it trivially easy to create subroutines that autovivify read/write arguments, in the same way that Perl 5's open does.
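For readers who want something they can run today, the autovivifying push can be loosely imitated in Python. This is only an analogy, not part of the Perl 6 design: the function name part and its parameters merely mirror the running example, and is_sheep is assumed to be a plain callable rather than a Selector. A collections.defaultdict(list) creates the missing array the first time a key is used, much as push does for a previously undefined hash entry:

```python
from collections import defaultdict

def part(is_sheep, *data):
    """Partition data into 'sheep' and 'goats' lists (illustrative sketch)."""
    herd = defaultdict(list)    # a missing entry springs into existence as []
    for item in data:
        key = "sheep" if is_sheep(item) else "goats"
        herd[key].append(item)  # no explicit initialisation of herd[key] needed
    return list(herd.items())   # a list of (key, list) pairs

def is_even(n):
    return n % 2 == 0

pairs = part(is_even, 1, 2, 3, 4)
```

Here neither "sheep" nor "goats" exists in herd until the first element is appended under that key.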

It's also possible to declare a read/write parameter that doesn't autovivify in this way, by using the is ref trait instead of is rw:

sub push_only_if_real_array (@array is ref, *@data) {...}

The is ref trait still allows the parameter to be read from and written to, but throws an exception if the corresponding argument isn't already a real referent of some kind.

A Label by Any Other Name

Mandating fixed labels for the two arrays being returned seems a little inflexible, so we could add another — optional — parameter via which user-selected key names could be passed...

sub part (Selector $is_sheep,
          Str ?@labels is dim(2) = <<sheep goats>>,
          *@data
         ) returns List of Pair
{
    my ($sheep, $goats) is constant = @labels;
    my %herd = ($sheep=>[], $goats=>[]);
    for @data {
        when $is_sheep { push %herd{$sheep}, $_ }
        default        { push %herd{$goats}, $_ }
    }
    return *%herd;
}

Optional parameters in Perl 6 are prefixed with a ? marker (just as slurpy parameters are prefixed with *). Like required parameters, optional parameters are passed positionally, so the above example means that the second argument is expected to be an array of strings. This has important consequences for backwards compatibility — as we'll see shortly.

As well as declaring it to be optional (using a leading ?), we also declare the @labels parameter to have exactly two elements, by specifying the is dim(2) trait. The is dim trait takes one or more integer values. The number of values it's given specifies the number of dimensions the array has; the values themselves specify how many elements long the array is in each dimension. For example, to create a four-dimensional array of 7x24x60x60 elements, we'd declare it:

my @seconds is dim(7,24,60,60);

In the latest version of &part, the @labels is dim(2) declaration means that @labels is a normal one-dimensional array, but that it has only two elements in that one dimension.
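As a quick check on the arithmetic, the values passed to is dim fix both the number of dimensions and the total number of elements. The following Python sketch uses Python purely as a calculator (the variable names are illustrative and not part of any Perl 6 API) to work both out for the two declarations above:

```python
import math

seconds_shape = (7, 24, 60, 60)   # mirrors: is dim(7,24,60,60)
labels_shape = (2,)               # mirrors: is dim(2)

seconds_dims = len(seconds_shape)          # how many values were given
seconds_total = math.prod(seconds_shape)   # product of the lengths
labels_dims = len(labels_shape)
labels_total = math.prod(labels_shape)

print(seconds_dims, seconds_total)   # 4 604800
print(labels_dims, labels_total)     # 1 2
```

So @seconds has four dimensions and 604800 elements in total, while @labels has a single dimension holding just two.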

The final component of the declaration of @labels is the specification of its default value. Any optional parameter may be given a default value, to which it will be bound if no corresponding argument is provided. The default value can be any expression that yields a value compatible with the type of the optional parameter.

In the above version of &part, for the sake of backwards compatibility, we make the optional @labels default to the list of two strings <<sheep goats>> (using the new Perl 6 list-of-strings syntax).

Thus if we provide an array of two strings explicitly, the two strings we provide will be used as keys for the two pairs returned. If we don't specify the labels ourselves, "sheep" and "goats" will be used.
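The same shape of signature (a required parameter, then an optional one with a default, then a slurpy one) can be sketched in Python. This is a hedged analogy only: is_sheep is assumed to be a plain callable, and the explicit length check merely stands in for is dim(2), which Python has no direct equivalent of.

```python
def part(is_sheep, labels=("sheep", "goats"), *data):
    """Illustrative sketch: labels is optional but still bound positionally."""
    if len(labels) != 2:                 # stands in for: is dim(2)
        raise ValueError("labels must have exactly two elements")
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return list(herd.items())

def is_even(n):
    return n % 2 == 0

default_labels = part(is_even)                          # falls back to the default
custom_labels = part(is_even, ("even", "odd"), 1, 2, 3) # labels given explicitly
```

As in the Perl 6 version, the second positional argument is always taken to be the labels, so existing calls that pass data in that position would no longer mean what they used to.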

Name Your Poison

With the latest version of &part defined to return named pairs, we can now write:

@parts = part Animal::Cat, <<cat chattel>>, @animals;
#    returns: (cat=>[...], chattel=>[...])
# instead of: (sheep=>[...], goats=>[...])

The first argument (Animal::Cat) is bound to &part's $is_sheep parameter (as before). The second argument (<<cat chattel>>) is now bound to the optional @labels parameter, leaving the @animals argument to be flattened into a list and slurped up by the @data parameter.

We could also pass some or all of the arguments as named arguments. A named argument is simply a Perl 6 pair, where the key is the name of the intended parameter, and the value is the actual argument to be bound to that parameter. That makes sense: every parameter we ever declare has to have a name, so there's no good reason why we shouldn't be allowed to pass it an argument using that name to single it out.

An important restriction on named arguments is that they cannot come before positional arguments, or after any arguments that are bound to a slurpy array. Otherwise, there would be no efficient, single-pass way of working out which unnamed arguments belong to which parameters. Apart from that one overarching restriction (which Larry likes to think of as a zoning law), we're free to pass named arguments in any order we like. That's a huge advantage in any subroutine that takes a large number of parameters, because it means we no longer have to remember their order, just their names.

For example, using named arguments we could rewrite the above part call as any of the following:

# Use named argument to pass optional @labels argument...
@parts = part Animal::Cat, labels => <<cat chattel>>, @animals;

# Use named argument to pass both @labels and @data arguments...
@parts = part Animal::Cat, labels => <<cat chattel>>, data => @animals;

# The order in which named arguments are passed doesn't matter...
@parts = part Animal::Cat, data => @animals, labels => <<cat chattel>>;

# Can pass *all* arguments by name...
@parts = part is_sheep => Animal::Cat,
                labels => <<cat chattel>>,
                  data => @animals;

# And the order still doesn't matter...
@parts = part data => @animals,
              labels => <<cat chattel>>,
              is_sheep => Animal::Cat;

# etc.

As long as we never put a named argument before a positional argument, or after any unnamed data for the slurpy array, the named arguments can appear in any convenient order. They can even be pulled out of a flattened hash:

@parts = part *%args;
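Python's keyword arguments follow much the same zoning law (a keyword argument may not precede a positional one), so the calls above can be approximated there. The sketch below is an analogy under stated assumptions: data is declared as an ordinary parameter rather than a slurpy one, because Python cannot pass a *args parameter by name, and the animal names are invented sample data.

```python
def part(is_sheep, labels=("sheep", "goats"), data=()):
    """Illustrative sketch: every parameter can be passed by name."""
    sheep, goats = labels
    herd = {sheep: [], goats: []}
    for item in data:
        herd[sheep if is_sheep(item) else goats].append(item)
    return list(herd.items())

def is_cat(animal):
    return animal in {"tabby", "siamese"}   # hypothetical selector

animals = ["tabby", "cow", "siamese"]

# Positional first, then named arguments in any order...
a = part(is_cat, data=animals, labels=("cat", "chattel"))

# All arguments by name, in yet another order...
b = part(data=animals, labels=("cat", "chattel"), is_sheep=is_cat)

# Named arguments pulled out of a dictionary, like part *%args...
args = {"is_sheep": is_cat, "labels": ("cat", "chattel"), "data": animals}
c = part(**args)
```

All three calls bind the same values to the same parameters, so they return the same list of pairs.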
