
Exegesis 6
by Damian Conway

Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.

Refactoring Parameter Lists

By this stage, you might be justified in feeling that &part's parameter list is getting just a leeeeettle too sophisticated for its own good. Moreover, if we were using the multisub version, that complexity would have to be repeated in every variant.

Philosophically though, that's okay. The later versions of &part are doing some fairly sophisticated things, and the complexity required to achieve that has to go somewhere. Putting that extra complexity in the parameter list means that the body of &part stays much simpler, as do any calls to &part.

That's the whole point: Complexify locally to simplify globally. Or maybe: Complexify declaratively to simplify procedurally.

But there's precious little room for the consolations of philosophy when you're swamped in code and up to your assembler in allomorphism. So, rather than having to maintain those complex and repetitive parameter lists, we might prefer to factor out the common infrastructure. With, of course, yet another macro:

macro PART_PARAMS {
    my ($sheep,$goats) = request(2 default labels);
    return "Str +\@labels is dim(2) = <<$sheep $goats>>, *\@data";
}

multi sub part (Selector $is_sheep, PART_PARAMS) {
    # body as before
}

multi sub part (Int @is_sheep, PART_PARAMS) {
    # body as before
}

Here we create a macro named &PART_PARAMS that requests and extracts the default labels and then interpolates them into a string, which it returns. That string then replaces the original macro call.

Note that we reused the &request macro within the &PART_PARAMS macro. That's important, because it means that, as the body of &PART_PARAMS is itself being parsed, the default names are requested and interpolated into &PART_PARAMS's code. That ensures that the user-supplied default labels are hardwired into &PART_PARAMS even before it's compiled. So every subsequent call to PART_PARAMS will return the same default labels.

On the other hand, if we'd written &PART_PARAMS like this:

macro PART_PARAMS {
    print "Enter 2 default labels: ";
    my ($sheep,$goats) = split(/\s+/, <>, 3);
    return "Str +\@labels is dim(2) = <<$sheep $goats>>, *\@data";
}

then each time we used the &PART_PARAMS macro in our code, it would re-prompt for the labels. So we could give each variant of &part its own default labels. Either approach is fine, depending on the effect we want to achieve. It's really just a question of how much work we're willing to put in, in order to be Lazy.
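
The difference between the two macros is one of timing: whether the labels are obtained once, up front, or afresh at every use. Perl 5 has no macros, so the following is only a loose run-time analogy, with closures standing in for macros and a canned list standing in for the user's keyboard input. The names next_labels, $fixed, and $fresh are invented for this sketch.

```perl
use strict;
use warnings;

# Canned "user input" so the sketch runs without a terminal (an assumption;
# the macros in the text prompt interactively).
my @answers = ("sheep goats\n", "cats dogs\n", "wheat chaff\n");

# Hypothetical helper: consume one line of input, return its first two words.
sub next_labels {
    my $line = shift @answers;
    my ($first, $second) = split ' ', $line, 3;
    return ($first, $second);
}

# Like the first macro: the labels are fetched exactly once, and every
# later call returns the same parameter list.
my $fixed = do {
    my ($sheep, $goats) = next_labels();
    sub { return "Str +\@labels is dim(2) = <<$sheep $goats>>, *\@data" };
};

# Like the second macro: the labels are fetched again on every call.
my $fresh = sub {
    my ($sheep, $goats) = next_labels();
    return "Str +\@labels is dim(2) = <<$sheep $goats>>, *\@data";
};

my @fixed_results = ($fixed->(), $fixed->());
my @fresh_results = ($fresh->(), $fresh->());
print "$_\n" for @fixed_results, @fresh_results;
# prints:
#   Str +@labels is dim(2) = <<sheep goats>>, *@data
#   Str +@labels is dim(2) = <<sheep goats>>, *@data
#   Str +@labels is dim(2) = <<cats dogs>>, *@data
#   Str +@labels is dim(2) = <<wheat chaff>>, *@data
```

The first closure behaves like the &request-based macro (one answer, reused everywhere); the second behaves like the re-prompting version (a new answer per use).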

Smooth Operators

By now it's entirely possible that your head is spinning with the sheer number of ways Perl 6 lets us implement the &part subroutine. Each of those ways represents a different tradeoff in power, flexibility, and maintainability of the resulting code. It's important to remember that, however we choose to implement &part, it's always invoked in basically the same way:

%parts = part $selector, @data;

Sure, some of the above techniques let us modify the return labels, or control the use of named vs positional arguments. But with all of them, the call itself starts with the name of the subroutine, after which we specify the arguments.

Let's change that too!

Suppose we preferred to have a partitioning operator, rather than a subroutine. If we ignore those optional labels, and restrict our list to be an actual array, we can see that the core partitioning operation is binary ("apply this selector to that array").

If &part is to become an operator, we need it to be a binary operator. In Perl 6 we can make up completely new operators, so let's take our partitioning inspiration from Moses and call our new operator: ~|_|~

We'll assume that this "Red Sea" operator is to be used like this:

%parts = @animals ~|_|~ Animal::Cat;

The left operand is the array to be partitioned and the right operand is the selector. To implement it, we'd write:

multi sub infix:~|_|~ (@data, Selector $is_sheep)
    is looser(&infix:+)
    is assoc('non')
{
    return part $is_sheep, @data;
}

Operators are often overloaded with multiple variants (as we'll soon see), so we typically implement them as multisubs. However, it's also perfectly possible to implement them as regular subroutines, or even as macros.

To distinguish a binary operator from a regular multisub, we give it a special compound name, composed of the keyword infix: followed by the characters that make up the operator's symbol. These characters can be any sequence of non-whitespace Unicode characters (except left parenthesis, which can only appear if it's the first character of the symbol). So instead of ~|_|~ we could equally well have named our partitioning operator any of:

infix:¥
infix:¦
infix:^%#$!
infix:<->
infix:∇

The infix: keyword tells the compiler that the operator is placed between its operands (as binary operators always are). If we're declaring a unary operator, there are three other keywords that can be used instead: prefix:, postfix:, or circumfix:. For example:

sub prefix:±       (Num $n) is equiv(&infix:+)    { return +$n|-$n }

sub postfix:²      (Num $n) is tighter(&infix:**) { return $n**2 }

sub circumfix:⌊...⌋ (Num $n) { return POSIX::floor($n) }

# and later...

$error = ±⌊$x²⌋;

The is tighter, is looser, and is equiv traits tell the parser what the precedence of the new operator will be, relative to existing operators: namely, whether the operator binds more tightly than, less tightly than, or with the same precedence as the operator named in the trait. Every operator has to have a precedence and associativity, so every operator definition has to include one of these three traits.

The is assoc trait is only required on infix operators and specifies whether they chain to the left (like +), to the right (like =), or not at all (like ..). If the trait is not specified, the operator takes its associativity from the operator that's specified in the is tighter, is looser, or is equiv trait.
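
What "chaining" to the left or right means is easiest to see with operators whose associativity changes the answer. The following small Perl 5 illustration uses built-in operators only; subtraction stands in for + because the grouping of additions is not observable. It does not define any new operators.

```perl
use strict;
use warnings;

# Left-associative: a - b - c groups as (a - b) - c.
my $left = 10 - 4 - 3;      # (10 - 4) - 3 = 3, not 10 - (4 - 3) = 9

# Right-associative: a ** b ** c groups as a ** (b ** c).
my $right = 2 ** 3 ** 2;    # 2 ** 9 = 512, not 8 ** 2 = 64

# Assignment is right-associative too: $y = 5 is performed first.
my ($x, $y);
$x = $y = 5;

# Non-associative: the range operator cannot be chained at all, so
# something like 1 .. 3 .. 5 is rejected as a syntax error.

# Precedence is a separate question: * binds more tightly than +.
my $prec = 2 + 3 * 4;       # 2 + 12 = 14

print "$left $right $x $y $prec\n";   # prints "3 512 5 5 14"
```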

Arguments Both Ways

On the other hand, we might prefer that the selector come first (as it does in &part):

%parts = Animal::Cat ~|_|~ @animals;

in which case we could just add:

multi sub infix:~|_|~ (Selector $is_sheep, @data)
    is equiv( &infix:~|_|~(Array,Selector) )
{
    return part $is_sheep, @data;
}

so now we can specify the selector and the data in either order.

Because the two variants of the &infix:~|_|~ multisub have different parameter lists (one is (Array,Selector), the other is (Selector,Array)), Perl 6 always knows which one to call. If the left operand is a Selector, the &infix:~|_|~(Selector,Array) variant is called. If the left operand is an array, the &infix:~|_|~(Array,Selector) variant is invoked.

Note that, for this second variant, we specified is equiv instead of is tighter or is looser. This ensures that the precedence and associativity of the second variant are the same as those of the first. That's also why we didn't need to specify an is assoc.
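
Perl 5 cannot create a new operator symbol such as ~|_|~, but its overload pragma gives a rough feel for the either-order arrangement. In the sketch below, a hypothetical Herd class borrows the existing / operator, and the third argument that overload passes to the handler reports whether the operands arrived swapped, so the selector may appear on either side. The class name, the choice of /, the fixed sheep and goats labels, and the restriction to code-reference selectors are all assumptions made for brevity. The precedence and associativity remain those of the built-in /.

```perl
use strict;
use warnings;

package Herd;

# Borrow the built-in '/' as a stand-in for ~|_|~.
use overload '/' => \&partition;

sub new {
    my ($class, @data) = @_;
    return bless { data => [@data] }, $class;
}

# overload calls this as ($herd, $other_operand, $swapped) whichever side
# the Herd object was on, so one handler serves both argument orders.
# $swapped is ignored here because partitioning gives the same result
# either way.
sub partition {
    my ($self, $is_sheep, $swapped) = @_;
    my %parts = (sheep => [], goats => []);
    for my $datum (@{ $self->{data} }) {
        my $label = $is_sheep->($datum) ? 'sheep' : 'goats';
        push @{ $parts{$label} }, $datum;
    }
    return \%parts;
}

package main;

my $herd    = Herd->new(1 .. 6);
my $is_even = sub { $_[0] % 2 == 0 };

my $data_first     = $herd / $is_even;    # data on the left
my $selector_first = $is_even / $herd;    # selector on the left

print "@{ $data_first->{sheep} } | @{ $data_first->{goats} }\n";
# prints "2 4 6 | 1 3 5"
```

Where Perl 6 selects between two separately declared variants by their parameter types, this Perl 5 approximation funnels both orders into a single handler.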

Parting Is Such Sweet Sorrow

Phew. Talk about "more than one way to do it"!

But don't be put off by these myriad new features and alternatives. The vast majority of them are special-purpose, power-user techniques that you may well never need to use or even know about.

For most of us it will be enough to know that we can now add a proper parameter list, with sensibly named parameters, to any subroutine. What we used to write as:

sub feed {
    my ($who, $how_much, @what) = @_;
    ...
}

we now write as:

sub feed ($who, $how_much, *@what) {
    ...
}

or, when we're feeling particularly cautious:

sub feed (Str $who, Num $how_much, Food *@what) {
    ...
}

Just being able to do that is a huge win for Perl 6.
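
For comparison, Perl 5 itself later gained a similar (untyped) notation: subroutine signatures, available from perl 5.20 through the signatures feature and marked experimental in releases before 5.36. What follows is a minimal sketch, assuming such a perl. The body of feed and the arguments passed to it are invented for the example.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed before perl 5.36

# Named parameters, with a trailing array that slurps the remaining arguments.
sub feed ($who, $how_much, @what) {
    return "$who gets $how_much of: " . join(', ', @what);
}

print feed('Rex', 2, 'kibble', 'bones'), "\n";
# prints "Rex gets 2 of: kibble, bones"
```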

Parting Shot

By the way, here's (most of) that same partitioning functionality implemented in Perl 5:

# Perl 5 code...
sub part {
    my ($is_sheep, $maybe_flag_or_labels, $maybe_labels, @data) = @_;
    my ($sheep, $goats);
    if ($maybe_flag_or_labels eq "labels" && ref $maybe_labels eq 'ARRAY') { 
        ($sheep, $goats) = @$maybe_labels;
    }
    elsif (ref $maybe_flag_or_labels eq 'ARRAY') {
        unshift @data, $maybe_labels;
        ($sheep, $goats) = @$maybe_flag_or_labels;
    }
    else {
        unshift @data, $maybe_flag_or_labels, $maybe_labels;
        ($sheep, $goats) = qw(sheep goats);
    }
    my $arg1_type = ref($is_sheep) || 'CLASS';
    my %herd;
    if ($arg1_type eq 'ARRAY') {
        for my $index (0..$#data) {
            my $datum = $data[$index];
            my $label = grep({$index==$_} @$is_sheep) ? $sheep : $goats;
            push @{$herd{$label}}, $datum;
        }
    }
    else {
        croak "Invalid first argument to &part"
            unless $arg1_type =~ /^(Regexp|CODE|HASH|CLASS)$/;
        for (@data) {
            if (  $arg1_type eq 'Regexp' && /$is_sheep/
               || $arg1_type eq 'CODE'   && $is_sheep->($_)
               || $arg1_type eq 'HASH'   && $is_sheep->{$_}
               || UNIVERSAL::isa($_,$is_sheep)
               ) {
                push @{$herd{$sheep}}, $_;
            }
            else {
                push @{$herd{$goats}}, $_;
            }
        }
    }
    return map {bless {key=>$_,value=>$herd{$_}},'Pair'} keys %herd;
}

... which is precisely why we're developing Perl 6.