Exegesis 6
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.
Who Gets the Last Piece of Cake?
We're making progress. Whether we pass its arguments by name or
positionally, our call to part produces two partitions of the
original list. Those partitions now come back with convenient labels that we
can specify via the optional @labels parameter.
But now there's a problem. Even though we explicitly marked it as optional,
it turns out that things can go horribly wrong if we don't actually supply that
optional argument. Which is not very "optional". Worse, it means there's
potentially a problem with every single legacy call to part that
was coded before we added the optional parameter.
For example, consider the call:
@pets = ('Canis latrans', 'Felis sylvestris');
@parts = part /:i felis/, @pets;
# expected to return: (sheep=>['Felis sylvestris'], goats=>['Canis latrans'])
# actually returns: ('Canis latrans'=>[], 'Felis sylvestris'=>[])
What went wrong?
Well, when the call to part is matching its argument list
against &part's parameter list, it works left-to-right as
follows:
- The first parameter ($is_sheep) is declared as a scalar of type Selector, so the first argument must be a Code or a Class or a Hash or a Rule. It's actually a Rule, so the call mechanism binds that rule to $is_sheep.
- The second parameter (?@labels) is declared as an array of two strings, so the second argument must be an array of two strings. @pets is an array of two strings, so we bind that array to @labels. (Oops!)
- The third parameter (*@data) is declared as a slurpy array, so any remaining arguments should be flattened and bound to successive elements of @data. There are no remaining arguments, so there's nothing to flatten-and-bind, so @data remains empty.
That's the problem. If we pass the arguments positionally and there are not enough of them to bind to every parameter, the parameters at the start of the parameter list are bound before those towards the end. Even if those earlier parameters are marked optional. In other words, argument binding is "greedy" and (for obvious efficiency reasons) it never backtracks to see if there might be better ways to match arguments to parameters. Which means, in this case, that our data is being preemptively "stolen" by our labels.
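The same greedy, left-to-right behaviour turns up in other languages too. As a rough analogy only (this is Python, not Perl 6, and the part function below is a hypothetical stand-in for our &part, with a plain predicate in place of the rule), an optional positional parameter that sits in front of a slurpy one steals the data in just the same way:

```python
import re

def part(is_sheep, labels=("sheep", "goats"), *data):
    """Hypothetical stand-in for &part: split data into two labelled lists."""
    sheep_label, goats_label = labels
    result = {sheep_label: [], goats_label: []}
    for item in data:
        key = sheep_label if is_sheep(item) else goats_label
        result[key].append(item)
    return result

pets = ["Canis latrans", "Felis sylvestris"]

def is_felis(name):
    # Plays the role of the rule /:i felis/
    return re.search("felis", name, re.IGNORECASE) is not None

# Only two positional arguments: the second one is bound to the optional
# 'labels' parameter, so nothing is left over for the slurpy 'data'.
greedy = part(is_felis, pets)
print(greedy)
```

Here the two pet names end up as the labels, each with an empty partition, which mirrors the "actually returns" line in the example above.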
Pipeline to the Rescue!
So in general (and in the above example in particular) we need some way of indicating that a positional argument belongs to the slurpy data, not to some preceding optional parameter. One way to do that is to pass the ambiguous argument by name:
@parts = part /:i felis/, data=>@pets;
Then there can be no mistake about which argument belongs to what parameter.
But there's also a purely positional way to tell the call to
part that @pets belongs to the slurpy
@data, not to the optional @labels. We can pipeline
it directly there. After all, that's precisely what the pipeline operator does:
it binds the list on its blunt side to the slurpy array parameter of the call
on its sharp side. So we could just write:
@parts = part /:i felis/ <== @pets;
# returns: (sheep=>['Felis sylvestris'], goats=>['Canis latrans'])
Because @pets now appears on the blunt end of a pipeline,
there's no way it can be interpreted as anything other than the slurped data
for the call to part.
A Natural Assumption
Of course, as a solution to the problem of legacy code, this is highly
sub-optimal. It requires that every single pre-existing call to
part be modified (by having a pipeline inserted). That will
almost certainly be too painful.
Our new optional labels would be much more useful if their existence itself
were also optional — if we could somehow add a single statement to the
start of any legacy code file and thereby cause &part to work
like it used to in the good old days before labels. In other words, what we
really want is an impostor &part subroutine that pretends that
it only has the original two parameters ($is_sheep and
@data), but then, when it's called, surreptitiously supplies an
appropriate value for the new @labels parameter and quietly calls
the real &part.
In Perl 6, that's easy. All we need is a good curry.
We write the following at the start of the file:
use List::Part; # Supposing &part is defined in this module
my &part ::= &List::Part::part.assuming(labels => <<sheep goats>>);
That second line is a little imposing so let's break it down. First of all:
List::Part::part
is just the fully qualified name of the &part subroutine
that's defined in the List::Part module (which, for the purposes
of this example, is where we're saying &part lives). So:
&List::Part::part
is the actual Code object corresponding to the
&part subroutine. So:
&List::Part::part.assuming(...)
is a method call on that Code object. This is the tricky bit,
but it's no big deal really. If a Code object really is an object,
we certainly ought to be able to call methods on it. So:
&List::Part::part.assuming(labels => <<sheep goats>>)
calls the assuming method of the Code object
&part and passes the assuming method a named
argument whose name is labels and whose value is the list of
strings <<sheep goats>>.
Now, if we only knew what the .assuming method did...
That About Wraps it Up
What the .assuming(...) method does is place an anonymous
wrapper around an existing Code object and then return a reference
to (what appears to be) an entirely separate Code object. That new
Code object works exactly like the original — except that
the new one is missing one or more of the original's parameters.
Specifically, the parameter list of the wrapper subroutine doesn't have any
of the parameters that were named in the call to .assuming.
Instead those missing parameters are automatically filled in whenever the new
subroutine is called, using the values of those named arguments to
.assuming.
All of which simply means that the method call:
&List::Part::part.assuming(labels => <<sheep goats>>)
returns a reference to a new subroutine that acts like this:
sub ($is_sheep, *@data) {
    return part($is_sheep, labels=><<sheep goats>>, *@data)
}
That is, because we passed a
labels => <<sheep goats>>
argument to .assuming, we get back a subroutine without a
labels parameter, but which then just calls part and
inserts the value <<sheep goats>> for the
missing parameter.
Or, as the code itself suggests:
&List::Part::part.assuming(labels => <<sheep goats>>)
gives us what &List::Part::part would become under the
assumption that the value of @labels is always
<<sheep goats>>.
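For readers who think better with something runnable in front of them, here's a minimal Python sketch of the same idea. It is an analogy, not the Perl 6 mechanism: assuming_labels and the Python part are hypothetical names invented purely for illustration, and (unlike Perl 6) Python needs an explicit * to flatten a list into a slurpy parameter.

```python
import re

def part(is_sheep, labels=("sheep", "goats"), *data):
    """Hypothetical three-parameter stand-in for &List::Part::part."""
    sheep_label, goats_label = labels
    result = {sheep_label: [], goats_label: []}
    for item in data:
        key = sheep_label if is_sheep(item) else goats_label
        result[key].append(item)
    return result

def assuming_labels(func, labels):
    """Return a wrapper that has no 'labels' parameter of its own."""
    def wrapper(is_sheep, *data):
        # The missing parameter is filled in on every call.
        return func(is_sheep, labels, *data)
    return wrapper

def is_felis(name):
    return re.search("felis", name, re.IGNORECASE) is not None

pets = ["Canis latrans", "Felis sylvestris"]

# Two-parameter impostor: selector plus data, labels assumed.
legacy_part = assuming_labels(part, ("sheep", "goats"))
partitions = legacy_part(is_felis, *pets)
print(partitions)
```

Because the wrapper only accepts a selector and some data, there is no optional parameter left for the data to be bound to by mistake.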
How does that help with our source code backwards compatibility problem? It
completely solves it. All we have to do is to make Perl 6 use that carefully
wrapped, two-parameter version of &part in all our legacy
code, instead of the full three-parameter one. To do that, we merely create a
lexical subroutine of the same name and bind the wrapped version to that
lexical:
my &part ::= &List::Part::part.assuming(labels => <<sheep goats>>);
The my &part declares a lexical subroutine named
&part (in exactly the same way that a my $part
would declare a lexical variable named $part). The my
keyword says that it's lexical and the sigil says what kind of thing it is
(& for subroutine, in this case). Then we simply install the
wrapped version of &List::Part::part as the implementation of
the new lexical &part and we're done.
Just as lexical variables hide package or global variables of the same name,
so too a lexical subroutine hides any package or global subroutine of the same
name. So my &part hides the imported
&List::Part::part, and every subsequent call to
part(...) in the rest of the current scope calls the lexical
&part instead.
Because that lexical version is bound to a label-assuming wrapper, it
doesn't have a labels parameter, so none of the legacy calls to
&part are broken. Instead, the lexical &part
just silently "fills in" the labels parameter with the value we
originally gave to .assuming.
If we needed to add another partitioning call within the scope of that
lexical &part, but we wanted to use those sexy new non-default
labels, we could do so by calling the actual three-parameter
&part via its fully qualified name, like so:
@parts = List::Part::part(Animal::Cat, <<cat chattel>>, @animals);
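The hiding-and-qualifying trick can be sketched in Python as well, again strictly as an analogy. In the sketch below, list_part is a hypothetical stand-in for the List::Part module, is_cat is a made-up predicate standing in for the Animal::Cat selector, and a nested function plays the role of the lexical &part:

```python
from types import SimpleNamespace

def _full_part(is_sheep, labels=("sheep", "goats"), *data):
    """Hypothetical three-parameter partitioning routine."""
    sheep_label, goats_label = labels
    result = {sheep_label: [], goats_label: []}
    for item in data:
        key = sheep_label if is_sheep(item) else goats_label
        result[key].append(item)
    return result

# Hypothetical stand-in for the List::Part module.
list_part = SimpleNamespace(part=_full_part)

def is_cat(name):
    return name.startswith("Felis")

def legacy_scope(animals):
    # A local 'part' hides any outer 'part' for the rest of this scope...
    def part(is_sheep, *data):
        return list_part.part(is_sheep, ("sheep", "goats"), *data)

    old_style = part(is_cat, *animals)  # legacy two-parameter call

    # ...while the qualified name still reaches the three-parameter original.
    new_style = list_part.part(is_cat, ("cat", "chattel"), *animals)
    return old_style, new_style

old_style, new_style = legacy_scope(["Canis latrans", "Felis sylvestris"])
print(old_style)
print(new_style)
```

The unqualified call gets the default labels without asking for them, and the qualified call gets whatever labels it passes.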
Pair Bonding
One major advantage of having &part return a list of pairs
rather than a simple list of arrays is that now, instead of positional
binding:
# with original (list-of-arrays) version of &part...
(@cats, @chattels) := part Animal::Cat <== @animals;
we can do "named binding":
# with latest (list-of-pairs) version of &part...
(goats=>@chattels, sheep=>@cats) := part Animal::Cat <== @animals;
Named binding???
Well, we just learned that we can bind arguments to parameters by name, but
earlier we saw that parameter binding is merely an implicit form of explicit
:= binding. So the inevitable conclusion is that the only reason
we can bind parameters by name is because := supports named
binding.
And indeed it does. If a := finds a list of pairs on its
righthand side, and a list of simple variables on its lefthand side, it uses
named binding instead of positional binding. That is, instead of binding first
to first, second to second, etc., the := uses the key of each
righthand pair to determine the name of the variable on its left to which the
value of the pair should be bound.
That sounds complicated, but the effect is very easy to understand:
# Positional binding...
($who, $why) := ($because, "me");
# same as: $who := $because; $why := "me";
# Named binding...
($who, $why) := (why => $because, who => "me");
# same as: $who := "me"; $why := $because;
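A loose Python analogy may help make the difference concrete. Note the assumptions: Python has no := aliasing, so the sketch below merely assigns values (it does not create aliases), and the variable names and the sample string are invented for illustration.

```python
from operator import itemgetter

because = "because I said so"

# Positional: first to first, second to second.
who, why = (because, "me")

# "Named": the keys decide which value lands in which variable,
# no matter what order the pairs were written in.
pairs = {"why": because, "who": "me"}
named_who, named_why = itemgetter("who", "why")(pairs)

print(who, "|", why)
print(named_who, "|", named_why)
```

The positional form puts the explanation in who, while the keyed form puts "me" there, just as in the two Perl 6 bindings above.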
Even more usefully, if the binding operator detects a list of pairs on its left and another list of pairs on its right, it binds the value of the first pair on the right to the value of the identically named pair on the left (again, regardless of where the two pairs appear in their respective lists). Then it binds the value of the second pair on the right to the value of the identically named pair on the left, and so on.
That means we can set up a named := binding in which the names
of the bound variables don't even have to match the keys of the values being
bound to them:
# Explicitly named binding...
(who=>$name, why=>$reason) := (why => $because, who => "me");
# same as: $name := "me"; $reason := $because;
The most common use for that feature will probably be to create "free-standing" aliases for particular entries in a hash:
(who=>$name, why=>$reason) := *%explanation;
# same as: $name := %explanation{who}; $reason := %explanation{why};
or to convert particular hash entries into aliases for other variables:
*%details := (who=>"me", why=>$because);
# same as: %details{who} := "me", %details{why} := $because;
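Both directions have approximate Python counterparts, sketched below under the same caveat as before: Python copies references rather than creating aliases, so later rebinding name would not change the dictionary entry. The dictionaries and their contents are hypothetical sample data.

```python
from operator import itemgetter

because = "that's why"
explanation = {"who": "me", "why": because, "when": "now"}

# Roughly: (who=>$name, why=>$reason) := *%explanation;
# The variable names need not match the keys they are filled from.
name, reason = itemgetter("who", "why")(explanation)

# Roughly: *%details := (who=>"me", why=>$because);
# Only the named entries are touched; other entries are left alone.
details = {"where": "here"}
details.update(who="me", why=because)

print(name, "|", reason)
print(details)
```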

