Exegesis 6
by Damian Conway

Editor's note: this document is out of date and remains here for historic interest. See Synopsis 6 for the current design information.

Who Gets the Last Piece of Cake?

We're making progress. Whether we pass its arguments by name or positionally, our call to part produces two partitions of the original list. Those partitions now come back with convenient labels that we can specify via the optional @labels parameter.

But now there's a problem. Even though we explicitly marked it as optional, it turns out that things can go horribly wrong if we don't actually supply that optional argument. Which is not very "optional". Worse, it means there's potentially a problem with every single legacy call to part that was coded before we added the optional parameter.

For example, consider the call:

@pets = ('Canis latrans', 'Felis sylvestris');

@parts = part /:i felis/, @pets;

# expected to return: (sheep=>['Felis sylvestris'], goats=>['Canis latrans'] )
# actually returns:   ('Canis latrans'=>[], 'Felis sylvestris'=>[])

What went wrong?

Well, when the call to part is matching its argument list against &part's parameter list, it works left-to-right as follows:

  1. The first parameter ($is_sheep) is declared as a scalar of type Selector, so the first argument must be a Code or a Class or a Hash or a Rule. It's actually a Rule, so the call mechanism binds that rule to $is_sheep.
  2. The second parameter (?@labels) is declared as an array of two strings, so the second argument must be an array of two strings. @pets is an array of two strings, so we bind that array to @labels. (Oops!)
  3. The third parameter (*@data) is declared as a slurpy array, so any remaining arguments should be flattened and bound to successive elements of @data. There are no remaining arguments, so there's nothing to flatten-and-bind, so @data remains empty.

That's the problem. If we pass the arguments positionally and there are not enough of them to bind to every parameter, the parameters at the start of the parameter list are bound before those towards the end. Even if those earlier parameters are marked optional. In other words, argument binding is "greedy" and (for obvious efficiency reasons) it never backtracks to see if there might be better ways to match arguments to parameters. Which means, in this case, that our data is being preemptively "stolen" by our labels.
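The same greedy, left-to-right binding can be reproduced outside Perl 6. The sketch below is only an analogy in Python, not the article's code: the `part` function, its default labels, and the `is_felis` predicate are re-created here as assumptions for illustration. Unlike a Perl 6 slurpy array, Python's `*data` does not flatten a list automatically, so the "intended" call has to unpack it with `*pets`.

```python
def part(is_sheep, labels=("sheep", "goats"), *data):
    """Partition data into two labelled lists according to a predicate."""
    sheep_label, goats_label = labels
    result = {sheep_label: [], goats_label: []}
    for item in data:
        key = sheep_label if is_sheep(item) else goats_label
        result[key].append(item)
    return result


pets = ["Canis latrans", "Felis sylvestris"]
is_felis = lambda name: "felis" in name.lower()

# Positional binding is greedy: the list lands in the optional `labels`
# parameter, so `data` stays empty and the pets become the labels.
stolen = part(is_felis, pets)

# Filling the optional slot explicitly leaves the pets free to be the data.
intended = part(is_felis, ("sheep", "goats"), *pets)
```

Here `stolen` has the two species names as keys with empty lists as values, which mirrors the surprising result above, while `intended` holds the partition that the caller actually wanted.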

Pipeline to the Rescue!

So in general (and in the above example in particular) we need some way of indicating that a positional argument belongs to the slurpy data, not to some preceding optional parameter. One way to do that is to pass the ambiguous argument by name:

@parts = part /:i felis/, data=>@pets;

Then there can be no mistake about which argument belongs to what parameter.

But there's also a purely positional way to tell the call to part that @pets belongs to the slurpy @data, not to the optional @labels. We can pipeline it directly there. After all, that's precisely what the pipeline operator does: it binds the list on its blunt side to the slurpy array parameter of the call on its sharp side. So we could just write:

@parts = part /:i felis/ <== @pets;

# returns: (sheep=>['Felis sylvestris'], goats=>['Canis latrans'])

Because @pets now appears on the blunt end of a pipeline, there's no way it can be interpreted as anything other than the slurped data for the call to part.

A Natural Assumption

Of course, as a solution to the problem of legacy code, this is highly sub-optimal. It requires that every single pre-existing call to part be modified (by having a pipeline inserted). That will almost certainly be too painful.

Our new optional labels would be much more useful if their existence itself were also optional: if we could somehow add a single statement to the start of any legacy code file and thereby cause &part to work like it used to in the good old days before labels. In other words, what we really want is an impostor &part subroutine that pretends that it only has the original two parameters ($is_sheep and @data), but then when it's called surreptitiously supplies an appropriate value for the new @labels parameter and quietly calls the real &part.

In Perl 6, that's easy. All we need is a good curry.

We write the following at the start of the file:

use List::Part;   # Supposing &part is defined in this module

my &part ::= &List::Part::part.assuming(labels => <<sheep goats>>);

That second line is a little imposing so let's break it down. First of all:

List::Part::part

is just the fully qualified name of the &part subroutine that's defined in the List::Part module (which, for the purposes of this example, is where we're saying &part lives). So:

&List::Part::part

is the actual Code object corresponding to the &part subroutine. So:

&List::Part::part.assuming(...)

is a method call on that Code object. This is the tricky bit, but it's no big deal really. If a Code object really is an object, we certainly ought to be able to call methods on it. So:

&List::Part::part.assuming(labels => <<sheep goats>>)

calls the assuming method of the Code object &part and passes the assuming method a named argument whose name is labels and whose value is the list of strings <<sheep goats>>.

Now, if we only knew what the .assuming method did...

That About Wraps it Up

What the .assuming(...) method does is place an anonymous wrapper around an existing Code object and then return a reference to (what appears to be) an entirely separate Code object. That new Code object works exactly like the original — except that the new one is missing one or more of the original's parameters.

Specifically, the parameter list of the wrapper subroutine doesn't have any of the parameters that were named in the call to .assuming. Instead those missing parameters are automatically filled in whenever the new subroutine is called, using the values of those named arguments to .assuming.

All of which simply means that the method call:

&List::Part::part.assuming(labels => <<sheep goats>>)

returns a reference to a new subroutine that acts like this:

sub ($is_sheep, *@data) {
    return part($is_sheep, labels=><<sheep goats>>, *@data)
}

That is, because we passed a labels => <<sheep goats>> argument to .assuming, we get back a subroutine without a labels parameter, but which then just calls part and inserts the value <<sheep goats>> for the missing parameter.

Or, as the code itself suggests:

&List::Part::part.assuming(labels => <<sheep goats>>)

gives us what &List::Part::part would become under the assumption that the value of @labels is always <<sheep goats>>.

How does that help with our source code backwards compatibility problem? It completely solves it. All we have to do is to make Perl 6 use that carefully wrapped, two-parameter version of &part in all our legacy code, instead of the full three-parameter one. To do that, we merely create a lexical subroutine of the same name and bind the wrapped version to that lexical:

my &part ::= &List::Part::part.assuming(labels => <<sheep goats>>);

The my &part declares a lexical subroutine named &part (in exactly the same way that a my $part would declare a lexical variable named $part). The my keyword says that it's lexical and the sigil says what kind of thing it is (& for subroutine, in this case). Then we simply install the wrapped version of &List::Part::part as the implementation of the new lexical &part and we're done.

Just as lexical variables hide package or global variables of the same name, so too a lexical subroutine hides any package or global subroutine of the same name. So my &part hides the imported &List::Part::part, and every subsequent call to part(...) in the rest of the current scope calls the lexical &part instead.

Because that lexical version is bound to a label-assuming wrapper, it doesn't have a labels parameter, so none of the legacy calls to &part are broken. Instead, the lexical &part just silently "fills in" the labels parameter with the value we originally gave to .assuming.

If we needed to add another partitioning call within the scope of that lexical &part, but we wanted to use those sexy new non-default labels, we could do so by calling the actual three-parameter &part via its fully qualified name, like so:

@parts = List::Part::part(Animal::Cat, <<cat chattel>>, @animals);
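The wrapping-and-shadowing idea can be sketched in Python as a rough analogy. Everything named here is an assumption for illustration: `assuming_labels` is a hypothetical helper standing in for `.assuming`, `full_part` plays the role of the fully qualified `List::Part::part`, and rebinding the name `part` stands in for the lexical `my &part`.

```python
def part(is_sheep, labels=("sheep", "goats"), *data):
    """Three-parameter version: partition data under the given labels."""
    sheep_label, goats_label = labels
    result = {sheep_label: [], goats_label: []}
    for item in data:
        key = sheep_label if is_sheep(item) else goats_label
        result[key].append(item)
    return result


def assuming_labels(func, labels):
    """Return a wrapper around func whose labels parameter is pre-filled."""
    def wrapper(is_sheep, *data):
        # The wrapper has no labels parameter; it supplies the assumed value.
        return func(is_sheep, labels, *data)
    return wrapper


full_part = part                                        # handle on the original
part = assuming_labels(full_part, ("sheep", "goats"))   # shadow the name

pets = ["Canis latrans", "Felis sylvestris"]
is_felis = lambda name: "felis" in name.lower()

legacy = part(is_felis, *pets)                               # two-parameter call
relabelled = full_part(is_felis, ("cat", "chattel"), *pets)  # explicit labels
```

Legacy-style calls go through the wrapper and can no longer have their data mistaken for labels, while new code that wants custom labels reaches the original through `full_part`.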

Pair Bonding

One major advantage of having &part return a list of pairs rather than a simple list of arrays is that now, instead of positional binding:

# with original (list-of-arrays) version of &part...
(@cats, @chattels) := part Animal::Cat <== @animals;

we can do "named binding":

# with latest (list-of-pairs) version of &part...
(goats=>@chattels, sheep=>@cats) := part Animal::Cat <== @animals;

Named binding???

Well, we just learned that we can bind arguments to parameters by name, but earlier we saw that parameter binding is merely an implicit form of explicit := binding. So the inevitable conclusion is that the only reason we can bind parameters by name is because := supports named binding.

And indeed it does. If a := finds a list of pairs on its righthand side, and a list of simple variables on its lefthand side, it uses named binding instead of positional binding. That is, instead of binding first to first, second to second, etc., the := uses the key of each righthand pair to determine the name of the variable on its left to which the value of the pair should be bound.

That sounds complicated, but the effect is very easy to understand:

# Positional binding...
($who, $why) := ($because, "me");
# same as: $who := $because; $why := "me";

# Named binding...
($who, $why) := (why => $because, who => "me");
# same as: $who := "me"; $why := $because;

Even more usefully, if the binding operator detects a list of pairs on its left and another list of pairs on its right, it binds the value of the first pair on the right to the value of the identically named pair on the left (again, regardless of where the two pairs appear in their respective lists). Then it binds the value of the second pair on the right to the value of the identically named pair on the left, and so on.

That means we can set up a named := binding in which the names of the bound variables don't even have to match the keys of the values being bound to them:

# Explicitly named binding...
(who=>$name, why=>$reason) := (why => $because, who => "me");
# same as: $name := "me"; $reason := $because;

The most common use for that feature will probably be to create "free-standing" aliases for particular entries in a hash:

(who=>$name, why=>$reason) := *%explanation;
# same as: $name := %explanation{who}; $reason := %explanation{why};

or to convert particular hash entries into aliases for other variables:

*%details := (who=>"me", why=>$because);
# same as: %details{who} := "me"; %details{why} := $because;
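The name-matching rule behind named binding can be modelled in a few lines of Python. This is only a sketch under stated assumptions: `bind_named` is a hypothetical helper, the sample string is invented, and the code copies values into a dictionary rather than creating true aliases the way := does.

```python
def bind_named(targets, pairs):
    """Match each target to the pair with the same key, ignoring order.

    targets maps a pair key to the name of the variable it should populate.
    """
    values = dict(pairs)
    return {variable: values[key] for key, variable in targets.items()}


because = "the cake was there"
pairs = [("why", because), ("who", "me")]

# Implicit names: each variable is matched to the pair that shares its name.
implicit = bind_named({"who": "who", "why": "why"}, pairs)

# Explicit names: the key selects the pair, the value names the variable.
explicit = bind_named({"who": "name", "why": "reason"}, pairs)
```

In both calls the order of the pairs on the right is irrelevant; only the keys decide which value ends up under which name.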
