Apocalypse 6
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.
Appendix A: Rationale for pipe operators
As we pointed out in the text, the named form of passing a list has
the disadvantage that you have to know what the formal parameter's
name is. We could get around that by saying that a null name maps
to the slurp array. In other words, we could define a =>
unary operator that creates a null key:
stuff(@foo, =>(1,2,3))
We can at least lose the outer parens in this case:
stuff @foo, =>(1,2,3)
But darn it, we can't get rid of those pesky inner parens because of
the precedence of => with respect to comma. So perhaps it's
time for a new operator with looser precedence than comma:
stuff @foo *: 1,2,3 # * to match * zone marker
stuff @foo +* 1,2,3 # put the * on the list side
stuff @foo *=> 1,2,3 # or combine with => above
stuff @foo ==> 1,2,3 # maybe just lengthen =>
stuff @foo <== 1,2,3 # except the dataflow is to the left
stuff @foo with 1,2,3 # could use a word
Whichever one we pick, it'd still probably want to construct a special pair internally, because we have to be able to use it indirectly:
@args = (\@foo, '*@' => (1,2,3));
stuff *@args;
But if we're going to have a special operator to switch explicitly to the list part, it really needs to earn its keep, and do more work. A special operator could also force scalar context on the left and list context on the right. So with implied scalar context we could omit the backslash above:
@args = (@foo with 1,2,3);
stuff *@args;
That's all well and good, and some language designers would stop right there, if not sooner. But if we think about this in relation to cascaded list operators, we'll see a different pattern emerging. Here's a left-to-right variant on the Schwartzian Transform:
my @x := map {...} @input;
my @y := sort {...} with @x;
my @z := map {...} with @y;
When we think of data flowing left-to-right, it's more like a pipe
operator from a shell, except that we're naming our pipes @x
and @y. But it'd be nice not to have to name the temporary
array values. If we do have a pipe operator in Perl, it's not going
to be |, for two reasons. First, | is taken for junctions.
Second, piping is a big, low-precedence operation, and I want a big
fat operator that will show up to the eye. Of our candidate list
above, I think the big, fat arrows really stand out, and look like
directed pipes. So assuming we have the ==> operator to go
with the <==, we could write our ST like this:
@input ==>
map {...} ==>
sort {...} ==>
map {...} ==>
push my @z;
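As an illustration only, the left-to-right flow of that pipeline can be modeled in Python rather than Perl 6. This is a minimal sketch, not the proposed operator: the helper name `pipe`, the sample words, and the choice of string length as the sort key are all assumptions made for the example.

```python
from functools import reduce

def pipe(source, *stages):
    """Hypothetical helper: feed `source` through each stage, left to right."""
    return reduce(lambda data, stage: stage(data), stages, source)

# Sample input, invented for this sketch.
words = ["pear", "fig", "banana", "kiwi"]

z = pipe(
    words,
    lambda xs: [(len(w), w) for w in xs],           # map: decorate each item with its key
    lambda xs: sorted(xs, key=lambda kv: kv[0]),    # sort: compare the precomputed keys only
    lambda xs: [w for _, w in xs],                  # map: strip the keys off again
)

print(z)  # ['fig', 'pear', 'kiwi', 'banana']
```

Each stage receives the whole output of the previous one, so no intermediate result needs a name, which is the point the arrows are making.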
That argues that the scalar-to-list transition operator should be <==:
my @x := map {...} @input;
my @y := sort {...} <== @x;
my @z := map {...} <== @y;
And that means this should maybe dwim:
@args = (@foo <== 1,2,3);
stuff *@args;
Hmm.
That does imply that <== is (at least in this case) a data
composition operator, unlike the ==> operator which merely sends
the output of one function to the next. Maybe that's not a problem.
But people might see:
@x <== 1,2,3
and expect it to do assignment when in fact it doesn't. Internally it would really do something more like appending a named argument:
@x, '*@' => (1,2,3)
or however we decide to mark the beginning of the "real" list within a larger list.
But I do rather like the looks of:
push @foo <== 1,2,3;
not to mention the symmetrical:
1,2,3 ==>
push @foo;
Note however that the pointy end of ==> must be bound to a
function that takes a list. You can't say:
1,2,3 ==>
my @foo;
because you can't say:
my @foo <== 1,2,3;
Or rather, you can, if we allow:
(@foo <== 1,2,3)
but it would mean the Wrong Thing. Ouch. So maybe that should not be legal. The asymmetry was bugging me anyway.
So let's say that <== and ==> must always be bound on
their pointy end to a slurpy function, and if you want to build an
indirect argument list, you have to use some kind of explicit list
function such as args:
@args = args @foo <== 1,2,3;
stuff *@args;
The args function would really be a no-op, much like other context
enforcers such as scalar and list. In fact, I'd be tempted to
just use list like this:
@args = list @foo <== 1,2,3;
But unless we can get people to see <== as a strange kind of
comma, that will likely be misread as:
@args = list(@foo) <== 1,2,3;
when it's really this:
@args = list(@foo <== 1,2,3);
On the other hand, using list would cut out the need for yet another
built-in, for which there is much to be said... I'd say, let's go
with list on the assumption that people will learn to read <==
as a pipe comma. If someone wants to use args for clarity,
they can always just alias list:
my &args ::= &*list;
More likely, they'll just use the parenthesized form:
@args = list(@foo <== 1,2,3);
I suppose there could also be a prefix unary form, in case they want to use it without scalar arguments:
@args = list(<== 1,2,3);
or in case they want to put a comma after the scalar arguments:
@args = list(@foo, <== 1,2,3);
In fact, it could be argued that we should only have the unary form, since in this:
stan @array, ollie <== 1,2,3
it's visually ambiguous whether the pointy pipe belongs to stan or
ollie. It could be ambiguous to the compiler as well. With a unary
operator, it unambiguously belongs to ollie. You'd have to say:
stan @array, ollie, <== 1,2,3
to make it belong to stan. And yet, it'd be really strange for a
unary <== to force the arguments to its left into scalar context
if the operator doesn't govern those arguments syntactically. And I
still think I want <== to do that. And it's probably better to
disambiguate with parentheses anyway. So we keep it a binary operator.
There's no unary variant, either prefix or postfix. You can always say:
list( () <== 1,2,3 )
list( @foo <== () )
Similarly, ==> is also always a binary operator. As the
reverse of <==, it forces its left side into list context,
and it also forces all the arguments of the list operator on the
right into scalar context. Just as:
mumble @foo <== @bar
tells you that @foo is in scalar context and @bar is in list
context regardless of the signature of mumble, so too:
@bar ==>
mumble @foo
tells you exactly the same thing. This is particularly useful when you have a method with an unknown signature that you have to dispatch on:
@bar ==>
$objects[$x].mumble(@foo)
The ==> unambiguously indicates that all the other arguments to
mumble are in scalar context. It also allows mumble's signature
to check to see if the number of scalar arguments is within the correct
range, counting only required and optional parameters, since we don't
have to allow for extra arguments to slop into the slurp array.
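The arity check described above can be sketched in Python, under loud assumptions: `call_with_feed` and `mumble` are hypothetical names, Python's `*items` stands in for the slurp array, and ordinary positional parameters stand in for the scalar arguments. It is a model of the idea, not of any Perl 6 implementation.

```python
import inspect

def call_with_feed(func, scalars, feed):
    """Hypothetical model: scalars are given explicitly, feed goes only to *args."""
    params = list(inspect.signature(func).parameters.values())
    positional = [p for p in params
                  if p.kind == inspect.Parameter.POSITIONAL_OR_KEYWORD]
    required = [p for p in positional if p.default is inspect.Parameter.empty]
    if not any(p.kind == inspect.Parameter.VAR_POSITIONAL for p in params):
        raise TypeError("target has no slurpy parameter to receive the feed")
    # Count only required and optional parameters; the feed never counts.
    if not len(required) <= len(scalars) <= len(positional):
        raise TypeError(
            f"expected {len(required)} to {len(positional)} scalar arguments, "
            f"got {len(scalars)}"
        )
    # Fill omitted optional parameters with their defaults so that fed
    # items cannot slop into them.
    filled = list(scalars) + [p.default for p in positional[len(scalars):]]
    return func(*filled, *feed)

def mumble(target, sep=", ", *items):
    return f"{target}: {sep.join(str(i) for i in items)}"

print(call_with_feed(mumble, ["foo"], [1, 2, 3]))  # foo: 1, 2, 3
```

Passing three scalars to `mumble` here raises `TypeError`, because the third cannot be absorbed by the slurpy parameter when the fed list is kept separate.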
If we do want extra list arguments, we could conceivably allow both kinds of pipe at once:
@bar ==>
$objects[$x].mumble(@foo <== 1,2,3)
If we did that, it could be equivalent to either:
$objects[$x].mumble(@foo <== 1,2,3,@bar)
or:
$objects[$x].mumble(@foo <== @bar,1,2,3)
Since I can argue it both ways, we'll have to disallow it
entirely. :-)
Seriously, the conservative thing to do is to disallow it until we know what we want it to mean, if anything.
On the perl6-language list, an operator was discussed that would do argument rearrangement, but this is a little different in that it is constrained (by default) to operate only with the slurpy list part of the input to a function. This is as it should be, if you think about it. When you pipe things around in Unix, you don't expect the command line switches to come in via the pipe, but from the command line. The scalar arguments of a list operator function as the command line, and the list argument functions as the pipe.
That being said, if you want to pull the scalar arguments from the front of the pipe, we already have a mechanism for that:
@args = list(@foo <== 1,2,3);
stuff *@args;
By extension, we also have this:
list(@foo <== 1,2,3) ==>
stuff *();
So there's no need for a special syntax to put the invocant after all the arguments. It's just this:
list(@foo <== 1,2,3) ==>
$object.stuff *();
Possibly the *() could be inferred in some cases, but it may be
better not to if we can't do it consistently. If stuff's signature
started with optional positional parameters, we wouldn't know whether
the pipe starts with positional arguments or list elements. I think
that passing positionals at the front of the pipe is rare enough that
it ought to be specially marked with *(). Maybe we can reduce it
to a *, like a unary that has an optional argument:
list(@foo <== 1,2,3) ==>
$object.stuff *;
By the way, you may think that we're being silly calling these pipes, since we're just passing lists around. But remember that these can potentially be lazy lists produced by a generator. Indeed, a common idiom might be something like:
<$*IN> ==> process() ==> print;
which arguably reads better than:
print process <$*IN>;
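To make the laziness concrete, here is a rough Python analogue using generators. It is a sketch under assumptions: `numbers`, `process`, and the `pulled` log are hypothetical names, and an endless counter stands in for the input stream.

```python
import itertools

pulled = []  # records what the source was actually asked to produce

def numbers():
    """Hypothetical endless source, standing in for a lazy input list."""
    for n in itertools.count(1):
        pulled.append(n)
        yield n

def process(items):
    """Hypothetical middle stage: transforms items one at a time, on demand."""
    for item in items:
        yield item * item

# Only the consumer at the end of the pipe decides how much is read.
first_three = list(itertools.islice(process(numbers()), 3))

print(first_three)  # [1, 4, 9]
print(pulled)       # [1, 2, 3]
```

The source is infinite, yet only three values are ever generated, which is why treating these lists as pipes is not as silly as it might look.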
Another possibility is that we extend the argumentless * to mark
where the list goes in constructs that take lists but aren't officially
list operators:
1,2,3 ==>
my @foo = (*)
But maybe we should just make:
1,2,3 ==> my @foo;
do what people will expect it to. Since we require the list
operator for the other usage, it's easy enough to recognize that this
is not a list operator, and that we should therefore assign it.
It seems to have a kind of inevitability about it.
Damian: "Certainly, if we don't support it, someone (*ahem*) will immediately write:
multi infix:==> (Lazy $list, @array is rw) { @array = $list }
multi infix:<== (@array is rw, Lazy $list) { @array = $list }
"So we might as well make it standard."
On the other hand...
I'm suddenly wondering if assignment and binding can change precedence on the right like list operators do if it's known we're assigning to a list. I, despite my credentials as TheLarry, keep finding myself writing list assignments like this:
my @foo := 0..9,'a'..'z';
Oops. But what if it wasn't an oops? What if that parsed like a list operator, and slurped up all the commas to the right? Parens would still be required around a list on the left though. And it might break weird things like:
(@a = (1,2), @b = (3,4))
But how often do you do a list assignment inside a list? On the other hand, making list assignment a different precedence than scalar is weird. But it'd have to be that way if we still wanted:
($a = 1, $b = 2)
to work as a C programmer expects. Still, I think I like it. In particular, it'd let us write what we mean explicitly:
1,2,3 ==>
my @foo = *;
So let's go ahead and do that, and then maybe someone (*ahem*) might just forget to overload the pipe operators on arrays.*
* The words "fat", "slim", and "none" come to mind.

