Apocalypse 6
by Larry Wall

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.

Appendix A: Rationale for pipe operators

As we pointed out in the text, the named form of passing a list has the disadvantage that you have to know the formal parameter's name. We could get around that by saying that a null name maps to the slurp array. In other words, we could define a unary => operator that creates a null key:


    stuff(@foo, =>(1,2,3))

We can at least lose the outer parens in this case:


    stuff @foo, =>(1,2,3)

But darn it, we can't get rid of those pesky inner parens because of the precedence of => with respect to comma. So perhaps it's time for a new operator with looser precedence than comma:


    stuff @foo *: 1,2,3         # * to match * zone marker
    stuff @foo +* 1,2,3         # put the * on the list side
    stuff @foo *=> 1,2,3        # or combine with => above
    stuff @foo ==> 1,2,3        # maybe just lengthen =>
    stuff @foo <== 1,2,3        # except the dataflow is to the left
    stuff @foo with 1,2,3       # could use a word

Whichever one we pick, it'd still probably want to construct a special pair internally, because we have to be able to use it indirectly:


    @args = (\@foo, '*@' => (1,2,3));
    stuff *@args;

But if we're going to have a special operator to switch explicitly to the list part, it really needs to earn its keep, and do more work. A special operator could also force scalar context on the left and list context on the right. So with implied scalar context we could omit the backslash above:


    @args = (@foo with 1,2,3);
    stuff *@args;

That's all well and good, and some language designers would stop right there, if not sooner. But if we think about this in relation to cascaded list operators, we'll see a different pattern emerging. Here's a left-to-right variant on the Schwartzian Transform:


    my @x := map {...} @input;
    my @y := sort {...} with @x;
    my @z := map {...} with @y;

When we think of data flowing left-to-right, it's more like a pipe operator from a shell, except that we're naming our pipes @x and @y. But it'd be nice not to have to name the temporary array values. If we do have a pipe operator in Perl, it's not going to be |, for two reasons. First, | is taken for junctions. Second, piping is a big, low-precedence operation, and I want a big fat operator that will show up to the eye. Of our candidate list above, I think the big, fat arrows really stand out, and look like directed pipes. So assuming we have the ==> operator to go with the <==, we could write our ST like this:


    @input     ==>
    map {...}  ==>
    sort {...} ==>
    map {...}  ==>
    push my @z;

That argues that the scalar-to-list transition operator should be <==:


    my @x := map {...} @input;
    my @y := sort {...} <== @x;
    my @z := map {...} <== @y;

And that means this should maybe dwim:


    @args = (@foo <== 1,2,3);
    stuff *@args;

Hmm.

That does imply that <== is (at least in this case) a data composition operator, unlike the ==> operator which merely sends the output of one function to the next. Maybe that's not a problem. But people might see:


    @x <== 1,2,3

and expect it to do assignment when in fact it doesn't. Internally it would really do something more like appending a named argument:


    @x, '*@' => (1,2,3)

or however we decide to mark the beginning of the "real" list within a larger list.

But I do rather like the looks of:


    push @foo <== 1,2,3;

not to mention the symmetrical:


    1,2,3 ==>
    push @foo;

Note however that the pointy end of ==> must be bound to a function that takes a list. You can't say:


    1,2,3 ==>
    my @foo;

because you can't say:


    my @foo <== 1,2,3;

Or rather, you can, if we allow:


    (@foo <== 1,2,3)

but it would mean the Wrong Thing. Ouch. So maybe that should not be legal. The asymmetry was bugging me anyway.

So let's say that <== and ==> must always be bound on their pointy end to a slurpy function, and if you want to build an indirect argument list, you have to use some kind of explicit list function such as args:


    @args = args @foo <== 1,2,3;
    stuff *@args;

The args function would really be a no-op, much like other context enforcers such as scalar and list. In fact, I'd be tempted to just use list like this:


    @args = list @foo <== 1,2,3;

But unless we can get people to see <== as a strange kind of comma, that will likely be misread as:


    @args = list(@foo) <== 1,2,3;

when it's really this:


    @args = list(@foo <== 1,2,3);

On the other hand, using list would cut out the need for yet another built-in, for which there is much to be said... I'd say, let's go with list on the assumption that people will learn to read <== as a pipe comma. If someone wants to use args for clarity, they can always just alias list:


    my &args ::= &*list;

More likely, they'll just use the parenthesized form:


    @args = list(@foo <== 1,2,3);

I suppose there could also be a prefix unary form, in case they want to use it without scalar arguments:


    @args = list(<== 1,2,3);

or in case they want to put a comma after the scalar arguments:


    @args = list(@foo, <== 1,2,3);

In fact, it could be argued that we should only have the unary form, since in this:


    stan @array, ollie <== 1,2,3

it's visually ambiguous whether the pointy pipe belongs to stan or ollie. It could be ambiguous to the compiler as well. With a unary operator, it unambiguously belongs to ollie. You'd have to say:


    stan @array, ollie, <== 1,2,3

to make it belong to stan. And yet, it'd be really strange for a unary <== to force the arguments to its left into scalar context if the operator doesn't govern those arguments syntactically. And I still think I want <== to do that. And it's probably better to disambiguate with parentheses anyway. So we keep it a binary operator. There's no unary variant, either prefix or postfix. You can always say:


    list( () <== 1,2,3 )
    list( @foo <== () )

Similarly, ==> is also always a binary operator. As the reverse of <==, it forces its left side into list context, and it also forces all the arguments of the list operator on the right into scalar context. Just as:


    mumble @foo <== @bar

tells you that @foo is in scalar context and @bar is in list context regardless of the signature of mumble, so too:


    @bar ==>
    mumble @foo

tells you exactly the same thing. This is particularly useful when you have a method with an unknown signature that you have to dispatch on:


    @bar ==>
    $objects[$x].mumble(@foo)

The ==> unambiguously indicates that all the other arguments to mumble are in scalar context. It also allows mumble's signature to check to see if the number of scalar arguments is within the correct range, counting only required and optional parameters, since we don't have to allow for extra arguments to slop into the slurp array.

If we do want extra list arguments, we could conceivably allow both kinds of pipe at once:


    @bar ==>
    $objects[$x].mumble(@foo <== 1,2,3)

If we did that, it could be equivalent to either:


    $objects[$x].mumble(@foo <== 1,2,3,@bar)

or:


    $objects[$x].mumble(@foo <== @bar,1,2,3)

Since I can argue it both ways, we'll have to disallow it entirely. :-)

Seriously, the conservative thing to do is to disallow it until we know what we want it to mean, if anything.

On the perl6-language list, an operator was discussed that would do argument rearrangement, but this is a little different in that it is constrained (by default) to operate only with the slurpy list part of the input to a function. This is as it should be, if you think about it. When you pipe things around in Unix, you don't expect the command line switches to come in via the pipe, but from the command line. The scalar arguments of a list operator function as the command line, and the list argument functions as the pipe.
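The command-line-versus-pipe split can be sketched outside Perl as well. The Python below is only an analogy, not the Perl 6 mechanism: `feed` and `grep_like` are hypothetical names invented here, and the last parameter of a function stands in for the slurpy list.

```python
from typing import Callable, Iterable, Iterator


def grep_like(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    # `pattern` plays the command-line switch; `lines` plays the pipe.
    return (line for line in lines if pattern in line)


def feed(source: Iterable, stage: Callable[..., Iterator], *switches) -> Iterator:
    # Hypothetical helper: the piped data always lands in the final (list) slot,
    # while the switches are passed positionally in front of it.
    return stage(*switches, source)


matches = list(feed(["foo", "bar", "food"], grep_like, "foo"))
```

The piped data never competes with the switches for a positional slot, which is the property the constrained pipe is meant to guarantee.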

That being said, if you want to pull the scalar arguments from the front of the pipe, we already have a mechanism for that:


    @args = list(@foo <== 1,2,3);
    stuff *@args;

By extension, we also have this:


    list(@foo <== 1,2,3) ==>
      stuff *();

So there's no need for a special syntax to put the invocant after all the arguments. It's just this:


    list(@foo <== 1,2,3) ==>
     $object.stuff *();

Possibly the *() could be inferred in some cases, but it may be better not to if we can't do it consistently. If stuff's signature started with optional positional parameters, we wouldn't know whether the pipe starts with positional arguments or list elements. I think that passing positionals at the front of the pipe is rare enough that it ought to be specially marked with *(). Maybe we can reduce it to a *, like a unary that has an optional argument:


    list(@foo <== 1,2,3) ==>
     $object.stuff *;

By the way, you may think that we're being silly calling these pipes, since we're just passing lists around. But remember that these can potentially be lazy lists produced by a generator. Indeed, a common idiom might be something like:


    <$*IN> ==> process() ==> print;

which arguably reads better than:


    print process <$*IN>;
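To make the laziness concrete, here is a rough Python analogue using generators. It is a sketch under assumptions, not Perl 6 semantics: `numbers` stands in for a lazy input such as `<$*IN>`, `process` is a made-up stage, and the `pulled` list exists only to record how much of the source was actually consumed.

```python
from itertools import count, islice

pulled = []  # records every value the source was actually asked for


def numbers():
    # An endless lazy source, standing in for reading input line by line.
    for n in count(1):
        pulled.append(n)
        yield n


def process(items):
    # Hypothetical stage: squares each item as it arrives.
    for item in items:
        yield item * item


# The consumer asks for three values, so only three are ever produced upstream.
first_three = list(islice(process(numbers()), 3))
```

Each stage pulls one item at a time from the stage before it, so an infinite source is harmless as long as the consumer at the end stops asking.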

Another possibility is that we extend the argumentless * to mark where the list goes in constructs that take lists but aren't officially list operators:


    1,2,3 ==>
    my @foo = (*)

But maybe we should just make:


    1,2,3 ==> my @foo;

do what people will expect it to. Since we require the list operator for the other usage, it's easy enough to recognize that this is not a list operator, and that we should therefore assign it. It seems to have a kind of inevitability about it.

Damian: "Certainly, if we don't support it, someone (*ahem*) will immediately write:


    multi infix:==> (Lazy $list, @array is rw) { @array = $list }
    multi infix:<== (@array is rw, Lazy $list) { @array = $list }

"So we might as well make it standard."

On the other hand...

I'm suddenly wondering if assignment and binding can change precedence on the right like list operators do if it's known we're assigning to a list. I, despite my credentials as TheLarry, keep finding myself writing list assignments like this:


    my @foo := 0..9,'a'..'z';

Oops. But what if it wasn't an oops? What if that parsed like a list operator, and slurped up all the commas to the right? Parens would still be required around a list on the left, though. And it might break weird things like:


    (@a = (1,2), @b = (3,4))

But how often do you do a list assignment inside a list? On the other hand, giving list assignment a different precedence from scalar assignment is weird. But it'd have to be that way if we still wanted:


    ($a = 1, $b = 2)

to work as a C programmer expects. Still, I think I like it. In particular, it'd let us write what we mean explicitly:


    1,2,3 ==>
    my @foo = *;

So let's go ahead and do that, and then maybe someone (*ahem*) might just forget to overload the pipe operators on arrays.*

* The words "fat", "slim", and "none" come to mind.
