Apocalypse 3
by Larry Wall

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 03 for the latest information.

RFC 285: Lazy Input / Context-sensitive Input

Solving this with want() is the wrong approach, but I think the basic idea is sound because it's what people expect. And the want() should in fact be unnecessary. Essentially, if the right side of a list assignment produces a lazy list, and the left side requests a finite number of elements, the list generator will only produce enough to satisfy the demand. It doesn't need to know how many in advance. It just produces another scalar value when requested. The generator doesn't have to be smart about its context. The motto of a lazy list generator should be, ``Ours is not to question why, ours is but to do (the next one) or die.''

It will be tricky to make this one work right:

    ($first, @rest) = 1 .. Inf;
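
The demand-driven behavior described above can be sketched in Python, used here purely as an analogy since it is not Perl 6. The names `naturals` and `produced` are hypothetical; the generator logs each value it is asked for, to show that it never runs ahead of the demand and never needs to know its context.

```python
from itertools import islice

produced = []  # log of every value the generator was actually asked to make

def naturals(start=1):
    """Hypothetical stand-in for the lazy list 1 .. Inf."""
    n = start
    while True:
        produced.append(n)   # record that this value was demanded
        yield n
        n += 1

gen = naturals()
first = next(gen)            # the left side asks for one scalar
a, b, c = islice(gen, 3)     # then for exactly three more
# gen itself plays the role of @rest: still lazy, nothing further computed
```

Note that Python's own `first, *rest = naturals()` would try to exhaust the generator and never finish, which is roughly why the `($first, @rest)` case is the tricky one: the array on the left has to stay lazy too.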

RFC 082: Arrays: Apply operators element-wise in a list context

APL, here we come... :-)

This is by far the most difficult of these RFCs to decide, so I'm going to be doing a lot of thinking out loud here. This is research--or at least, a search. Please bear with me.

I expect that there are two classes of Perl programmers--those that would find these ``hyper'' operators natural, and those that wouldn't. Turning this feature on by default would cause a lot of heartburn for people who (from Perl 5 experience) expect arrays to always return their length under scalar operators even in list context. It can reasonably be argued that we need to make the scalar operators default, but make it easy to turn on hyper operators within a lexical scope. In any event, both sets of operators need to be visible from anywhere--we're just arguing over who gets the short, traditional names. All operators will presumably have longer names for use as function calls anyway. Instead of just naming an operator with long names like:

    operator:+
    operator:/

the longer names could distinguish ``hyperness'' like this:

    @a scalar:+ @b
    @a list:/ @b

That implies they could also be called like this:

    scalar:+(@a, @b)
    list:/(@a, @b)

We might find some short prefix character stands in for ``list'' or ``scalar''. The obvious candidates are @ and $:

    @a $+ @b
    @a @/ @b

Unfortunately, in this case, ``obvious'' is synonymous with ``wrong''. These operators would be completely confusing from a visual point of view. If the main psychological point of putting noun markers on the nouns is so that they stand out from the verbs, then you don't want to put the same markers on the verbs. It would be like the Germans starting to capitalize all their words instead of just their nouns.

Instead, we could borrow a singular/plural memelet from shell globbing, where * means multiple characters, and ? means one character:

    @a ?+ @b
    @a */ @b

But that has a bad ambiguity. How do you tell whether ** is an exponentiation or a list multiplication? So if we went that route, we'd probably have to say:

    @a ?:+ @b
    @a *:/ @b

Or some such. But if we're going that far in the direction of gobbledygook, perhaps there are prefix characters that wouldn't be so ambiguous. The colon and the dot also have a visual singular/plural value:

    @a .+ @b
    @a :/ @b

We're already changing the old meaning of dot (and I'm planning to rescue colon from the ?: operator), so perhaps that could be made to work. You could almost think of dot and colon as complementary method calls, where you could say:

    $len = @a.length;   # length as a scalar operator
    @len = @a:length;   # length as a list operator

But that would interfere with other desirable uses of colon. Plus, it's actually going to be confusing to think of these as singular and plural operators because, while we're specifying that we want a ``plural'' operator, we're not specifying how to treat the plurality. Consider this:

    @len = list:length(@a);

Anyone would naively think that returns the length of the list, not the length of each element of the list. To make it work in English, we'd actually have to say something like this:

    @len = each:length(@a);
    $len = the:length(@a);

That would be equivalent to the method calls:

    @len = @a.each:length;
    $len = @a.the:length;

But does this really mean that there are two array methods with those weird names? I don't think so. We've reached a result here that is spectacularly close to a reductio ad absurdum. It seems to me that the whole point of this RFC is that the ``eachness'' is most simply specified by the list context, together with the knowledge that length() is a function/method that maps one scalar value to another. The distribution of that function over an array value is not something the scalar function should be concerned with, except insofar as it must make sure its type signature is correct.

And there's the rub. We're really talking about enforced strong typing for this to work right. When we say:

    @foo = @bar.mumble

How do we know whether mumble has the type signature that magically enables iteration over @bar? That definition is off in some other file that we may not have memorized quite yet. We need some more explicit syntax that says that auto-iteration is expected, regardless of whether the definition of the operator is well specified. Magical auto-iteration is not going to work well in a language with optional typing.

So the resolution of this is that the unmarked forms of operators will force scalar context as they do in Perl 5, and we'll need a special marker that says an operator is to be auto-iterated. That special marker turns out to be an uparrow, with a tip o' the hat to higher-order functions. That is, the hyper-operator:

    @a ^* @b

is equivalent to this:

    parallel { $^a * $^b } @a, @b

(where parallel is a hypothetical function that iterates through multiple arrays in parallel.)

Hyper operators will also intuit where a dimension is missing from one of their arguments, and replicate a scalar value to a list value in that dimension. That means you can say:

    @a ^+ 1

to get a value with one added to each element of @a. (@a is unchanged.)
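
As a rough illustration of what the explicit marker asks for, here is a minimal Python sketch of the hypothetical `parallel` idea, including the scalar replication. The function name `hyper` is an assumption made for this example. So is stopping at the shorter list when the lengths differ, a case the text above does not address.

```python
import operator

def hyper(op, left, right):
    """Apply a binary scalar operator element-wise, in the spirit of @a ^* @b."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list and right_is_list:
        # Both sides are lists: walk them in parallel (assumption: stop at the shorter).
        return [hyper(op, x, y) for x, y in zip(left, right)]
    if left_is_list:
        # Right side lacks this dimension: replicate the scalar across the list.
        return [hyper(op, x, right) for x in left]
    if right_is_list:
        # Left side lacks this dimension: replicate the scalar across the list.
        return [hyper(op, left, y) for y in right]
    return op(left, right)  # two scalars: just apply the ordinary operator

a = [1, 2, 3]
b = [10, 20, 30]
products = hyper(operator.mul, a, b)   # analogue of @a ^* @b
bumped = hyper(operator.add, a, 1)     # analogue of @a ^+ 1; a is left unchanged
```

The point of the sketch is that `operator.mul` and `operator.add` stay ordinary scalar operators and know nothing about lists; the iteration is requested at the call site, which is the job of the uparrow.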

I don't believe there are any insurmountable ambiguities with the uparrow notation. There is currently an uparrow operator meaning exclusive-or, but that is rarely used in practice, and is not typically followed by other operators when it is used. We can represent exclusive-or with ~ instead. (I like that idea anyway, because the unary ~ is a 1's complement, and the binary ~ would simply be doing a 1's complement on the second argument of the set bits in the first argument. On the other hand, there's destructive interference with other cultural meanings of tilde, so it's not completely obvious that it's the right thing to do. Nevertheless, that's what we're doing.)

Anyway, in essence, I'm rejecting the underlying premise of this RFC, that we'll have strong enough typing to intuit the right behavior without confusing people. Nevertheless, we'll still have easy-to-use (and more importantly, easy-to-recognize) hyper-operators.

This RFC also asks about how return values for functions like abs() might be specified. I expect sub declarations to (optionally) include a return type, so this would be sufficient to figure out which functions would know how to map a scalar to a scalar. And we should point out again that even though the base language will not try to intuit which operators should be hyperoperators, there's no reason in principle that someone couldn't invent a dialect that does. All is fair if you predeclare.

RFC 045: || and && should propagate result context to both sides

Yes. The thing that makes this work in Perl 6, where it was almost impossible in Perl 5, is that in Perl 6, list context doesn't imply immediate list flattening. More precisely, it specifies immediate list flattening in a notional sense, but the implementation is free to delay that flattening until it's actually required. Internally, a flattened list is still an object. So when @a || @b evaluates the arrays, they're evaluated as objects that can return either a boolean value or a list, depending on the context. And it will be possible to apply both contexts to the first argument simultaneously. (Of course, the computer actually looks at it in the boolean context first.)
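
Python's `or` offers a loose analogy, not a model of the Perl 6 implementation: the left operand is an object that is first asked for its truth value and, if true, is returned whole as a list. The variable names below are illustrative only.

```python
a = []
b = [4, 5, 6]

# a is tested for truth first; it is empty, so the whole of b is the result.
fallback = a or b

# A non-empty left operand is tested for truth, then returned as the list itself.
kept = [1, 2] or b
```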

There is no conflict with RFC 82 because the hyper versions of these operators will be spelled:

    @a ^|| @b
    @a ^&& @b
