Apocalypse 3
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 03 for the latest information.
RFC 285: Lazy Input / Context-sensitive Input
Solving this with want() is the wrong approach, but I think the
basic idea is sound because it's what people expect. And the want()
should in fact be unnecessary. Essentially, if the right side of a
list assignment produces a lazy list, and the left side requests a
finite number of elements, the list generator will only produce enough
to satisfy the demand. It doesn't need to know how many in advance.
It just produces another scalar value when requested. The generator
doesn't have to be smart about its context. The motto of a lazy list
generator should be, ``Ours is not to question why, ours is but to do
(the next one) or die.''
It will be tricky to make this one work right:
($first, @rest) = 1 .. Inf;
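The demand-driven behavior described above can be sketched outside Perl. The following is only a rough analogue in Python, not the Perl 6 design: `naturals`, `take`, `head_and_rest`, and the `produced` log are hypothetical names invented for this illustration. It shows a generator that knows nothing about its context and simply yields the next value each time one is requested.

```python
from itertools import count, islice

produced = []  # log of every value the generator was actually asked for

def naturals():
    """Lazy stand-in for 1 .. Inf; it never looks at its context."""
    for n in count(1):
        produced.append(n)
        yield n

def take(n, lazy):
    """Demand exactly n values from a lazy sequence."""
    return list(islice(lazy, n))

def head_and_rest(lazy):
    """Rough analogue of ($first, @rest) = ...: pull one value, keep the rest lazy."""
    first = next(lazy)
    return first, lazy

# The "left side" asks for three elements; only three are generated.
a, b, c = take(3, naturals())
demanded_for_three = list(produced)

# Splitting off the head generates one element; the rest stays unevaluated.
produced.clear()
first, rest = head_and_rest(naturals())
demanded_for_head = list(produced)
```

In this sketch the tricky case is easy only because `rest` remains a lazy object. Copying it eagerly into an ordinary list would never terminate, which is one way to see why the assignment above is hard to get right.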
RFC 082: Arrays: Apply operators element-wise in a list context
APL, here we come... :-)
This is by far the most difficult of these RFCs to decide, so I'm going to be doing a lot of thinking out loud here. This is research--or at least, a search. Please bear with me.
I expect that there are two classes of Perl programmers--those that would find these ``hyper'' operators natural, and those that wouldn't. Turning this feature on by default would cause a lot of heartburn for people who (from Perl 5 experience) expect arrays to always return their length under scalar operators even in list context. It can reasonably be argued that we need to make the scalar operators default, but make it easy to turn on hyper operators within a lexical scope. In any event, both sets of operators need to be visible from anywhere--we're just arguing over who gets the short, traditional names. All operators will presumably have longer names for use as function calls anyway. Instead of just naming an operator with long names like:
operator:+
operator:/
the longer names could distinguish ``hyperness'' like this:
@a scalar:+ @b
@a list:/ @b
That implies they could also be called like this:
scalar:+(@a, @b)
list:/(@a, @b)
We might find some short prefix character stands in for
``list'' or ``scalar''. The obvious candidates are @ and $:
@a $+ @b
@a @/ @b
Unfortunately, in this case, ``obvious'' is synonymous with ``wrong''. These operators would be completely confusing from a visual point of view. If the main psychological point of putting noun markers on the nouns is so that they stand out from the verbs, then you don't want to put the same markers on the verbs. It would be like the Germans starting to capitalize all their words instead of just their nouns.
Instead, we could borrow a singular/plural memelet from shell
globbing, where * means multiple characters, and ? means one
character:
@a ?+ @b
@a */ @b
But that has a bad ambiguity. How do you tell whether **
is an exponentiation or a list multiplication? So if we went that
route, we'd probably have to say:
@a ?:+ @b
@a *:/ @b
Or some such. But if we're going that far in the direction of gobbledygook, perhaps there are prefix characters that wouldn't be so ambiguous. The colon and the dot also have a visual singular/plural value:
@a .+ @b
@a :/ @b
We're already changing the old meaning of dot (and I'm planning to
rescue colon from the ?: operator), so perhaps that could be made
to work. You could almost think of dot and colon as complementary
method calls, where you could say:
$len = @a.length; # length as a scalar operator
@len = @a:length; # length as a list operator
But that would interfere with other desirable uses of colon. Plus, it's actually going to be confusing to think of these as singular and plural operators because, while we're specifying that we want a ``plural'' operator, we're not specifying how to treat the plurality. Consider this:
@len = list:length(@a);
Anyone would naively think that returns the length of the list, not the length of each element of the list. To make it work in English, we'd actually have to say something like this:
@len = each:length(@a);
$len = the:length(@a);
That would be equivalent to the method calls:
@len = @a.each:length;
$len = @a.the:length;
But does this really mean that there are two array methods with those
weird names? I don't think so. We've reached a result here that is
spectacularly close to a reductio ad absurdum. It seems to me that
the whole point of this RFC is that the ``eachness'' is most simply
specified by the list context, together with the knowledge that
length() is a function/method that maps one scalar value to
another. The distribution of that function over an array value is not
something the scalar function should be concerned with, except insofar
as it must make sure its type signature is correct.
And there's the rub. We're really talking about enforced strong typing for this to work right. When we say:
@foo = @bar.mumble
How do we know whether mumble has the type signature that magically
enables iteration over @bar? That definition is off in some other
file that we may not have memorized quite yet. We need some
more explicit syntax that says that auto-iteration is expected, regardless
of whether the definition of the operator is well specified. Magical
auto-iteration is not going to work well in a language with optional
typing.
So the resolution of this is that the unmarked forms of operators will force scalar context as they do in Perl 5, and we'll need a special marker that says an operator is to be auto-iterated. That special marker turns out to be an uparrow, with a tip o' the hat to higher-order functions. That is, the hyper-operator:
@a ^* @b
is equivalent to this:
parallel { $^a * $^b } @a, @b
(where parallel is a hypothetical function that iterates through
multiple arrays in parallel.)
Hyper operators will also intuit where a dimension is missing from one of their arguments, and replicate a scalar value to a list value in that dimension. That means you can say:
@a ^+ 1
to get a value with one added to each element of @a. (@a is unchanged.)
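As a minimal sketch of that behavior, here is an analogue in Python rather than Perl. The helper name `hyper` is hypothetical and stands in for the notional `parallel` function above. Truncating to the shorter list when lengths differ is an assumption of this sketch; the text does not say what should happen in that case.

```python
import operator
from itertools import repeat

def hyper(op, left, right):
    """Apply a two-argument scalar operator element-wise.

    A non-list operand is treated as a scalar and replicated along the
    missing dimension. Neither input is modified.
    """
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)          # plain scalar operation
    if not left_is_list:
        left = repeat(left)             # replicate the scalar
    if not right_is_list:
        right = repeat(right)
    # zip stops at the shorter operand (an assumption of this sketch)
    return [op(x, y) for x, y in zip(left, right)]

a = [1, 2, 3]
b = [10, 20, 30]
products = hyper(operator.mul, a, b)    # analogue of @a ^* @b
plus_one = hyper(operator.add, a, 1)    # analogue of @a ^+ 1
```

Here `products` is the element-wise product of the two lists, `plus_one` holds each element of `a` incremented, and `a` itself is left unchanged.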
I don't believe there are any insurmountable ambiguities with the uparrow
notation. There is currently an uparrow operator meaning exclusive-or,
but that is rarely used in practice, and is not typically followed by
other operators when it is used. We can represent
exclusive-or with ~ instead. (I like that idea anyway, because
the unary ~ is a 1's complement, and the binary ~ would simply
take the 1's complement of those bits of the second argument that are set in
the first argument. On the other hand, there's destructive interference
with other cultural meanings of tilde, so it's not completely obvious that
it's the right thing to do. Nevertheless, that's what we're doing.)
Anyway, in essence, I'm rejecting the underlying premise of this RFC, that we'll have strong enough typing to intuit the right behavior without confusing people. Nevertheless, we'll still have easy-to-use (and more importantly, easy-to-recognize) hyper-operators.
This RFC also asks about how return values for functions like abs() might
be specified. I expect sub declarations to (optionally) include a
return type, so this would be sufficient to figure out which functions
would know how to map a scalar to a scalar. And we should point out again
that even though the base language will not try to intuit which operators
should be hyperoperators, there's no reason in principle that someone
couldn't invent a dialect that does. All is fair if you predeclare.
RFC 045: || and && should propagate result context to both sides
Yes. The thing that makes this work in Perl 6, where it was almost
impossible in Perl 5, is that in Perl 6, list context doesn't imply
immediate list flattening. More precisely, it specifies immediate list
flattening in a notional sense, but the implementation is free to delay
that flattening until it's actually required. Internally, a flattened
list is still an object. So when @a || @b evaluates the arrays,
they're evaluated as objects that can return either a boolean value or
a list, depending on the context. And it will be possible to apply
both contexts to the first argument simultaneously. (Of course, the
computer actually looks at it in the boolean context first.)
There is no conflict with RFC 082 because the hyper versions of these operators will be spelled:
@a ^|| @b
@a ^&& @b
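The difference between the plain and hyper forms can be illustrated with a loose analogue in Python, whose `or` happens to test its left operand as a boolean and then return the whole object. The function names `list_or` and `hyper_or` are hypothetical and exist only for this sketch. It says nothing about how Perl 6 would implement either operator.

```python
def list_or(a, b):
    """Analogue of @a || @b: test a as a boolean, return a whole list."""
    return a or b   # an empty list is false, so b is returned in that case

def hyper_or(a, b):
    """Analogue of @a ^|| @b: apply the scalar 'or' to each pair of elements."""
    return [x or y for x, y in zip(a, b)]

whole_left = list_or([0], [1, 2])     # [0] is non-empty, hence true
whole_right = list_or([], [1, 2])     # empty left side falls through
pairwise = hyper_or([0, 3], [5, 6])   # per-element choice
```

The first form looks at the array once as a boolean and then hands it back as a list, while the second never considers the arrays as wholes at all.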


