Apocalypse 3
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 03 for the latest information.
Non-RFC considerations
The RFCs propose various specific features, but don't give a systematic view of the operators as a whole. In this section I'll try to give a more cohesive picture of where I see things going.
Binary . (dot)
This is now the method call operator, in line with industry-wide practice. It also has ramifications for how we declare object attribute variables. I'm anticipating that, within a class module, saying
my int $.counter;
would declare both a $.counter instance variable and a counter
accessor method for use within the class. (If marked as public, it
would also declare a counter accessor method for use outside the
class.)
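As a rough analogy only (this is Python, not Perl 6, and the class name Counter and the method bump are made up for illustration), the idea of one declaration giving you both the storage and an accessor looks something like a private attribute paired with a read-only property:

```python
class Counter:
    """Hypothetical stand-in for a class that declares `my int $.counter`."""

    def __init__(self) -> None:
        # The instance variable itself, private by Python convention.
        self._counter = 0

    @property
    def counter(self) -> int:
        # The accessor that the proposed declaration would generate for you.
        return self._counter

    def bump(self) -> None:
        # Code inside the class works on the instance variable directly.
        self._counter += 1


c = Counter()
c.bump()
print(c.counter)
```

In Python you write the accessor by hand; the point of the proposed syntax is that you would not have to.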
Unary . (dot)
It's possible that a unary . would call a method on the current object
within a class. That is, it would be the same as a binary . with
$self (or equivalent) on the left:
method foowrapper ($a, $b) {
    .reallyfoo($a, $b, $c)
}
On the other hand, it might be considered better style to be explicit:
method foowrapper ($self: $a, $b) {
    $self.reallyfoo($a, $b, $c)
}
(Don't take that declaration syntax as final just yet, however.)
Binary _
Since . is taken for method calls, we need a new way to concatenate
strings. We'll use a solitary underscore for that. So, instead of:
$a . $b . $c
you'll say:
$a _ $b _ $c
The only downside is that the space between a variable name and the operator is required. This is to be construed as a feature.
Unary _
Since the _ token indicating stat buffer is going away, a unary
underscore operator will force stringification, just
as interpolation does, only without the quotes.
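A small Python analogy (not Perl 6; the helper name stringify is hypothetical) shows what "just as interpolation does, only without the quotes" amounts to: explicit stringification and interpolation should agree.

```python
def stringify(value: object) -> str:
    # Explicit stringification, the role the unary underscore is meant to play.
    return str(value)


value = 42
interpolated = f"{value}"        # stringification via interpolation
explicit = stringify(value)      # stringification without the quotes
joined = stringify(1) + stringify(2)  # concatenation of two stringified numbers
```

Here interpolated and explicit hold the same string, and joined is "12" rather than 3.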
Unary +
Similarly, a unary + will force numification in Perl 6, unlike
in Perl 5. If that fails, NaN (not a number) is returned.
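The "numify or return NaN" behavior can be sketched in Python (an analogy, not Perl 6; the function name numify is made up here, and Python's own rules decide which strings count as numbers):

```python
import math


def numify(value: object) -> float:
    """Force a value to a number, yielding NaN when that is not possible."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Conversion failed: report "not a number" instead of raising.
        return math.nan


good = numify("42")
bad = numify("forty-two")
```

Since NaN compares unequal to everything, including itself, use math.isnan(bad) to test for the failure case.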
Binary :=
We need to distinguish two different forms of assignment. The standard
assignment operator, =, works just as it does in Perl 5, as much as
possible. That is, it tries to make it look like a value assignment.
This is our cultural heritage.
But we also need an operator that works like assignment but is more
definitional. If you're familiar with Prolog, you can think of it as a
sort of unification operator (though without the implicit backtracking
semantics). In human terms, it treats the left side as a set of formal
arguments exactly as if they were in the declaration of a function, and
binds a set of arguments on the right hand side as though they were
being passed to a function. This is what the new := operator does.
More below.
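The difference between value assignment and binding has a loose Python analogy (this is not Perl 6, and the variable names are illustrative): copying a list behaves like =, while giving the same list a second name behaves like :=.

```python
original = [1, 2, 3]

copied = list(original)  # like `=`: the values are copied into a new container
bound = original         # like `:=`: a second name for the same container

original.append(4)

# The copy is unaffected; the bound name sees the change.
print(copied)
print(bound)
```

Python binds function arguments to parameters in the same way as the second line, which is the "formal arguments" view of := described above.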
Unary *
Unary * is the list flattening operator. (See Ruby
for prior art.) When used on an rvalue, it turns off function signature
matching for the rest of the arguments, so that, for instance:
@args = (\@foo, @bar);
push *@args;
would be equivalent to:
push @foo, @bar;
In this respect, it serves as a replacement for the prototype-disabling
&foo(@bar) syntax of Perl 5. That would be translated to:
foo(*@bar)
In an lvalue, the unary * indicates that subsequent array names
slurp all the rest of the values. So this would swap two arrays:
(@a, @b) := (@b, @a);
whereas this would assign all the array elements of @c and @d to @a:
(*@a, @b) := (@c, @d);
An ordinary flattening list assignment:
@a = (@b, @c);
is equivalent to:
*@a := (@b, @c);
That's not the same as
@a := *(@b, @c);
which would take the first element of @b as the new definition of
@a, and throw away the rest, exactly as if you passed too many
arguments to a function. It could optionally be made to blow up
at run time. (It can't be made to blow up at compile time, since we
don't know how many elements are in @b and @c combined. There
could be exactly one element, which is what the left side wants.)
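Python's own star syntax is close enough to serve as an analogy (it is not Perl 6, and push and take_one are hypothetical helpers). The first half mirrors push *@args; the second half mirrors binding a flattened list to a single formal argument. One difference is an assumption worth stating: Python raises an error on surplus arguments, whereas the binding described above would discard them unless made to blow up.

```python
def push(target: list, *values: object) -> None:
    # The first argument is the array itself; the rest are slurped into `values`.
    target.extend(values)


foo, bar = [1, 2], [3, 4]
args = [foo, *bar]   # roughly (\@foo, @bar): one reference plus flattened values
push(*args)          # roughly push *@args, i.e. push @foo, @bar


def take_one(a: object) -> object:
    # A signature that wants exactly one argument.
    return a


b, c = [10], []
only = take_one(*(b + c))  # exactly one element in total, so this binds fine

try:
    take_one(*([10, 20] + [30]))
    outcome = "bound"
except TypeError:
    # Too many elements is only detectable at run time.
    outcome = "too many arguments"
```

Nothing about the lengths is known until the call actually happens, which is the same reason the mismatch cannot be caught at compile time.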
List context
The whole notion of list context is somewhat modified in Perl 6.
Since lists can be lazy, the interpretation of list flattening is
also by necessity lazy. This means that, in the absence of the * list
flattening operator (or an equivalent old-fashioned list assignment),
lists in Perl 6 are object lists. That is to say, they are parsed
as if they were a list of objects in scalar context. When you see
a function call like:
foo @a, @b, @c;
you should generally assume that three discrete arrays are being passed
to the function, unless you happen to know that the signature of foo
includes a list flattening *. (If a subroutine doesn't have a
signature, it is assumed to have a signature of (*@_) for old
times' sake.) Note that this is really nothing new to Perl, which has always
made this distinction for builtins, and extended it to user-defined
functions in Perl 5 via prototypes like \@ and \%. We're just changing
the syntax in Perl 6 so that the unmarked form of formal argument expects
a scalar value, and you optionally declare the final formal argument
to expect a list. It's a matter of Huffman coding again, not to mention
saving wear and tear on the backslash key.
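To make "three discrete arrays" concrete, here is a Python analogy (not Perl 6; count_args is a made-up helper). Note one difference: in Python the caller chooses to flatten, whereas in the design described above the callee's signature decides.

```python
from itertools import chain


def count_args(*items: object) -> int:
    # A slurpy parameter: collects however many arguments were passed.
    return len(items)


a, b, c = [1, 2], [3], [4, 5, 6]

discrete = count_args(a, b, c)            # three list objects
flattened = count_args(*chain(a, b, c))   # six individual elements
```

The unmarked call passes the arrays as objects; flattening only happens where a star asks for it.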
Binary :
As I pointed out in an earlier apocalypse, the first rule of computer language design is that everybody wants the colon. I think that means that we should do our best to give the colon to as many features as possible.
Hence, this operator modifies a preceding operator adverbially. That is, it can turn any operator into a trinary operator (provided a suitable definition is declared). It can be used to supply a "step" to a range operator, for instance. It can also be used as a kind of super-comma separating an indirect object from the subsequent argument list:
print $handle[2]: @args;
Of course, this conflicts with the old definition of the ?: operator. See
below.
In a method type signature, this operator indicates that a previous argument (or arguments) is to be considered the "self" of a method call. (Putting it after multiple arguments could indicate a desire for multimethod dispatch!)
Trinary ??::
The old ?: operator is now spelled ??::. That is to say, since it's
really a kind of short-circuit operator, we just double both characters
like the && and || operators. This makes it easy to remember for
C programmers. Just change:
$a ? $b : $c
to
$a ?? $b :: $c
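The "short-circuit" part is the property worth checking. A Python analogy (not Perl 6; trace is a hypothetical helper that records which operands were evaluated) shows that only the condition and the chosen branch run:

```python
calls: list[str] = []


def trace(label: str, value: object) -> object:
    # Record that this operand was evaluated, then pass its value through.
    calls.append(label)
    return value


# Python spells the conditional as `b if a else c`.
result = trace("b", "yes") if trace("a", True) else trace("c", "no")
```

After this runs, calls records "a" and "b" but never "c".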
The basic problem is that the old ?: operator wastes two very useful
single characters for an operator that is not used often enough to
justify the waste of two characters. It's bad Huffman coding, in
other words. Every proposed use of colon in the RFCs conflicted with
the ?: operator. I think that says something.
I can't list here all the possible spellings of ?: that I considered.
I just think ??:: is the most visually appealing and mnemonic of the
lot of them.
Binary //
A binary // operator is the defaulting operator. That is:
$a // $b
is short for:
defined($a) ?? $a :: $b
except that the left side is evaluated only once. It will work on arrays and hashes as well as scalars. It also has a corresponding assignment operator, which only does the assignment if the left side is undefined:
$pi //= 3;
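The distinction between "defined" and "true" is the whole point, and a Python analogy makes it visible (not Perl 6; defined_or is a made-up helper, None stands in for undef, and unlike a real operator the helper evaluates its right argument eagerly):

```python
def defined_or(left: object, right: object) -> object:
    # Keep the left value unless it is undefined (None), even if it is falsy.
    return left if left is not None else right


zero_kept = defined_or(0, 5)     # 0 is defined, so it wins
zero_lost = 0 or 5               # plain `or` tests truth, so 0 loses
filled = defined_or(None, 5)     # undefined left side falls through to the default

pi = None
pi = defined_or(pi, 3)           # the //= form: assign only if undefined
pi = defined_or(pi, 4)           # already defined, so this changes nothing
```

A truth-testing || would clobber legitimate values such as 0 or the empty string; the defaulting operator leaves them alone.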
Binary ;
The binary ; operator separates two expressions in a list, much like
the expressions within a C-style for loop. Obviously the
expressions need to be in some kind of bracketing structure to avoid
ambiguity with the end of the statement. Depending on the context,
these expressions may be interpreted as arguments to a for loop, or
slices of a multi-dimensional array, or whatever. In the absence of
other context, the default is simply to make a list of lists. That is,
[1,2,3;4,5,6]
is a shorthand for:
[[1,2,3],[4,5,6]]
But usually there will be other context, such as a multidimensional array
that wants to be sliced, or a syntactic construct that wants to emulate
some kind of control structure. A construct emulating a 3-argument
for loop might force all the expressions to be closures, for
instance, so that they can be evaluated each time through the loop.
User-defined syntax will be discussed in Apocalypse 18, if not sooner.
Unary ^
Unary ^ is now reserved for hyper operators. Note that it works on assignment operators as well:
@a ^+= 1; # increment all elements of @a
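In Python terms (an analogy, not Perl 6), a hyper operator is an element-wise application of an ordinary operator. The helper name hyper is hypothetical, and how mismatched lengths should behave is an assumption: this sketch simply stops at the shorter list.

```python
import operator
from typing import Callable


def hyper(op: Callable, left: list, right: object) -> list:
    """Apply `op` element-wise; a non-list right operand is reused for each element."""
    if isinstance(right, list):
        return [op(x, y) for x, y in zip(left, right)]
    return [op(x, right) for x in left]


a = [1, 2, 3]
a = hyper(operator.add, a, 1)                         # like @a ^+= 1
products = hyper(operator.mul, [1, 2, 3], [4, 5, 6])  # list against list
```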
Unary ?
Reserved for future use.
Binary ?
Reserved for future use.
Binary ~
This is now the bitwise XOR operator. Recall that unary
~ (1's complement) is simply an XOR with a value containing all 1
bits.
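The identity is easy to check in Python, where XOR is spelled ^ and complement is spelled ~ (the reverse of the spelling proposed here). Because Python integers are unbounded, the sketch assumes an 8-bit width; WIDTH, MASK and complement are illustrative names.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # a value containing all 1 bits: 0b11111111


def complement(x: int) -> int:
    # 1's complement expressed as an XOR with the all-ones mask.
    return x ^ MASK


flipped = complement(0b00001111)
same_as_not = (~0b00001111) & MASK  # the built-in complement, trimmed to 8 bits
```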
Binary ~~
This is a logical XOR operator. It's a high precedence version of the
low precedence xor operator.
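Python has no logical XOR operator, so as an analogy (not Perl 6; logical_xor is a made-up name) it can be written as an inequality of truth values:

```python
def logical_xor(a: object, b: object) -> bool:
    # True when exactly one of the operands is true.
    return bool(a) != bool(b)


one_true = logical_xor("x", "")
both_true = logical_xor(1, 2)
```

Note that this compares truth, not bits: 1 and 2 are both true, so the result is false even though their bitwise XOR is nonzero.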
User defined operators
The declaration syntax of user-defined operators is still up for grabs,
but we can say a few things about it. First, we can differentiate
unary from binary declarations simply by the number of arguments.
(Declaration of a return type may also be useful for disambiguating
subsequent parsing. One place it won't be needed is for operators wanting
to know whether they should behave as hyperoperators. The pressure to
do that is relieved by the explicit ^ hypermarker.)
We also need to think how these operator definitions relate to overloading. We can treat an operator as a method on the first object, but sometimes it's the second object that should control the action. (Or with multimethod dispatch, both objects.) These will have to be thrashed out under ordinary method dispatch policy. The important thing is to realize that an operator is just a funny looking method call. When you say:
$man bites $dog
The infrastructure will need to untangle whether the man is biting the
dog or the dog is getting bitten by the man. The actual biting could
be implemented in either the Man class or the Dog class, or even
somewhere else, in the case of multimethods.
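Python's reflected operator methods give one existing answer to the "which object controls the action" question, and can serve as an analogy (this is not the Perl 6 mechanism, which was still undecided). The @ operator stands in for bites, and the classes and names are illustrative. Here Man declines, so the Dog class ends up implementing the bite.

```python
class Man:
    def __init__(self, name: str) -> None:
        self.name = name

    def __matmul__(self, other: object) -> object:
        # The left operand gets first refusal; here it declines.
        return NotImplemented


class Dog:
    def __init__(self, name: str) -> None:
        self.name = name

    def __rmatmul__(self, other: Man) -> str:
        # Called with the operands reversed once the left side has declined.
        return f"{other.name} bites {self.name}"


man = Man("Max")
dog = Dog("Rex")
report = man @ dog  # dispatch falls through to Dog.__rmatmul__
```

Either class could have implemented the operation; the dispatch policy just fixes the order in which they are asked.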
Unicode operators
Rather than using longer and longer strings of ASCII characters to represent user-defined operators, it will be much more readable to allow the (judicious) use of Unicode operators.
In the short term, we won't see much of this. As screen resolutions increase over the next 20 years, we'll all become much more comfortable with the richer symbol set. I see no reason (other than fear of obfuscation (and fear of fear of obfuscation)) why Unicode operators should not be allowed.
Note that, unlike APL, we won't be hardware dependent, in the sense that any Perl implementation will always be able to parse Unicode, even if you can't display it very well. (But note that Vim 6.0 just came out with Unicode support.)
Precedence
We will at least unify the precedence levels of the equality and
relational operators. Other unifications are possible. For instance,
the not logical operator could be combined with list operators in
precedence. There's only so much simplification that you can do,
however, since you can't mix right association with left association.
By and large, the precedence table will be what you expect, if you
expect it to remain largely the same.
And that still goes for Perl 6 in general. We talk a lot here about what we're changing, but there's a lot more that we're not changing. Perl 5 does a lot of things right, and we're not terribly interested in "fixing" that.

