Apocalypse 3
by Larry Wall

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 03 for the latest information.

Non-RFC considerations

The RFCs propose various specific features, but don't give a systematic view of the operators as a whole. In this section I'll try to give a more cohesive picture of where I see things going.

Binary . (dot)

This is now the method call operator, in line with industry-wide practice. It also has ramifications for how we declare object attribute variables. I'm anticipating that, within a class module, saying

    my int $.counter;

would declare both a $.counter instance variable and a counter accessor method for use within the class. (If marked as public, it would also declare a counter accessor method for use outside the class.)

Unary . (dot)

It's possible that a unary . would call a method on the current object within a class. That is, it would be the same as a binary . with $self (or equivalent) on the left:

    method foowrapper ($a, $b) {
        .reallyfoo($a, $b, $c)
    }

On the other hand, it might be considered better style to be explicit:

    method foowrapper ($self: $a, $b) {
        $self.reallyfoo($a, $b, $c)
    }

(Don't take that declaration syntax as final just yet, however.)

Binary _

Since . is taken for method calls, we need a new way to concatenate strings. We'll use a solitary underscore for that. So, instead of:

    $a . $b . $c

you'll say:

    $a _ $b _ $c

The only downside is that the space between a variable name and the operator is required. This is to be construed as a feature.

Unary _

Since the _ token indicating stat buffer is going away, a unary underscore operator will force stringification, just as interpolation does, only without the quotes.

Unary +

Similarly, a unary + will force numification in Perl 6, unlike in Perl 5. If that fails, NaN (not a number) is returned.
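
The "numify or return NaN" behavior can be modeled outside Perl. The following Python sketch is only an illustration: the name numify is hypothetical, and it leans on Python's own float() parsing rules, which are an assumption and not Perl's numification rules.

```python
import math

def numify(value):
    """Hypothetical stand-in for unary +: coerce to a number, or NaN on failure."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Conversion failed, so hand back NaN instead of raising.
        return math.nan

print(numify("42"))                 # 42.0
print(numify("3.5e2"))              # 350.0
print(math.isnan(numify("camel")))  # True
```

The point of the sketch is only that failure is reported in-band as NaN rather than as an exception.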

Binary :=

We need to distinguish two different forms of assignment. The standard assignment operator, =, works just as it does in Perl 5, as much as possible. That is, it tries to make it look like a value assignment. This is our cultural heritage.

But we also need an operator that works like assignment but is more definitional. If you're familiar with Prolog, you can think of it as a sort of unification operator (though without the implicit backtracking semantics). In human terms, it treats the left side as a set of formal arguments exactly as if they were in the declaration of a function, and binds a set of arguments on the right hand side as though they were being passed to a function. This is what the new := operator does. More below.

Unary *

Unary * is the list flattening operator. (See Ruby for prior art.) When used on an rvalue, it turns off function signature matching for the rest of the arguments, so that, for instance:

    @args = (\@foo, @bar);
    push *@args;

would be equivalent to:

    push @foo, @bar;

In this respect, it serves as a replacement for the prototype-disabling &foo(@bar) syntax of Perl 5. That would be translated to:

    foo(*@bar)
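
Python's call-site * is a close relative of this rvalue use, so the push example can be approximated there. This is a rough analogy, not the Perl 6 design: the push function below is a hypothetical stand-in, and a Python list reference plays the part of \@foo.

```python
def push(target, *values):
    """Append every trailing argument to target, loosely like Perl's push."""
    target.extend(values)
    return len(target)

foo = [1, 2]
bar = [3, 4]

# Roughly @args = (\@foo, @bar): a reference to foo, then bar's elements.
args = [foo, *bar]

# Roughly push *@args, which the text says is equivalent to push @foo, @bar.
push(*args)

print(foo)  # [1, 2, 3, 4]
```

Here the flattened argument list is matched against the callee's parameters only after it has been spread out, which is the effect the text attributes to unary * on an rvalue.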

In an lvalue, the unary * indicates that subsequent array names slurp all the rest of the values. So this would swap two arrays:

    (@a, @b) := (@b, @a);

whereas this would assign all the array elements of @c and @d to @a:

    (*@a, @b) := (@c, @d);

An ordinary flattening list assignment:

    @a = (@b, @c);

is equivalent to:

    *@a := (@b, @c);

That's not the same as:

    @a := *(@b, @c);

which would take the first element of @b as the new definition of @a, and throw away the rest, exactly as if you passed too many arguments to a function. It could optionally be made to blow up at run time. (It can't be made to blow up at compile time, since we don't know how many elements are in @b and @c combined. There could be exactly one element, which is what the left side wants.)

List context

The whole notion of list context is somewhat modified in Perl 6. Since lists can be lazy, the interpretation of list flattening is also by necessity lazy. This means that, in the absence of the * list flattening operator (or an equivalent old-fashioned list assignment), lists in Perl 6 are object lists. That is to say, they are parsed as if they were a list of objects in scalar context. When you see a function call like:

    foo @a, @b, @c;

you should generally assume that three discrete arrays are being passed to the function, unless you happen to know that the signature of foo includes a list flattening *. (If a subroutine doesn't have a signature, it is assumed to have a signature of (*@_) for old times' sake.) Note that this is really nothing new to Perl, which has always made this distinction for builtins, and extended it to user-defined functions in Perl 5 via prototypes like \@ and \%. We're just changing the syntax in Perl 6 so that the unmarked form of formal argument expects a scalar value, and you optionally declare the final formal argument to expect a list. It's a matter of Huffman coding again, not to mention saving wear and tear on the backslash key.

Binary :

As I pointed out in an earlier apocalypse, the first rule of computer language design is that everybody wants the colon. I think that means that we should do our best to give the colon to as many features as possible.

Hence, this operator modifies a preceding operator adverbially. That is, it can turn any operator into a trinary operator (provided a suitable definition is declared). It can be used to supply a "step" to a range operator, for instance. It can also be used as a kind of super-comma separating an indirect object from the subsequent argument list:

    print $handle[2]: @args;

Of course, this conflicts with the old definition of the ?: operator. See below.

In a method type signature, this operator indicates that a previous argument (or arguments) is to be considered the "self" of a method call. (Putting it after multiple arguments could indicate a desire for multimethod dispatch!)

Trinary ??::

The old ?: operator is now spelled ??::. That is to say, since it's really a kind of short-circuit operator, we just double both characters like the && and || operators. This makes it easy to remember for C programmers. Just change:

    $a ? $b : $c

to

    $a ?? $b :: $c

The basic problem is that the old ?: operator spends two very useful single characters on an operator that is not used often enough to justify them. It's bad Huffman coding, in other words. Every proposed use of colon in the RFCs conflicted with the ?: operator. I think that says something.

I can't list here all the possible spellings of ?: that I considered. I just think ??:: is the most visually appealing and mnemonic of the lot of them.

Binary //

A binary // operator is the defaulting operator. That is:

    $a // $b

is short for:

    defined($a) ?? $a :: $b

except that the left side is evaluated only once. It will work on arrays and hashes as well as scalars. It also has a corresponding assignment operator, which only does the assignment if the left side is undefined:

    $pi //= 3;
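
The two properties that matter here, testing definedness instead of truth and evaluating the left side only once, can be shown with a small model. This Python sketch is an illustration under stated assumptions: defined_or is a hypothetical name, None stands in for undef, and both operands are passed as zero-argument callables so that evaluation can be counted.

```python
def defined_or(left, right):
    """Model of `left // right`; both operands are zero-argument callables."""
    value = left()  # the left side is evaluated exactly once
    return value if value is not None else right()

calls = []

def noisy_zero():
    calls.append("left")
    return 0  # defined, but false

result = defined_or(noisy_zero, lambda: 99)
print(result, calls)                        # 0 ['left']
print(defined_or(lambda: None, lambda: 3))  # 3
```

A false but defined value such as 0 is kept, which is where this differs from ||.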

Binary ;

The binary ; operator separates two expressions in a list, much like the expressions within a C-style for loop. Obviously the expressions need to be in some kind of bracketing structure to avoid ambiguity with the end of the statement. Depending on the context, these expressions may be interpreted as arguments to a for loop, or slices of a multi-dimensional array, or whatever. In the absence of other context, the default is simply to make a list of lists. That is,

    [1,2,3;4,5,6]

is a shorthand for:

    [[1,2,3],[4,5,6]]

But usually there will be other context, such as a multidimensional array that wants to be sliced, or a syntactic construct that wants to emulate some kind of control structure. A construct emulating a 3-argument for loop might force all the expressions to be closures, for instance, so that they can be evaluated each time through the loop. User-defined syntax will be discussed in Apocalypse 18, if not sooner.
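
The default list-of-lists interpretation is simple enough to model. In this Python sketch the SEMI sentinel and the list_of_lists function are both hypothetical, introduced only to stand in for the binary ; separator. It covers the context-free default and nothing else.

```python
SEMI = object()  # hypothetical sentinel standing in for the binary ; separator

def list_of_lists(*items):
    """Group items into sublists, starting a new sublist at each SEMI."""
    rows = [[]]
    for item in items:
        if item is SEMI:
            rows.append([])
        else:
            rows[-1].append(item)
    return rows

# Models [1,2,3;4,5,6]
print(list_of_lists(1, 2, 3, SEMI, 4, 5, 6))  # [[1, 2, 3], [4, 5, 6]]
```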

Unary ^

Unary ^ is now reserved for hyper operators. Note that it works on assignment operators as well:

    @a ^+= 1;    # increment all elements of @a
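
As a minimal sketch of what the hyper assignment does, here is the same effect in Python. The name hyper_assign is hypothetical, and the sketch covers only the array-with-scalar case shown above. Nothing is assumed about how two arrays would be paired up.

```python
import operator

def hyper_assign(op, array, scalar):
    """Apply op(element, scalar) to every element, updating array in place."""
    array[:] = [op(x, scalar) for x in array]
    return array

a = [1, 2, 3]
hyper_assign(operator.add, a, 1)  # roughly @a ^+= 1
print(a)  # [2, 3, 4]
```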

Unary ?

Reserved for future use.

Binary ?

Reserved for future use.

Binary ~

This is now the bitwise XOR operator. Recall that unary ~ (1's complement) is simply an XOR with a value containing all 1 bits.

Binary ~~

This is a logical XOR operator. It's a high precedence version of the low precedence xor operator.
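
The difference between the bitwise and logical forms, and the complement identity mentioned under binary ~, can be checked with a short Python sketch. Python spells bitwise XOR as ^, so the sketch models the semantics and not the Perl 6 spelling. The function names and the 8-bit width are assumptions made for the illustration, and Python's truthiness rules are not identical to Perl's.

```python
def logical_xor(a, b):
    """True when exactly one operand is truthy (Python truthiness, not Perl's)."""
    return bool(a) != bool(b)

MASK = 0xFF  # assumed 8-bit width, chosen only to keep the numbers small

def complement8(x):
    """1's complement of an 8-bit value, written as XOR with all 1 bits."""
    return x ^ MASK

print(5 ^ 3)              # bitwise XOR: 6
print(logical_xor(5, 3))  # both truthy: False
print(logical_xor(5, 0))  # exactly one truthy: True
print(complement8(0b00001111) == 0b11110000)  # True
```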

User defined operators

The declaration syntax of user-defined operators is still up for grabs, but we can say a few things about it. First, we can differentiate unary from binary declarations simply by the number of arguments. (Declaration of a return type may also be useful for disambiguating subsequent parsing. One place it won't be needed is for operators wanting to know whether they should behave as hyperoperators. The pressure to do that is relieved by the explicit ^ hypermarker.)

We also need to think about how these operator definitions relate to overloading. We can treat an operator as a method on the first object, but sometimes it's the second object that should control the action. (Or with multimethod dispatch, both objects.) These will have to be thrashed out under ordinary method dispatch policy. The important thing is to realize that an operator is just a funny looking method call. When you say:

    $man bites $dog

The infrastructure will need to untangle whether the man is biting the dog or the dog is getting bitten by the man. The actual biting could be implemented in either the Man class or the Dog class, or even somewhere else, in the case of multimethods.
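
For comparison, Python already resolves this kind of question for its built-in operators by asking the left operand first and then falling back to a reflected method on the right operand. The sketch below shows that existing Python mechanism, not the Perl 6 design, which is still undecided above. The >> operator stands in for the hypothetical bites operator, and the Man, Dog, and Cat classes are invented for the illustration.

```python
class Man:
    def __rshift__(self, other):
        # The left operand gets the first chance to handle the operator.
        if isinstance(other, Dog):
            return "man bites dog (handled by Man)"
        return NotImplemented  # decline, so Python asks the right operand

class Dog:
    pass

class Cat:
    def __rrshift__(self, other):
        # Reflected method: called when the left operand declines.
        return "cat is bitten (handled by Cat)"

print(Man() >> Dog())  # man bites dog (handled by Man)
print(Man() >> Cat())  # cat is bitten (handled by Cat)
```

Either class can own the implementation, which is the "funny looking method call" view of an operator. True multimethod dispatch on both operands at once would need more than this.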

Unicode operators

Rather than using longer and longer strings of ASCII characters to represent user-defined operators, it will be much more readable to allow the (judicious) use of Unicode operators.

In the short term, we won't see much of this. As screen resolutions increase over the next 20 years, we'll all become much more comfortable with the richer symbol set. I see no reason (other than fear of obfuscation (and fear of fear of obfuscation)) why Unicode operators should not be allowed.

Note that, unlike APL, we won't be hardware dependent, in the sense that any Perl implementation will always be able to parse Unicode, even if you can't display it very well. (But note that Vim 6.0 just came out with Unicode support.)

Precedence

We will at least unify the precedence levels of the equality and relational operators. Other unifications are possible. For instance, the not logical operator could be combined with list operators in precedence. There's only so much simplification that you can do, however, since you can't mix right association with left association. By and large, the precedence table will be what you expect, if you expect it to remain largely the same.

And that still goes for Perl 6 in general. We talk a lot here about what we're changing, but there's a lot more that we're not changing. Perl 5 does a lot of things right, and we're not terribly interested in "fixing" that.