Apocalypse 6
by Larry Wall

Editor's Note: This Apocalypse is out of date and remains here for historical reasons. See Synopsis 06 for the latest information.

The sub form

A subroutine can be declared as lexically scoped, package scoped, or unscoped:


    rule lexicalsub :w {
        <lexscope> <type>?
        <subintro> <subname> <psignature>?
        <trait>*
        <block>
    }

    rule packagesub :w {
        <subintro> <subname> <psignature>?
        <trait>*
        <block>
    }

    rule anonsub :w {
        <subintro> <psignature>?
        <trait>*
        <block>
    }

The non-lexically scoped declaration cannot specify a return type in front. The return type can only be specified as a trait in that case.

As in Perl 5, the difference between a package sub and an anonymous sub depends on whether you specify the <subname>. If omitted, the declaration (which is not really a declaration in that case) generates and returns a closure. (Which may not really be a closure if it doesn't access any external lexicals, but we call them all closures anyway just in case...)
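
To make the three rules concrete, here is a minimal sketch with one declaration per form. It is written so that it should also run under present-day Raku, whose details may differ from the design described in this Apocalypse. The names double, triple, and $quad are hypothetical.

```raku
# lexicalsub: the <lexscope> allows the return type to appear in front.
my Int sub double (Int $x) { return $x * 2 }

# packagesub shape: no <lexscope>, so the return type is given as a trait.
# (Assumption to note: present-day Raku scopes a bare sub lexically by default.)
sub triple (Int $x) returns Int { return $x * 3 }

# anonsub: no <subname>, so the expression yields a closure to store or pass.
my $quad = sub (Int $x) { return $x * 4 };

say double(2), " ", triple(2), " ", $quad(2);
```

Only the first form can carry the Int in front; the second has to say returns Int after the signature.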

A lexical subroutine is declared using either my or our:


    rule lexscope { my | our }

This list doesn't include temp or let because those are not declarators of lexical scope but rather operators that initiate dynamic scoping. See the section below on Lvalue subroutines for more about temp and let.

In both lexical and package declarations, the name of the subroutine is introduced by the keyword sub, or one of its variants:


    rule subintro { sub | method | submethod | multi | rule | macro }

A method participates in inheritance and always has an invocant (object or class). A submethod has an invocant but does not participate in inheritance. It's a sub pretending to be a method for the current class only. A multi is a multimethod, that is, a method that is called like a subroutine or operator, but is dispatched based on the types of one or more of its arguments.
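
As a rough illustration of those three variants, the following sketch uses present-day Raku syntax, which may not match the syntax proposed here in every detail. The classes Animal and Dog and the routines speak, registry_tag, and describe are hypothetical.

```raku
class Animal {
    # method: has an invocant and is inherited by subclasses
    method speak { return "generic noise" }

    # submethod: has an invocant, but belongs to Animal only
    submethod registry_tag { return "animal-registry" }
}

class Dog is Animal { }

# multi: called like a sub, selected by the types of its arguments
multi describe (Int $n) { return "an integer: $n" }
multi describe (Str $s) { return "a string: $s" }

say Dog.speak;             # found through inheritance
say Animal.registry_tag;   # called on the declaring class itself
say describe(42);
say describe("forty-two");
```

Because registry_tag is a submethod, it is meant to be reachable through Animal but not through Dog.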

Another variant is the regex rule, which is really a special kind of method; but in actuality rules probably get their own set of parse rules, since the body of a rule is a regex. I just put "rule" into <subintro> as a placeholder of sorts, because I'm lazy.

A macro is a subroutine that is called immediately upon completion of parsing. It has a default means of parsing arguments, or it may be bound to an alternate grammar rule to parse its arguments however you like.

These syntactic forms correspond to the various Routine types in the Code type hierarchy:


                                   Code
                        ____________|________________
                       |                             |
                    Routine                        Block
       ________________|_______________            __|___
      |     |       |       |    |     |          |      |
     Sub Method Submethod Multi Rule Macro      Bare Parametric

The Routine/Block distinction is fairly important, since you always return out of the current Routine, that is, the current Sub, Method, Submethod, Multi, Rule, or Macro. Also, the &_ variable refers to your current Routine. A Block, whether Bare or Parametric, is invisible to both of those notions.
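
A small sketch of what "returning out of the current Routine" means in practice, assuming present-day Raku behavior; the name first_negative is hypothetical.

```raku
sub first_negative (@numbers) {
    for @numbers -> $n {
        # The loop body is a Block, so this `return` exits first_negative,
        # the enclosing Routine, not merely the block.
        return $n if $n < 0;
    }
    return Nil;    # no negative number was found
}

say first_negative([3, -1, -5]);
```

If the loop body counted as a Routine, the return would only end one iteration and the sub would fall through to the final return.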

(It's not yet clear whether the Bare vs Parametric distinction is useful. Some apparently Bare blocks are actually Parametric if they refer to $_ internally, even implicitly. And a Bare block is just a Parametric block with a signature of (). More later.)

A <psignature> is a parenthesized signature:


    rule psignature :w { \( <signature> \) }

And there is a variant that doesn't declare names:


    rule psiglet :w { \( <siglet> \) }

(We'll discuss "siglets" later in their own section.)

It's possible to declare a subroutine in an lvalue or a signature as if it were an ordinary variable, in anticipation of binding the symbol to an actual subroutine later. Note this only works with an explicit name, since the whole point of declaring it in the first place is to have a name for it. On the other hand, the formal subroutine's parameters aren't named, hence they are specified by a <psiglet> rather than a <psignature>:


    rule scopedsubvar :w {
        <lexscope> <type>? &<subname> <psiglet>? <trait>*
    }

    rule unscopedsubvar :w {
        &<subname> <psiglet>? <trait>*
    }

If no <psiglet> is supplied for such a declaration, it just uses whatever the signature of the bound routine is. So instead of:


    my sub foo (*@_) { print @_ }

you could equivalently say:


    my &foo ::= sub (*@_) { print @_ };

(You may recall that ::= does binding at compile time. Then again, you may not.)

If there is a <psiglet>, however, it must be compatible with the signature of the routine that is bound to it:


    my &moo(Cow) ::= sub (Horse $x) { $x.neigh };     # ERROR

"Pointy subs"

"Pointy subs" declare a closure with an unparenthesized signature:


    rule pointysub :w {
        -\> <signature> <block>
    }

They may not take traits.
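
For illustration, here is how a pointy sub might be used, sketched in present-day Raku under the assumption that the form survived largely as described. The variable names are hypothetical.

```raku
# A pointy sub stored in a variable; the signature has no parentheses.
my $add = -> $a, $b { $a + $b };
say $add(2, 3);

# The more common use: giving a loop body a named parameter.
my @squares;
for 1, 2, 3 -> $n {
    push @squares, $n * $n;
}
say @squares;
```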

Bare subs

A bare block generates a closure:


    rule baresub :w {
        <block> { .find_placeholders() }
    }

A bare block declaration does not take traits (externally, anyway), and if there are any parameters, they must be specified with placeholder variables. If no placeholders are used, $_ may be treated as a placeholder variable, provided the surrounding control structure passes an argument to the closure. Otherwise, $_ is bound as an ordinary lexical variable to the outer $_. ($_ is also an ordinary lexical variable when explicit placeholders are used.)
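
The two cases can be sketched as follows, using present-day Raku as a stand-in for the design described here. The names $multiply and @doubled are hypothetical.

```raku
# Placeholder variables ($^a, $^b) give the bare block two parameters,
# bound in alphabetical order of the placeholder names.
my $multiply = { $^a * $^b };
say $multiply(3, 4);

# No placeholders: map passes each element in, so $_ acts as the parameter.
my @doubled = map { $_ * 2 }, 1, 2, 3;
say @doubled;
```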

More on parameters below. But before we talk about parameters, we need to talk about types.

Digression on types

Well, what are types, anyway? Though known as a "typeless" language, Perl actually supports several built-in container types such as scalar, array, and hash, as well as user-defined, dynamically typed objects via bless.

Perl 6 will certainly support more types. These include some low-level storage types:


    bit int str num ref bool

as well as some high-level object types:


    Bit Int Str Num Ref Bool
    Array Hash Code IO
    Routine Sub Method Submethod Macro Rule
    Block Bare Parametric
    Package Module Class Object Grammar
    List Lazy Eager

(These lists should not be construed as exhaustive.) We'll also need some way of at least hinting at representations to the compiler, so we may also end up with types like these:


    int8 int16 int32 int64
    uint8 uint16 uint32 uint64

Or maybe those are just extra size traits on a declaration somewhere. That's not important at this point.
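
As a hedged sketch of how the lowercase and uppercase names differ in use, here are a few declarations in present-day Raku, which kept int, Int, and the sized integer names but not necessarily every type listed above. The variable names are hypothetical.

```raku
my int  $count = 42;         # low-level storage type
my int8 $tiny  = 100;        # sized variant; must fit in 8 signed bits
my Int  $big   = 2 ** 100;   # high-level object type

say $count + $tiny;
say $big;
```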

The important thing is that we're adding a generalized type system to Perl. Let us begin by admitting that it is the height of madness to add a type system to a language that is well-loved for being typeless.

But mad or not, there are some good reasons to do just that. First, it makes it possible to write interfaces to other languages in Perl. Second, it gives the optimizer more information to think about. Third, it allows the S&M folks to inflict strongly typed compile-time semantics on each other. (Which is fine, as long as they don't inflict those semantics on the rest of us.) Fourth, a type system can be viewed as a pattern matching system for multi-method dispatch.

Which basically boils down to the notion that it's fine for Perl to have a type system as long as it's optional. It's just another area where Perl 6 will try to have its cake and eat it too.

This should not actually come as a surprise to anyone who has been following the development of Perl 5, since the grammatical slot for declaring a variable's effective type has been defined for some time now. In Perl 5 you can say:


    my Cat $felix;

to declare a variable intended to hold a Cat object. That's nice, as far as it goes. Perl 6 will support the same syntax, but we'll have to push it much further than that if we're to have a type system that is good enough to specify interfaces to languages like C++ or Java. In particular, we have to be able to specify the types of composite objects such as arrays and hashes without resorting to class definitions, which are rather heavyweight--not to mention opaque. We need to be able to specify the types of individual function and method parameters and return values. Taken collectively, these parameter types can form the signature of a subroutine, which is one of the traits of the subroutine.

And of course, all this has to be intuitively obvious to the naive user.

Yeah, sure, you say.

Well, let's see how far we can get with it. If the type system is too klunky for some particular use, people will simply avoid using it. Which is fine--that's why it's optional.

First, let's clarify one thing that seems to confuse people frequently. Unlike some languages, Perl makes a distinction between the type of the variable, and the type of the value. In Perl 5, this shows up as the difference between overloading and tying. You overload the value, but you tie the variable. When you say:


    my Cat $felix;

you are specifying the type of the value being stored, not the type of the variable doing the storing. That is, $felix must contain a reference to a Cat value, or something that "isa" Cat. The variable type in this case is just a simple scalar, though that can be changed by tying the variable to some class implementing the scalar variable operations.
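
The distinction can be observed directly in present-day Raku, which is used here only as an approximation of the design under discussion. The classes Lion and Dog are hypothetical additions for the example.

```raku
class Cat  { }
class Lion is Cat { }
class Dog  { }

my Cat $felix;          # constrains the values $felix may hold
$felix = Cat.new;
$felix = Lion.new;      # accepted: a Lion "isa" Cat

# $felix = Dog.new;     # would be rejected: a Dog is not a Cat

say $felix.^name;       # type of the value currently stored
say $felix.VAR.^name;   # type of the container, that is, the variable itself
```

The declared type Cat governs what may be stored, while the container remains a plain scalar either way.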
