Apocalypse 6
by Larry Wall

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.

Scope of subroutine names

Perl 5 gives subroutine names two scopes. Perl 6 gives them four.

Package scoped subs

All named subs in Perl 5 have package scope. (The body provides a lexical scope, but we're not talking about that. We're talking about where the name of the subroutine is visible from.) Perl 6 provides by default a package-scoped name for "unscoped" declarations such as these:


          sub fee {...}
       method fie {...}
    submethod foe {...}
        multi foo {...}
        macro sic {...}

Methods and submethods are ordinarily package scoped, because (just as in Perl 5) a class's namespace is kept in a package.

Anonymous subs

It's sort of cheating to call this a subroutine scope, because it's really more of a non-scope. Scope is a property of the name of a subroutine. Since closures and anonymous subs have no name, they naturally have no intrinsic scope of their own. Instead, they rely on the scope of whatever variable contains a reference to them. The only way to get a lexically scoped subroutine name in Perl 5 was by indirection:


    my $subref = sub { dostuff(@_) };
    &$subref(...);
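A small runnable Perl 5 sketch of that indirection may help. The body of dostuff and the surrounding block are assumptions made up for illustration; only the shape of the $subref assignment comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper; its body is invented for this example.
sub dostuff { return join '-', @_ }

my @results;
{
    # $subref is a lexical variable, so it is visible only inside this block.
    my $subref = sub { dostuff(@_) };
    push @results, $subref->('a', 'b');   # equivalent to &$subref('a', 'b')
}

# Outside the block $subref is gone, but dostuff() is still reachable
# through its package-scoped name.
push @results, dostuff('c', 'd');

print "$_\n" for @results;
```

The variable is lexically scoped, yet the call has to go through a reference rather than through a name.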

But that doesn't actually give you a lexically scoped name that is equivalent to an ordinary subroutine's name. Hence, Perl 6 also provides...

Lexically scoped subs

You can declare "scoped" subroutines by explicitly putting a my or our on the front of the declaration:


    my sub privatestuff { ... }
    our sub semiprivatestuff { ... }

Both of these introduce a name into the current lexical scope, though in the case of our this is just an alias for a package subroutine of the same name. (As with other uses of our, you might want to introduce a lexical alias if your strictness level prohibits unqualified access to package subroutines.)

You can also declare lexically scoped macros:


    my macro sic { ... }

Global scoped subs

Perl 6 also introduces the notion of completely global variables that are visible from everywhere they aren't overridden by the current package or lexical scope. Such variables are named with a leading * on the identifier, indicating that the package prefix is a wildcard, if you will. Since subroutines are just a funny kind of variable, you can also have global subs:


    sub *print (*@list) { $*DEFOUT.print(@list) }

In fact, that's more-or-less how some built-in functions like print could be implemented in Perl 6. (Methods like $*DEFOUT.print() are a different story, of course. They're defined off in a class somewhere. (Unless they're multimethods, in which case they could be defined almost anywhere, because multimethods are always globally scoped. (In fact, most built-ins including print will be multimethods, not subs. (But we're getting ahead of ourselves...))))

Signatures

One of Perl's strong points has always been the blending of positional parameters with variadic parameters.

"Variadic" parameters are the ones that vary. They're the "...And The Rest" list of values that many functions--like print, map, and chomp--have at the end of their call. Whereas positional parameters generally tell a function how to do its job, variadic parameters are most often used to pass the arbitrary sequences of data the function is supposed to do its job on/with/to.

In Perl 5, when you unpack the arguments to a sub like so:


    my ($a, $b, $c, @rest) = @_;

you are defining three positional parameters, followed by a variadic list. And if you give the sub a prototype of ($$$@) it will force the first three parameters to be evaluated in scalar context, while the remaining arguments are evaluated in list context.
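To see the context effect of such a prototype in Perl 5, here is a minimal sketch. The sub name show_args and the variable names are hypothetical, chosen to avoid the special sort variables $a and $b.

```perl
use strict;
use warnings;

# The prototype has to be seen before the call is compiled to take effect.
sub show_args ($$$@) {
    my ($first, $second, $third, @rest) = @_;
    return "first=$first second=$second third=$third rest=@rest";
}

my @array = (10, 20, 30);

# In the first slot @array is evaluated in scalar context (its length);
# in the trailing slot it is evaluated in list context (its elements).
my $line = show_args(@array, 'x', 'y', @array);
print "$line\n";   # first=3 second=x third=y rest=10 20 30
```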

The big problem with the Perl 5 solution is that the parameter binding is done at run time, which has run-time costs. It also means the metadata is not readily available outside the function body. We could just as easily have written it in some other form like:


    my $a = shift;
    my $b = shift;
    my $c = shift;

and left the rest of the arguments in @_. Not only is this difficult for a compiler to analyze, but it's impossible to get the metadata from a stub declaration; you have to have the body defined already.
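The point about stub declarations can be checked in Perl 5 with the prototype built-in. In this sketch, which uses two hypothetical stub names, the only metadata a stub can carry is its prototype string. Parameter names and counts live in the body, so a stub without a prototype exposes nothing at all.

```perl
use strict;
use warnings;

sub with_proto ($$$@);   # stub declaration that carries a prototype
sub without_proto;       # stub declaration with no prototype

my $known   = prototype(\&with_proto);      # the string '$$$@'
my $unknown = prototype(\&without_proto);   # undef: nothing to report

print "with_proto: $known\n";
print "without_proto: ", (defined $unknown ? $unknown : 'undef'), "\n";
```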

The old approach is very flexible, but the cost to the user is rather high.

Perl 6 still allows you to access the arguments via @_ if you like, but in general you'll want to hoist the metadata up into the declaration. Perl 6 still fully supports the distinction between positional and variadic data--you just have to declare them differently. In general, variadic items must follow positional items both in declaration and in invocation.

In turn, there are at least three kinds of positional parameters, and three kinds of variadic parameters. A declaration for all six kinds of parameter won't win a beauty contest, but might look like this:


    method x ($me: $req, ?$opt, +$namedopt, *%named, *@list) {...}

Of course, you'd rarely write all of those in one declaration. Most declarations only use one or two of them. Or three or four... Or five or six...

There is some flexibility in how you pass some of these parameters, but the ordering of both formal parameters and actual arguments is constrained in several ways. For instance, positional parameters must precede non-positional, and required parameters must precede optional. Variadic lists must be attached either to the end of the positional list or the end of the named parameter list. These constraints serve a number of purposes:

  • They avoid user confusion.
  • They enable the system to implement calls efficiently.
  • Perhaps most importantly, they allow interfaces to evolve without breaking old code.

Since there are constraints on the ordering of parameters, similar parameters tend to clump together into "zones". So we'll call the ?, +, and * symbols you see above "zone markers". The underlying metaphor really is very much like zoning regulations--you know, the ones where your city tells you what you may or may not do on a chunk of land you think you own. Each zone has a set of possible uses, and similar zones often have overlapping uses. But you're still in trouble if you put a factory in the middle of a housing division, just as you're in trouble if you pass a positional argument to a formal parameter that has no position.

I was originally going to go with a semicolon to separate required from optional parameters (as Perl 5 uses in its prototypes), but I realized that it would get lost in the traffic, visually speaking. It's better to have the zone markers line up, especially if you decide to repeat them in the vertical style:


    method action ($self:
                int  $x,
                int ?$y,
                int ?$z,
             Adverb +$how,
        Beneficiary +$for,
           Location +$at is copy,
           Location +$toward is copy,
           Location +$from is copy,
             Reason +$why,
                    *%named,
                    *@list
                ) {...}

So optional parameters are all marked with zone markers.

In this section we'll be concentrating on the declaration's syntax rather than the call's syntax, though the two cannot be completely disintertwingled. The declaration syntax is actually the more complicated of the two for various good reasons, so don't get too discouraged just yet.

Positional parameters

The three positional parameter types are the invocant, the required parameters, and the optional positional parameters. (Note that in general, positional parameters may also be called using named parameter notation, but they must be declared as positional parameters if you wish to have the option of calling them as positional parameters.) All positional parameters regardless of their type are considered scalars, and imply scalar context for the actual arguments. If you pass an array or hash to such a parameter, it will actually pass a reference to the array or hash, just as if you'd backslashed the actual argument.

The invocant

The first argument to any method (or submethod) is its invocant, that is, the object or class upon which the method is acting. The invocant parameter, if present, is always declared with a colon following it. The invocant is optional in the sense that, if there's no colon, there's no explicit invocant declared. It's still there, and it must be passed by the caller, but it has no name, and merely sets the outer topic of the method. That is, the invocant's name is $_, at least until something overrides the current topic. (You can always get at the invocant with the self built-in, however. If you don't like "self", you can change it with a macro. See below.)

Ordinary subs never have an invocant. If you want to declare a non-method subroutine that behaves as a method, you should declare a submethod instead.

Multimethods can have multiple invocants. A colon terminates the list of invocants, so if there is no colon, all parameters are considered invocants. Only invocants participate in multimethod dispatch. Only the first invocant is bound to $_.

Macros are considered methods on the current parse state object, so they have an invocant.

Required parameters

Next (or first in the case of subs) come the required positional parameters. If, for instance, the routine declares three of these, you have to pass at least three arguments in the same order. The list of required parameters is terminated at the first optional parameter, that is the first parameter having any kind of zone marker. If none of those are found, all the parameters are required, and if you pass either too many or too few arguments, Perl will throw an exception as soon as it notices. (That might be at either compile time or run time.) If there are optional or variadic parameters, the required list merely serves as the minimum number of arguments you're allowed to pass.

Optional parameters

Next come the optional positional parameters. (They have to come next because they're positional.) In the declaration, optional positional parameters are distinguished from required parameters by marking the optional parameters with a question mark. (The parameters are not distinguished in the call--you just use commas. We'll discuss call syntax later.) All optional positional parameters are marked with ?, not just the first one. Once you've made the transition to the optional parameter zone, all parameters are considered optional from there to the end of the signature, even after you switch zones to + or *. But once you leave the positional zone (at the end of the ? zone), you can't switch back to the positional zone, because positionals may not follow variadics.

If there are no variadic parameters following the optional parameters, the declaration establishes both a minimum and a maximum number of allowed arguments. And again, Perl will complain when it notices you violating either constraint. So the declaration:


    sub *substr ($string, ?$offset, ?$length, ?$repl) {...}

says that substr can be called with anywhere from 1 to 4 scalar parameters.
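For comparison, here is roughly what that minimum/maximum check looks like when written by hand in Perl 5. This is only a sketch: my_substr is a hypothetical stand-in rather than the built-in, and its body is elided apart from the count check.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a routine taking 1 required and 3 optional arguments.
sub my_substr {
    my $count = @_;
    die "my_substr: expected 1 to 4 arguments, got $count\n"
        if $count < 1 || $count > 4;
    my ($string, $offset, $length, $repl) = @_;
    return $count;   # real work elided; return the count for demonstration
}

my @outcomes;
for my $args ([], ['abc'], ['abc', 1, 1, 'x'], ['abc', 1, 1, 'x', 'extra']) {
    my $ok = eval { my_substr(@$args); 1 };
    push @outcomes, scalar(@$args) . ($ok ? ' ok' : ' rejected');
}
print "$_\n" for @outcomes;
```

With the signature in the declaration, that check no longer has to be written in the body, and it can sometimes be made at compile time.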

Variadic parameters

Following the positional parameters, three kinds of variadic parameters may be declared. Variadic arguments may be slurped into a hash or an array depending on whether they look like named arguments or not. "Slurpy" parameters are denoted by a unary * before the variable name, which indicates that an arbitrary number of values is expected for that variable.

Additional named parameters may be placed at the end of the declaration, or marked with a unary + (because they're "extra" parameters). Since they are--by definition--in the variadic region, they may only be passed as named arguments, never positionally. It is illegal to mark a parameter with ? after the first + or *, because you can't reenter a positional zone from a variadic zone.

Unlike the positional parameters, the variadic parameters are not necessarily declared in the same order as they will be passed in the call. They may be declared in any order (though the exact behavior of a slurpy array depends slightly on whether you declare it first or last).

Named-only parameters

Parameters marked with a + zone marker are named-only parameters. Such a parameter may never be passed positionally, but only by name.
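Perl 5 has no direct equivalent, but the common convention of passing name => value pairs after the positional arguments gives a rough feel for it. The sketch below is an emulation under stated assumptions, not how Perl 6 implements named-only parameters. The sub name action_sketch, the allowed names, and the 'quietly' default are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical emulation: one positional argument, then name => value pairs.
sub action_sketch {
    my ($x, %named) = @_;

    # Only these names are accepted; anything else is rejected.
    my %allowed = map { $_ => 1 } qw(how why);
    for my $name (sort keys %named) {
        die "unknown named argument '$name'\n" unless $allowed{$name};
    }

    # 'how' can only arrive by name; the default is an assumption.
    my $how = exists $named{how} ? $named{how} : 'quietly';
    return "x=$x how=$how";
}

my $with_name    = action_sketch(1, how => 'loudly');
my $without_name = action_sketch(2);
my $rejected     = !eval { action_sketch(3, speed => 'fast'); 1 };

print "$with_name\n$without_name\n";
```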
