Apocalypse 6
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.
Scope of subroutine names
Perl 5 gives subroutine names two scopes. Perl 6 gives them four.
Package scoped subs
All named subs in Perl 5 have package scope. (The body provides a lexical scope, but we're not talking about that. We're talking about where the name of the subroutine is visible from.) Perl 6 provides by default a package-scoped name for "unscoped" declarations such as these:
sub fee {...}
method fie {...}
submethod foe {...}
multi foo {...}
macro sic {...}
Methods and submethods are ordinarily package scoped, because (just as in Perl 5) a class's namespace is kept in a package.
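To see what package scope means for a name, here is a minimal Perl 5 sketch. The package Greeter and everything in it are made up for illustration. The only point is that the name fee lands in the symbol table of whichever package was current at the declaration, and is reachable from elsewhere by qualifying it.

```perl
use strict;
use warnings;

package Greeter;                 # hypothetical package, for illustration only

sub fee { return "fee from " . __PACKAGE__ }

package main;

# The name was installed in Greeter's symbol table, not main's.
my $in_greeter = defined(&Greeter::fee) ? 1 : 0;
my $in_main    = defined(&main::fee)    ? 1 : 0;

# From outside the package, the sub is reachable by its qualified name.
my $result = Greeter::fee();

print "Greeter::fee defined: $in_greeter\n";
print "main::fee defined:    $in_main\n";
print "$result\n";
```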
Anonymous subs
It's sort of cheating to call this a subroutine scope, because it's really more of a non-scope. Scope is a property of the name of a subroutine. Since closures and anonymous subs have no name, they naturally have no intrinsic scope of their own. Instead, they rely on the scope of whatever variable contains a reference to them. The only way to get a lexically scoped subroutine name in Perl 5 was by indirection:
my $subref = sub { dostuff(@_) };
&$subref(...);
But that doesn't actually give you a lexically scoped name that is equivalent to an ordinary subroutine's name. Hence, Perl 6 also provides...
Lexically scoped subs
You can declare "scoped" subroutines by explicitly putting a my
or our on the front of the declaration:
my sub privatestuff { ... }
our sub semiprivatestuff { ... }
Both of these introduce a name into the current lexical scope, though
in the case of our this is just an alias for a package subroutine
of the same name. (As with other uses of our, you might want
to introduce a lexical alias if your strictness level prohibits
unqualified access to package subroutines.)
You can also declare lexically scoped macros:
my macro sic { ... }
Global scoped subs
Perl 6 also introduces the notion of completely global variables that
are visible from everywhere they aren't overridden by the current
package or lexical scope. Such variables are named with a leading *
on the identifier, indicating that the package prefix is a wildcard,
if you will. Since subroutines are just a funny kind of variable,
you can also have global subs:
sub *print (*@list) { $*DEFOUT.print(@list) }
In fact, that's more-or-less how some built-in functions like print
could be implemented in Perl 6. (Methods like $*DEFOUT.print()
are a different story, of course. They're defined off in a
class somewhere. (Unless they're multimethods, in which case they
could be defined almost anywhere, because multimethods are always
globally scoped. (In fact, most built-ins including print will
be multimethods, not subs. (But we're getting ahead of ourselves...))))
Signatures
One of Perl's strong points has always been the blending of positional parameters with variadic parameters.
"Variadic" parameters are the ones that vary. They're the "...And
The Rest" list of values that many functions--like print, map,
and chomp--have at the end of their call. Whereas positional
parameters generally tell a function how to do its job, variadic
parameters are most often used to pass the arbitrary sequences of
data the function is supposed to do its job on/with/to.
In Perl 5, when you unpack the arguments to a sub like so:
my ($a, $b, $c, @rest) = @_;
you are defining three positional parameters, followed by a variadic
list. And if you give the sub a prototype of ($$$@) it will force the
first three parameters to be evaluated in scalar context, while the
remaining arguments are evaluated in list context.
The big problem with the Perl 5 solution is that the parameter binding is done at run time, which has run-time costs. It also means the metadata is not readily available outside the function body. We could just as easily have written it in some other form like:
my $a = shift;
my $b = shift;
my $c = shift;
and left the rest of the arguments in @_. Not only is this
difficult for a compiler to analyze, but it's impossible to get the
metadata from a stub declaration; you have to have the body defined
already.
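The two Perl 5 styles just described can be put side by side in a small sketch. The sub names three_then_rest and shifty are hypothetical. Notice that the only metadata Perl 5 can hand back is the prototype string, and the sub without a prototype has none at all, even though both take the same arguments.

```perl
use strict;
use warnings;

# Hypothetical sub with a ($$$@) prototype: the first three arguments are
# evaluated in scalar context, the rest in list context.
sub three_then_rest ($$$@) {
    my ($x, $y, $z, @rest) = @_;     # binding still happens at run time
    return "$x|$y|$z|" . join(',', @rest);
}

# Same idea unpacked with shift, and no prototype at all.
sub shifty {
    my $x = shift;
    my $y = shift;
    my $z = shift;
    return "$x|$y|$z|" . join(',', @_);
}

my @pair = (10, 20);

# With the prototype, @pair is forced into scalar context (its count).
my $with_proto = three_then_rest(@pair, 'b', 'c', 1, 2, 3);

# Without one, @pair simply flattens into the argument list.
my $without_proto = shifty(@pair, 'b', 'c', 1, 2, 3);

# The prototype string is all the metadata visible from outside the body.
my $proto_a = prototype(\&three_then_rest);
my $proto_b = prototype(\&shifty);

print "$with_proto\n";
print "$without_proto\n";
print "prototype: ", (defined $proto_a ? $proto_a : 'none'), "\n";
print "prototype: ", (defined $proto_b ? $proto_b : 'none'), "\n";
```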
The old approach is very flexible, but the cost to the user is rather high.
Perl 6 still allows you to access the arguments via @_ if you
like, but in general you'll want to hoist the metadata up into
the declaration. Perl 6 still fully supports the distinction
between positional and variadic data--you just have to declare them
differently. In general, variadic items must follow positional items
both in declaration and in invocation.
In turn, there are at least three kinds of positional parameters, and three kinds of variadic parameters. A declaration for all six kinds of parameter won't win a beauty contest, but might look like this:
method x ($me: $req, ?$opt, +$namedopt, *%named, *@list) {...}
Of course, you'd rarely write all of those in one declaration. Most declarations only use one or two of them. Or three or four... Or five or six...
There is some flexibility in how you pass some of these parameters, but the ordering of both formal parameters and actual arguments is constrained in several ways. For instance, positional parameters must precede non-positional, and required parameters must precede optional. Variadic lists must be attached either to the end of the positional list or the end of the named parameter list. These constraints serve a number of purposes:
- They avoid user confusion.
- They enable the system to implement calls efficiently.
- Perhaps most importantly, they allow interfaces to evolve without breaking old code.
Since there are constraints on the ordering of parameters, similar
parameters tend to clump together into "zones". So we'll call the
?, +, and * symbols you see above "zone markers". The
underlying metaphor really is very much like zoning regulations--you
know, the ones where your city tells you what you may or may not do
on a chunk of land you think you own. Each zone has a set of possible
uses, and similar zones often have overlapping uses. But you're still
in trouble if you put a factory in the middle of a housing division,
just as you're in trouble if you pass a positional argument to a
formal parameter that has no position.
I was originally going to go with a semicolon to separate required from optional parameters (as Perl 5 uses in its prototypes), but I realized that it would get lost in the traffic, visually speaking. It's better to have the zone markers line up, especially if you decide to repeat them in the vertical style:
method action ($self:
               int  $x,
               int ?$y,
               int ?$z,
            Adverb +$how,
       Beneficiary +$for,
          Location +$at is copy,
          Location +$toward is copy,
          Location +$from is copy,
            Reason +$why,
                   *%named,
                   *@list
              ) {...}
So optional parameters are all marked with zone markers.
In this section we'll be concentrating on the declaration's syntax rather than the call's syntax, though the two cannot be completely disintertwingled. The declaration syntax is actually the more complicated of the two for various good reasons, so don't get too discouraged just yet.
Positional parameters
The three positional parameter types are the invocant, the required parameters, and the optional positional parameters. (Note that in general, positional parameters may also be called using named parameter notation, but they must be declared as positional parameters if you wish to have the option of calling them as positional parameters.) All positional parameters regardless of their type are considered scalars, and imply scalar context for the actual arguments. If you pass an array or hash to such a parameter, it will actually pass a reference to the array or hash, just as if you'd backslashed the actual argument.
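In Perl 5 terms, the effect is roughly what you get today by backslashing the argument yourself. The following is only a sketch of that analogy, with a hypothetical sub count_items. It shows that the sub receives a single scalar (the reference) rather than a flattened list, and that it is working on the caller's array.

```perl
use strict;
use warnings;

# Hypothetical sub taking one scalar parameter that holds an array reference.
sub count_items {
    my ($items) = @_;          # a single scalar: a reference to the caller's array
    push @$items, 'extra';     # the change is visible to the caller
    return scalar @$items;
}

my @list = qw(a b c);

# Explicit backslash in Perl 5; the text above says a Perl 6 scalar
# parameter would do this for you.
my $count = count_items(\@list);

print "count: $count\n";
print "list:  @list\n";
```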
The invocant
The first argument to any method (or submethod) is its invocant, that
is, the object or class upon which the method is acting. The invocant
parameter, if present, is always declared with a colon following it.
The invocant is optional in the sense that, if there's no colon,
there's no explicit invocant declared. It's still there, and it must
be passed by the caller, but it has no name, and merely sets the outer
topic of the method. That is, the invocant's name is $_, at least
until something overrides the current topic. (You can always get at
the invocant with the self built-in, however. If you don't like
"self", you can change it with a macro. See below.)
Ordinary subs never have an invocant. If you want to declare a non-method subroutine that behaves as a method, you should declare a submethod instead.
Multimethods can have multiple invocants. A colon terminates the list
of invocants, so if there is no colon, all parameters are considered
invocants. Only invocants participate in multimethod dispatch.
Only the first invocant is bound to $_.
Macros are considered methods on the current parse state object, so they have an invocant.
Required parameters
Next (or first in the case of subs) come the required positional parameters. If, for instance, the routine declares three of these, you have to pass at least three arguments in the same order. The list of required parameters is terminated at the first optional parameter, that is, the first parameter having any kind of zone marker. If none of those are found, all the parameters are required, and if you pass either too many or too few arguments, Perl will throw an exception as soon as it notices. (That might be at either compile time or run time.) If there are optional or variadic parameters, the required list merely serves as the minimum number of arguments you're allowed to pass.
Optional parameters
Next come the optional positional parameters. (They have to come next
because they're positional.) In the declaration, optional positional
parameters are distinguished from required parameters by marking the
optional parameters with a question mark. (The parameters are not
distinguished in the call--you just use commas. We'll discuss call
syntax later.) All optional positional parameters are marked with
?, not just the first one. Once you've made the transition to the
optional parameter zone, all parameters are considered optional from
there to the end of the signature, even after you switch zones to
+ or *. But once you leave the positional zone (at the end
of the ? zone), you can't switch back to the positional zone,
because positionals may not follow variadics.
If there are no variadic parameters following the optional parameters, the declaration establishes both a minimum and a maximum number of allowed arguments. And again, Perl will complain when it notices you violating either constraint. So the declaration:
sub *substr ($string, ?$offset, ?$length, ?$repl) {...}
says that substr can be called with anywhere from 1 to 4 scalar
arguments.
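For comparison, here is roughly what enforcing those limits by hand looks like in Perl 5. This is a sketch under assumptions: my_substr is a hypothetical stand-in, its defaults are my own choice, and unlike the real substr it returns a modified copy when given a replacement. The minimum and maximum are exactly the checks the declaration above would give you for free.

```perl
use strict;
use warnings;

# Hypothetical run-time emulation of the signature
#   ($string, ?$offset, ?$length, ?$repl)
sub my_substr {
    my $n = @_;
    die "my_substr: expected 1 to 4 arguments, got $n\n"
        if $n < 1 || $n > 4;

    my ($string, $offset, $length, $repl) = @_;

    # Assumed defaults for the optional positional parameters.
    $offset = 0 unless defined $offset;
    $length = length($string) - $offset unless defined $length;

    return substr($string, $offset, $length) unless defined $repl;

    # With a replacement, edit the local copy and return it
    # (an assumption; the built-in modifies its argument instead).
    substr($string, $offset, $length, $repl);
    return $string;
}

print my_substr("hello"), "\n";
print my_substr("hello", 1), "\n";
print my_substr("hello", 1, 3), "\n";
print my_substr("hello", 1, 3, "ipp"), "\n";
```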
Variadic parameters
Following the positional parameters, three kinds of variadic parameters
may be declared. Variadic arguments may be slurped into a hash or
an array depending on whether they look like named arguments or not.
"Slurpy" parameters are denoted by a unary * before the variable
name, which indicates that an arbitrary number of values is expected
for that variable.
Additional named parameters may be placed at the end of the
declaration, or marked with a unary + (because they're "extra"
parameters). Since they are--by definition--in the variadic region,
they may only be passed as named arguments, never positionally. It is
illegal to mark a parameter with ? after the first + or *,
because you can't reenter a positional zone from a variadic zone.
Unlike the positional parameters, the variadic parameters are not necessarily declared in the same order as they will be passed in the call. They may be declared in any order (though the exact behavior of a slurpy array depends slightly on whether you declare it first or last).
Named-only parameters
Parameters marked with a + zone marker are named-only parameters.
Such a parameter may never be passed positionally, but only by name.
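The closest Perl 5 idiom is to take the positional arguments first and treat everything after them as name => value pairs. The sketch below assumes a hypothetical sub action with two made-up named parameters, how and why, along with made-up defaults. It emulates the by-name-only rule by hand, including rejecting names it doesn't know about.

```perl
use strict;
use warnings;

# Hypothetical Perl 5 emulation of something like
#   sub action ($x, +$how, +$why) {...}
sub action {
    my ($x, %named) = @_;      # one positional, then name => value pairs

    # Reject names the sub does not declare.
    my %known = map { $_ => 1 } qw(how why);
    for my $name (sort keys %named) {
        die "action: unknown named argument '$name'\n"
            unless $known{$name};
    }

    # Assumed defaults, purely for illustration.
    my $how = exists $named{how} ? $named{how} : 'slowly';
    my $why = exists $named{why} ? $named{why} : 'no reason';

    return "$x $how ($why)";
}

# Named arguments may come in any order.
my $full    = action(3, why => 'testing', how => 'quickly');
my $minimal = action(3);

print "$full\n";
print "$minimal\n";
```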

