Apocalypse 6
Subroutines
by Larry Wall
March 07, 2003
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.
This is the Apocalypse on Subroutines. In Perl culture the term
"subroutine" conveys the general notion of calling something that
returns control automatically when it's done. This "something" that
you're calling may go by a more specialized name such as "procedure",
"function", "closure", or "method". In Perl 5, all such subroutines
were declared using the keyword sub regardless of their specialty.
For readability, Perl 6 will use alternate keywords to declare special
subroutines, but they're still essentially the same thing underneath.
Insofar as they all behave similarly, this Apocalypse will have
something to say about them. (And if we also leak a few secrets
about how method calls work, that will make Apocalypse 12 all the
easier--presuming we don't have to un-invent anything between now
and then...)
Here are the RFCs covered in this Apocalypse. PSA stands for "problem, solution, acceptance", my private rating of how this RFC will fit into Perl 6. I note that none of the RFCs achieved unreserved acceptance this time around. Maybe I'm getting picky in my old age. Or maybe I just can't incorporate anything into Perl without "marking" it...
RFC   PSA   Title
---   ---   -----
 21   abc   Subroutines: Replace C<wantarray> with a generic C<want> function
 23   bcc   Higher order functions
 57   abb   Subroutine prototypes and parameters
 59   bcr   Proposal to utilize C<*> as the prefix to magic subroutines
 75   dcr   structures and interface definitions
107   adr   lvalue subs should receive the rvalue as an argument
118   rrr   lvalue subs: parameters, explicit assignment, and wantarray() changes
128   acc   Subroutines: Extend subroutine contexts to include name parameters and lazy arguments
132   acr   Subroutines should be able to return an lvalue
149   adr   Lvalue subroutines: implicit and explicit assignment
154   bdr   Simple assignment lvalue subs should be on by default
160   acc   Function-call named parameters (with compiler optimizations)
168   abb   Built-in functions should be functions
176   bbb   subroutine / generic entity documentation
194   acc   Standardise Function Pre- and Post-Handling
271   abc   Subroutines : Pre- and post- handlers for subroutines
298   cbc   Make subroutines' prototypes accessible from Perl
334   abb   Perl should allow specially attributed subs to be called as C functions
344   acb   Elements of @_ should be read-only by default
In Apocalypses 1 through 4, I used the RFCs as a springboard for discussion. In Apocalypse 5 I was forced by the complexity of the redesign to switch strategies and present the RFCs after a discussion of all the issues involved. That was so well received that I'll try to follow the same approach with this and subsequent Apocalypses.
But this Apocalypse is not trying to be as radical as the one on regexes. Well, okay, it is, and it isn't. Alright, it is radical, but you'll like it anyway (we hope). At least the old way of calling subroutines still works. Unlike regexes, Perl subroutines don't have a lot of historical cruft to get rid of. In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle. Perl 6 will let you migrate the cruft out of the user-defined code and back into the declarations where it belongs. Then you will think it to be very beautiful cruft indeed (we hope).
Perl 5's subroutines have a number of issues that need to be dealt with. First of all, they're just awfully slow, for various reasons:
- Construction of the @_ array
- Needless prepping of potential lvalues
- General model that forces lots of run-time processing
- Difficulty of optimization
- Storage of unneeded context
- Lack of tail recursion optimization
- Named params that aren't really
- Object model that forces double dispatch in some cases
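As one illustration of the "named params that aren't really" item, here is a minimal Perl 5 sketch of the usual idiom, on the assumption that this is the idiom the list has in mind. The sub name greet and its keys are hypothetical. The "names" are just alternating keys and values flattened into @_, and the callee rebuilds a hash from them at run time on every call:

```perl
use strict;
use warnings;

# Perl 5 idiom: "named" arguments arrive as a flat key/value list in @_.
# The callee copies them into a hash on each call; defaults come first
# so that the caller's pairs override them.
sub greet {
    my %args = (greeting => 'Hello', @_);
    return "$args{greeting}, $args{name}!";
}

print greet(name => 'World'), "\n";   # prints "Hello, World!"
```

Nothing in the declaration tells the compiler which names are legal, so a misspelled key is not caught at compile time.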
Quite apart from performance, however, there are a number of problems with usability:
- Not easy to detect type errors at compile time
- Not possible to specify the signatures of certain built-in functions
- Not possible to define control structures as subroutines
- Not possible to type-check any variadic args other than as a list
- Not possible to have a variadic list providing scalar context to its elements
- Not possible to have lazy parameters
- Not possible to define immediate subroutines (macros)
- Not possible to define subroutines with special syntax
- Not enough contextual information available at run time
- Not enough contextual information available at compile time
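To make the run-time context item concrete: Perl 5 reports the caller's context only through the built-in wantarray, which can tell list, scalar, and void context apart and nothing finer. A minimal Perl 5 sketch follows; the sub name context_of_call is hypothetical.

```perl
use strict;
use warnings;

# Perl 5: wantarray is true in list context, false but defined in
# scalar context, and undef in void context.
sub context_of_call {
    return wantarray          ? 'list'
         : defined(wantarray) ? 'scalar'
         :                      'void';
}

my @in_list   = context_of_call();
my $in_scalar = context_of_call();
print "$in_list[0] $in_scalar\n";   # prints "list scalar"
```

The sub cannot ask, for example, how many values the caller wants or what type of value is expected, which is the kind of gap RFC 21 addresses.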
In general, the consensus is that Perl 5's simple subroutine syntax is just a little too simple. Well, okay, it's a lot too simple. While it's extremely orthogonal to always pass all arguments as a single variadic array, that mechanism does not always map well onto the problem space. So in Perl 6, subroutine syntax has blossomed in several directions.
But the most important thing to note is that we haven't actually added a lot of syntax. We've added some, but most of the new capabilities come in through the generalized trait/property system, and the new type system. But in those cases where specialized syntax buys us clarity, we have not hesitated to add it. (Er, actually, we hesitated quite a lot. Months, in fact.)
One obvious difference is that the sub on closures is now optional,
since every brace-delimited block is now essentially a closure.
You can still put the sub if you like. But it is only required
if the block would otherwise be construed as a hash value; that is,
if it appears to contain a list of pairs. You can force any block to
be considered a subroutine with the sub keyword; likewise you can
force any block to be considered a hash value with the hash keyword.
But in general Perl just dwims based on whether the top-level is a
list that happens to have a first argument that is a pair or hash:
Block                   Meaning
-----                   -------
{ 1 => 2 }              hash { 1 => 2 }
{ 1 => 2, 3 => 4 }      hash { 1 => 2, 3 => 4 }
{ 1 => 2, 3, 4 }        hash { 1 => 2, 3 => 4 }
{ %foo, 1 => 2 }        hash { %foo.pairs, 1 => 2 }
Anything else that is not a list, or does not start with a pair or hash, indicates a subroutine:
{ 1 }                   sub { return 1 }
{ 1, 2 }                sub { return 1, 2 }
{ 1, 2, 3 }             sub { return 1, 2, 3 }
{ 1, 2, 3 => 4 }        sub { return 1, 2, 3 => 4 }
{ pair 1,2,3,4 }        sub { return 1 => 2, 3 => 4 }
{ gethash() }           sub { return gethash() }
This is a syntactic distinction, not a semantic one. The last two
examples are taken to be subs despite containing functions returning
pairs or hashes. Note that it would save no typing to recognize the
pair method specially, since hash automatically does pairing
of non-pairs. So we distinguish these:
{ pair 1,2,3,4 }        sub { return 1 => 2, 3 => 4 }
hash { 1,2,3,4 }        hash { 1 => 2, 3 => 4 }
If you're worried about the compiler making bad choices before deciding
whether it's a subroutine or hash, you shouldn't. The two constructs
really aren't all that far apart. The hash keyword could in fact
be considered a function that takes as its first argument a closure
returning a hash value list. So the compiler might just compile the
block as a closure in either case, then do the obvious optimization.
Although we say the sub keyword is now optional on a closure, the
return keyword only works with an explicit sub. (There are
other ways to return values from a block.)
Subroutine Declarations
You may still declare a sub just as you did in Perl 5, in which case
it behaves much like it did in Perl 5. To wit, the arguments still
come in via the @_ array. When you say:
sub foo { print @_ }
that is just syntactic sugar for this:
sub foo (*@_) { print @_ }
That is, Perl 6 will supply a default parameter signature (the precise
meaning of which will be explained below) that makes the subroutine
behave much as a Perl 5 programmer would expect, with all the arguments
in @_. It is not exactly the same, however. You may not modify
the arguments via @_ without declaring explicitly that you want
to do so. So in the rare cases that you want to do that, you'll have
to supply the rw trait (meaning the arguments should be considered
"read-write"):
sub swap (*@_ is rw) { @_[0,1] = @_[1,0] };
The Perl5-to-Perl6 translator will try to catch those cases and add the parameter signature for you when you want to modify the arguments. (Note: we will try to be consistent about using "arguments" to mean the actual values you pass to the function when you call it, and "parameters" to mean the list of lexical variables declared as part of the subroutine signature, through which you access the values that were passed to the subroutine.)
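For comparison, here is a minimal Perl 5 sketch of the behaviour that Perl 6 now asks you to request with the rw trait. In Perl 5 the elements of @_ are aliases for the caller's variables, so the same swap body modifies them with no declaration at all. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Perl 5: each element of @_ is an alias for the corresponding argument,
# so assigning through @_ changes the caller's variables in place.
sub swap { @_[0,1] = @_[1,0] }

my ($x, $y) = (1, 2);
swap($x, $y);
print "$x $y\n";   # prints "2 1"
```

This is the kind of case the translator would need to spot in order to add the explicit signature.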
Perl 5 has rudimentary prototypes, but Perl 6 type signatures can be
much more expressive if you want them to be. The entire declaration
is much more flexible. Not only can you declare types and names of
individual parameters, you can add various traits to the parameters,
such as rw above. You can add traits to the subroutine itself,
and declare the return type. In fact, at some level or other,
the subroutine's signature and return type are also just traits.
You might even consider the body of the subroutine to be a trait.
For those of you who have been following Perl 6 development, you'll
wonder why we're now calling these "traits" rather than "properties".
They're all really still properties under the hood, but we're trying to
distinguish those properties that are expected to be set on containers
at compile time from those that are expected to be set on values
at run time. So compile-time properties are now called "traits".
Basically, if you declare it with is, it's a trait, and if you add
it onto a value with but, it's a property. The main reason for
making the distinction is to keep the concepts straight in people's
minds, but it also has the nice benefit of telling the optimizer
which properties are subject to change, and which ones aren't.
A given trait may or may not be implemented as a method on the underlying container object. You're not supposed to care.
There are actually several syntactic forms of trait:
rule trait :w {
      is <ident>[\( <traitparam> \)]?
    | will <ident> <closure>
    | of <type>
    | returns <type>
}
(We're specifying the syntax here using Perl 6 regexes. If you don't know about those, go back and read Apocalypse 5.)
A <type> is actually allowed to be a junction of types:
sub foo returns Int|Str {...}
The will syntax specifically introduces a closure trait without
requiring the extra parens that is would. Saying:
will flapdoodle { flap() and doodle() }
is exactly equivalent to:
is flapdoodle({ flap() and doodle() })
but reads a little better. More typically you'll see traits like:
will first { setup() }
will last { teardown() }
The final block of a subroutine declaration is the "do" trait. Saying:
sub foo { ... }
is like saying:
sub foo will do { ... }
Note however that the closure eventually stored under the do trait
may in fact be modified in various ways to reflect argument processing,
exception handling, and such.
We'll discuss the of and returns traits later when we discuss
types. Back to syntax.