Apocalypse 6
by Larry Wall

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 06 for the latest information.

RFC 57: Subroutine prototypes and parameters

We ended up with something like this proposal, though with some differences. Instead of =, we're using => to specify names because it's a pair constructor in Perl 6, so there's little ambiguity with positional parameters. Unless a positional parameter is explicitly declared with a Pair or Hash type, it's assumed not to be interested in named arguments.

Also, as the RFC points out, use of = would be incompatible with lvalue subs, which we're supporting.

The RFC allows for mixing of positional and named parameters, both in declaration and in invocation. I think such a feature would provide far more confusion than functionality, so we won't allow it. You can always process your own argument list if you want to. You could even install your own signature handler in place of Perl's.

The RFC suggests treating the first parameter with a default as the first optional parameter. I think I'd rather mark optional parameters explicitly, and then disallow defaults on required parameters as a semantic constraint.

It's also suggested that something like:


    sub waz ($one, $two, 
             $three = add($one, $two), 
             $four  = add($three, 1)) {
        ...
    }

be allowed, where defaults can refer back to previous parameters. It seems as though we could allow that, if we assume that symbols are introduced in signatures as soon as they are seen. That would be consistent with how we've said my variables are introduced. It does mean that a prototype that defaults to the prior $_ would have to be written like this:


    $myclosure = sub ($_ = $OUTER::_) { ... }

On the other hand, that's exactly what:


    $myclosure = { ... }

means in the absence of placeholder variables, so the situation will likely not arise all that often. So I'd say yes, defaults should be able to refer back to previous parameters in the same signature, unless someone thinks of a good reason not to.
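
As a rough illustration of "defaults can refer back to previous parameters", here is a minimal sketch in Python rather than Perl 6, since the signature syntax shown here has since changed. Python evaluates defaults at definition time, so the sketch emulates left-to-right resolution with a sentinel; the `_MISSING` sentinel and the body of `add` are assumptions made for the example.

```python
_MISSING = object()  # hypothetical sentinel meaning "argument not supplied"

def add(a, b):
    return a + b

def waz(one, two, three=_MISSING, four=_MISSING):
    # Resolve defaults left to right, so each default may refer to
    # any parameter introduced earlier in the signature.
    if three is _MISSING:
        three = add(one, two)
    if four is _MISSING:
        four = add(three, 1)
    return (one, two, three, four)

print(waz(1, 2))       # both defaults computed from earlier parameters
print(waz(1, 2, 10))   # only the last default is computed, from $three's value
```

The point of the sketch is only the ordering: a default is computed after every parameter to its left already has a value.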

As explained in Apocalypse 4, $OUTER:: is for getting at an outer lexical scope. This ruling about formal parameters means that, effectively, the lexical scope of a subroutine "starts to begin" where the formal parameters are declared, and "finishes beginning" at the opening brace. Whether a given symbol in the signature actually belongs to the inner scope or the outer scope depends on whether it's already been introduced by the inner scope. Our sub above needed $OUTER::_ because $_ had already been introduced as the name of the first argument. Had some other name been introduced, $_ might still be taken to refer to the outer $_:


    $myclosure = sub ($arg = $_) { ... }

If so, use of $OUTER::_ would be erroneous in that case, because the subroutine's implicit $_ declaration wouldn't happen till the opening curly, and instead of getting $OUTER::_, the user would unexpectedly be getting $OUTER::OUTER::_, as it were. So instead, we'll say that the implicit introduction of the new sub's $_ variable always happens after the <subintro> and before the <signature>, so any use of $_ as a default in a signature or as an argument to a property can only refer to the subroutine's own topic, if any. To refer to any external $_ you must say either $CALLER::_ or $OUTER::_. This approach seems much cleaner.

RFC 160: Function-call named parameters (with compiler optimizations)

For efficiency, we have to be able to hoist the semantics from the signature into the calling module when that's practical, and that has to happen at compile time. That means the information has to be in the signature, not embedded in a fields() function within the body of the subroutine. In fact, my biggest complaint about this RFC is that it arbitrarily separates the prototype characters, the parameter names, and the variable names. That's a recipe for things getting out of sync.

Basically, this RFC has a lot of the right ideas, but just doesn't go far enough in the signature direction, based on the (at the time) laudable notion that we were interested in keeping Perl 6 as close to Perl 5 as possible. Which turned out not to be quite the case. :-) Our new signatures look more hardwired than the attribute syntax proposed here, but it's all still very hookable underneath via the sub and parameter traits. And everything is together that should be together.

Although the signature is really just a trait underneath, I thought it important to have special syntax for it, just as there's special syntax for the body of the function. Signatures are very special traits, and people like special things to look special. It's just more of those darn psychological reasons that keep popping up in the design of Perl.

Still and all, the current design is optimized for many of the same sensitivities described in this RFC.

RFC 128: Subroutines: Extend subroutine contexts to include name parameters and lazy arguments

This RFC also has lots of good ideas, but tends to stay a little too close to Perl 5 in various areas where I've decided to swap the defaults around. For instance, marking reference parameters in prototypes rather than slurpy parameters in signatures, identifying lazy parameters rather than flattening, and defaulting to rw (autovivifying lvalue args) rather than constant (rvalue args).

Context classes are handled by the automatic coercion to references within scalar context, and by type junctions.

Again, I don't buy into two-pass, fill-in-the-blanks argument processing.

Placeholders are now just for argument declaration, and imply no currying. Currying on the other hand is done with an explicit .assuming method, which requires named args that will be bound to the corresponding named parameters in the function being curried.
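
For readers who want a concrete picture of pre-binding by name, the closest everyday analogue is probably Python's `functools.partial` used with keyword arguments. This is an analogy only, not the Perl 6 mechanism; `open_stream` and its `ioflags` parameter are hypothetical names chosen for the sketch.

```python
from functools import partial

def open_stream(path, ioflags=":raw"):
    # Hypothetical routine with a named parameter we want to pre-bind.
    return f"{path} [{ioflags}]"

# Bind the named parameter once; the remaining parameters stay open.
open_crlf = partial(open_stream, ioflags=":crlf")

print(open_crlf("log.txt"))
print(open_stream("log.txt"))  # the original routine is unchanged
```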

Or should I say functions? When module and class writers write systems of subroutines or methods, they usually go to great pains to make sure all the parameter names are consistent. Why not take advantage of that?

So currying might even be extended to classes or modules, where all methods or subs with a given argument name are curried simultaneously:


    my module MyIO ::= (use IO::Module).assuming(ioflags => ":crlf");
    my class UltAnswer ::= (use Answer a,b,c).assuming(answer => 42);
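
One way to picture currying a whole module at once is the Python sketch below. It is an analogy under stated assumptions, not a description of how Perl 6 would do it: the helper `assuming`, the namespace `io_module`, and the routines inside it are all hypothetical. The helper pre-binds a parameter in every routine that declares a parameter of that name and leaves the rest alone.

```python
import inspect
from functools import partial
from types import SimpleNamespace

def assuming(namespace, **fixed):
    """Return a new namespace in which every routine that accepts one of
    the given parameter names has that parameter pre-bound."""
    curried = {}
    for name, member in vars(namespace).items():
        if not callable(member):
            curried[name] = member
            continue
        params = inspect.signature(member).parameters
        bound = {k: v for k, v in fixed.items() if k in params}
        curried[name] = partial(member, **bound) if bound else member
    return SimpleNamespace(**curried)

def read_line(text, ioflags=":raw"):
    return f"read {text} [{ioflags}]"

def write_line(text, ioflags=":raw"):
    return f"write {text} [{ioflags}]"

def version():
    return "1.0"

io_module = SimpleNamespace(read_line=read_line, write_line=write_line,
                            version=version)
my_io = assuming(io_module, ioflags=":crlf")

print(my_io.read_line("abc"))
print(my_io.write_line("abc"))
print(my_io.version())  # has no ioflags parameter, so it is passed through
```

This only works because the routines agree on the parameter name, which is exactly the consistency the text says module writers already strive for.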

If you curry a class's invocant, it would turn the class into a module instead of another class, since there are no longer any methods if there are no invocants:


    my module UltAnswer ::=
        (use Answer a,b,c).assuming(self => new Answer: 42);

Or something like that. If you think this implies that there are class and module objects that can be sufficiently introspected to do this sort of chicanery, you'd be right. On the other hand, given that we'll have module name aliasing anyway to support running multiple versions of the same module, why not support multiple curried versions without explicit renaming of the module:


    (use IO::Module).assuming(ioflags => ":crlf");

Then for the rest of this scope, IO::Module really points to your aliased idea of IO::Module, without explicitly binding it to a different name. Well, that's for Apocalypse 11, really...

One suggestion from this RFC I've taken to heart, which is to banish the term "prototype". You'll note we call them signatures now. (You may still call Perl 5's prototypes "prototypes", of course, because Perl 5's prototypes really were a prototype of signatures.)

RFC 344: Elements of @_ should be read-only by default

I admit it, I waffled on this one. Up until the last moment, I was going to reject it, because I wanted @_ to work exactly like it does in Perl 5 in subs without a signature. It seemed like a nice sop towards backward compatibility.

But when I started writing about why I was rejecting it, I started thinking about whether a sig-less sub is merely a throwback to Perl 5, or whether we'll see it continue as a viable Perl 6 syntax. And if the latter, perhaps it should be designed to work right rather than merely to work the same. The vast majority of subroutines in Perl 5 refrain from modifying their arguments via @_, and it somehow seems wrong to punish such good deeds.

So I changed my mind, and the default signature on a sub without a signature is simply (*@_), meaning that @_ is considered an array of constants by default. This will probably have good effects on performance, in general. If you really want to write through the @_ parameter back into the actual arguments, you'll have to declare an explicit signature of (*@_ is rw).

The Perl5-to-Perl6 translator will therefore need to translate:


    sub {...}

to:


    sub (*@_ is rw) {...}

unless it can be determined that elements of @_ are not modified within the sub. (It's okay to shift a constant @_ though, since that doesn't change the elements passed to the call; remember that for slurpy arrays the implied "is constant" or explicit "is rw" distributes to the individual elements.)

RFC 194: Standardise Function Pre- and Post-Handling

Yes, this needs to be standardized, but we'll be generalizing to the notion of wrappers, which can automatically keep their pre and post routines in sync, and, more importantly, keep a single lexical scope across the related pre and post processing. A wrapper is installed with the .wrap method, which can have optional parameters to tell it how to wrap, and which can return an identifier by which the particular wrapper can be named when unwrapping or otherwise rearranging the wrappings. A wrapper automatically knows what function it's wrapped around, and invoking the call builtin automatically invokes the next level routine, whether that's the actual routine or another layer of wrapper. That does matter, because with that implicit knowledge call doesn't need to be given the name of the routine to invoke.
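
The shape of that interface can be sketched in Python. This is a minimal sketch with invented names (`Wrappable`, `traced`), not the Perl 6 implementation. One deliberate difference: here the next-level routine is handed to the wrapper as an explicit `call` argument, whereas the design above has `call` find it implicitly.

```python
from functools import partial

class Wrappable:
    """A routine whose wrappers can be installed and removed at run time."""

    def __init__(self, fn):
        self._fn = fn
        self._wrappers = {}     # handle -> wrapper, in installation order
        self._next_handle = 0

    def wrap(self, wrapper):
        # Return an identifier that names this particular wrapper later.
        handle = self._next_handle
        self._next_handle += 1
        self._wrappers[handle] = wrapper
        return handle

    def unwrap(self, handle):
        del self._wrappers[handle]

    def __call__(self, *args, **kwargs):
        # Build the chain from the real routine outward, so that the most
        # recently installed wrapper runs first.
        call = self._fn
        for wrapper in self._wrappers.values():
            call = partial(wrapper, call)
        return call(*args, **kwargs)

log = []

def traced(call, *args):
    # Pre and post processing share this one function scope.
    log.append("enter")
    result = call(*args)    # next level: the routine or another wrapper
    log.append("leave")
    return result

split = Wrappable(lambda s: s.split())
handle = split.wrap(traced)
result = split("a b")
split.unwrap(handle)

print(result, log)
```

Because the wrapper never names the routine it surrounds, the same `traced` function could be installed on any number of routines.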

The implementation is dependent on what happens to typeglobs in Perl 6, how does one inspect and modify the moral equivalent of the symbol table?

This is not really a problem, since we've merely split the typeglob up into separate entries.

Also: what will become of prototypes? Will it become possible to declare return types of functions?

Yes. Note that if you do introspection on a sub ref, by default you're going to get the signature and return type of the actual routine, not of any wrappers. There needs to be some method for introspecting the wrappers as well, but it's not the default.

As pointed out in [JP:HWS] certain intricacies are involved: what are the semantic of caller()? Should it see the prehooks? If yes, how?

It seems to me that sometimes you want to see the wrappers, and sometimes you don't. I think caller needs some kind of argument that says which levels to recognize and which levels to ignore. It's not necessarily a simple priority either. One invocation may want to find the innermost enclosing loop, while another might want the innermost enclosing try block. A general matching term will be supplied on such calls, defaulting to ignore the wrappers.

How does this relate to the proposed generalized want() [DC:RFC21]?

The want() function can be viewed as based on caller(), but with a different interface to the information available at the particular call level.

I worry that generalized wrappers will make it impossible to compile fast subroutine calls, if we always have to allow for run-time insertion of handlers. Of course, that's no slower than Perl 5, but we'd like to do better than Perl 5. Perhaps we can have the default be to have wrappable subs, and then turn that off with specific declarations for speed, such as "is inline".

RFC 271: Subroutines : Pre- and post- handlers for subroutines

I find it odd to propose using PRE for something with side effects like flock. Of course, this RFC was written before FIRST blocks existed...

On the other hand, it's possible that a system of PRE and POST blocks would need to keep "dossiers" of its own internal state independent of the "real" data. So I'm not exactly sure what the effective difference is between PRE and FIRST. But we can always put a PRE into a lexical wrapper if we need to keep info around till the POST. So we can keep PRE and POST with the semantics of simply returning boolean expressions, while FIRST and LAST are evaluated primarily for side effects.

You might think that you wouldn't need a signature on any pre or post handler, since it's gonna be the same as the primary. However, we have to worry about multimethods of the same name, if the handlers are defined outside of the subroutine. Again, embedding PRE and POST blocks either in the routine itself or inside a wrapper around the routine should handle that. (Otherwise the problem turns into one of being able to generate a reference to a multimethod with a particular signature: in essence, doing method dispatch without actually dispatching at the end.)

My gut feeling is that $_[-1] is a bad place to keep the return value. With the call interface we're proposing, you just harvest the return value of call if you're interested in the return value. Or perhaps this is a good place for a return signature to actually have formal variables bound to the return values.

Also, defining pre and post conditions in terms of exceptions is probably a mistake. If they're just boolean expressions, they can be ANDed and ORed together more easily in the approved DBC fashion.
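
To make the "just boolean expressions" point concrete, here is a small Python sketch; it is an illustration, not the Perl 6 design. The combinators `all_of` and `any_of`, the `checked` decorator, and the example predicates are all hypothetical. The conditions themselves are plain predicates that return true or false; raising an error is merely how this sketch chooses to report a failed check.

```python
def all_of(*conds):
    # AND several boolean conditions together.
    return lambda *args: all(c(*args) for c in conds)

def any_of(*conds):
    # OR several boolean conditions together.
    return lambda *args: any(c(*args) for c in conds)

def checked(pre, post):
    def decorate(fn):
        def wrapper(*args):
            if not pre(*args):
                raise ValueError("precondition failed")
            result = fn(*args)
            if not post(result, *args):
                raise ValueError("postcondition failed")
            return result
        return wrapper
    return decorate

def is_int(x):
    return isinstance(x, int)

def non_negative(x):
    return x >= 0

def squares_back(result, x):
    return abs(result * result - x) < 1e-9

@checked(pre=all_of(is_int, non_negative), post=squares_back)
def sqrt(x):
    return x ** 0.5

print(sqrt(9))
```

Had the conditions been written to throw exceptions themselves, combining them with `any_of` would require catching and discarding failures, which is the awkwardness the paragraph above is warning about.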

We haven't specified a declarative form of wrapper, merely a .wrap method that you can call at run time. However, as with most of Perl, anything you can do at run time, you can also do at compile time, so it'd be fairly trivial to come up with a syntax that used a wrap keyword in place of a sub:


    wrap split(Regex ?$re, ?$src = $CALLER::_, ?$limit = Inf) {
        print "Entering split\n";
        call;
        print "Leaving split\n";
    }

I keep mistyping "wrap" as "warp". I suppose that's not so far off, actually...

RFC 21: Subroutines: Replace wantarray with a generic want function

Overall, I like it, except that it's reinventing several wheels. It seems that this has evolved into a powerful method for each sub to do its own overloading based on return type. How does this play with a more declarative approach to return types? I dunno. For now we're assuming multimethod dispatch only pays attention to argument types. We might get rid of a lot of calls to want if we could dispatch on return type as well. Perhaps we could do primary dispatch on the arguments and then do tie-breaking on return type when more than one multimethod has the same parameter profile.

I also worry a bit that we're assuming an interpreter here that can keep track of all the context information in a way suitable for searching by the called subroutine. When running on top of a JVM or CLR, this info might not be convenient to provide, and I'd hate to have to keep a descriptor of every call, or do some kind of double dispatch, just because the called routine might want to use want(), or might want to call another routine that might want to use want, or so on. Maybe the situation is not that bad.

I sometimes wonder if want should be a method on the context object:


    given caller.want {...}

or perhaps the two could be coalesced into a single call:


    given context { ... }

But for the moment let's assume for readability that there's a want function distinct from caller, though with a similar signature:


    multi *want (?$where = &CALLER::_, Int +$skip = 0, Str +$label)
        returns WantContext {...}

As with caller, calling want with no arguments looks for the context of the currently executing subroutine or method. Like return, it specifically ignores bare blocks and routines interpreting bare blocks, and finds the context for the lexically enclosing explicit sub or method declaration, named by &_.

You'll note that unlike in the proposal, we don't pass a list to want, so we don't support the implicit && that is proposed for the arguments to want. But that's one of the re-invented wheels, anyway, so I'm not too concerned about that. What we really want is a want that works well with smart matching and switch statements.

RFC 23: Higher order functions

In general, this RFC proposes some interesting semantic sugar, but the rules are too complicated. There's really no need for special numbered placeholders. And the special ^_ placeholder is too confusing. Plus we really need regular sigils on our placeholder variables so we can distinguish $^x from @^x from %^x.

But the main issue is that the RFC is confusing two separate concepts (though that can be blamed on the languages this idea was borrowed from). Anyway, it turns out we'll have an explicit pre-binding method called .assuming for actual currying.

We'll make the self-declaring parameters a separate concept, called placeholder variables. They don't curry. Some of the examples of placeholders in the RFC are actually replaced by topics and junctions in our smart matching mode, but there are still lots of great uses for placeholder variables.

RFC 176: subroutine / generic entity documentation

This would be trivial to do with declared traits and here docs. But it might be better to use a POD directive that is accessible to the program. An entity might even have implicit traits that bind to nearby chunks of the right sort. Maybe we could get Don Knuth to come up with something literate...

RFC 298: Make subroutines' prototypes accessible from Perl

While I'm all in favor of a sub's signature being available for inspection, this RFC goes beyond that to make indirection in the signature the norm. This seems to be a solution in search of a problem. I'm not sure the confusion of the indirection is worth the ability to factor out common parameter lists. Certainly parameter lists must have introspection, but using it to set the prototype seems potentially confusing. That being said, the signatures are just traits, so this may be one of those things that is permitted, but not advised, like shooting your horse in the middle of the desert, or chewing out your SO for burning dinner. Implicit declaration of lexically scoped variables will undoubtedly be considered harmful by somebody someday. [Damian says, "Me. Today."]

RFC 334: Perl should allow specially attributed subs to be called as C functions

Fine, Dan, you implement it. ;-)

Did I claim I ignore the names of RFC authors? Hmm.

The syntax for the suggested:


    sub foo : C_visible("i", "iii") {#sub body}

is probably a bit more verbose in real life:


    my int sub foo (int $a, int $b, int $c)
         is callable("C","Python","COBOL") { ... }

If we can't figure out the "i" and "iii" bits from introspection of the signature and returns traits, we haven't done introspection right. And if we're gonna have an optional type system, I can't think of a better place to use it than for interfaces to optional languages.

Acknowledgements

This work was made possible by a grant from the Perl Foundation. I would like to thank everyone who made this dissertation possible by their generous support. So, I will...

Thank you all very, very, very, very much!!!

I should also point out that I would have been stuck forever on some of these design issues without the repeated prodding (as in cattle) of the Perl 6 design team. So I would also like to publicly thank Allison, chromatic, Damian, Dan, Hugo, Jarkko, Gnat, and Steve. Thanks, you guys! Many of the places we said "I" above, I should have said "we".

I'd like to publicly thank O'Reilly & Associates for facilitating the design process in many ways.

I would also like to thank my wife Gloria, but not publicly.

Future Plans

From here on out, the Apocalypses are probably going to be coming out in priority order rather than sequential order. The next major one will probably be Apocalypse 12, Objects, though it may take a while since (like a lot of people in Silicon Valley) I'm in negative cash flow at the moment, and need to figure out how to feed my family. But we'll get it done eventually. Some Apocalypses might be written by other people, and some of them hardly need to be written at all. In fact, let's write Apocalypse 7 right now...

Apocalypse 7: Formats

Gone from the core. See Damian.
