Apocalypse 4
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 04 for the latest information.
RFC 199: Short-circuiting built-in functions and user-defined subroutines
First I should note in passing that it is likely that
my ($found) = grep { $_ == 1 } (1..1_000_000);
will be smart enough to stop on the first one without additional hints, since the left side will only demand one value of the right side.
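As a rough illustration of that demand-driven behavior, here is a minimal sketch in Python rather than Perl 6 (the choice of language, and the names `lazy_grep` and `examined`, are assumptions made purely for illustration). A generator produces matches only as the consumer asks for them, so a consumer that demands one value stops the scan after the first match.

```python
from itertools import islice

examined = 0  # hypothetical counter, present only to show how far the scan got


def lazy_grep(predicate, items):
    """Yield matching items one at a time, only as the caller demands them."""
    global examined
    for item in items:
        examined += 1
        if predicate(item):
            yield item


# The consumer demands a single value, so the generator is never resumed
# after producing its first match.
found = list(islice(lazy_grep(lambda n: n == 1, range(1, 1_000_001)), 1))
print(found, examined)
```

Whether Perl 6 itself achieves this through lazy lists or some other mechanism is not something this sketch claims to show; it only models the idea that the left side's demand limits the work done on the right.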
However, we do need to unify the behaviors of built-ins with user-defined control structures. From an internal point of view, all of these various ways of exiting a block will be unified as exceptions.
It will be easy enough for a user-defined subroutine to catch the appropriate exceptions and do the right thing. For instance, to implement a loop wrapper (ignoring parser issues), you might write something like this:
sub mywhile ($keyword, &condition, &block) {
    my $l = $keyword.label;
    while (&condition()) {
        &block();
        CATCH {
            my $t = $!.tag;
            when X::Control::next { die if $t && $t ne $l; next }
            when X::Control::last { die if $t && $t ne $l; last }
            when X::Control::redo { die if $t && $t ne $l; redo }
        }
    }
}
Remember that those die calls are just rethrows of the current
exception to get past the current try scope (the while in this
case).
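The same shape can be modeled outside Perl. What follows is a hedged sketch in Python, not a rendering of actual Perl 6 semantics: the classes `ControlException`, `Next`, and `Last` are hypothetical stand-ins for the X::Control exceptions, and redo is left out to keep the sketch short. The point it illustrates is the rethrow: a tagged exception aimed at some other label passes straight through the wrapper.

```python
class ControlException(Exception):
    """Hypothetical base class for loop exits; tag is the target label or None."""

    def __init__(self, tag=None):
        super().__init__(tag)
        self.tag = tag


class Next(ControlException):
    pass


class Last(ControlException):
    pass


def mywhile(label, condition, block):
    """Run block() while condition() holds, honoring Next/Last aimed at this loop."""
    while condition():
        try:
            block()
        except ControlException as exc:
            # A tagged exception meant for a different label is rethrown outward.
            if exc.tag is not None and exc.tag != label:
                raise
            if isinstance(exc, Last):
                break
            # Next: simply fall through to the next condition check.


seen = []
counter = {"i": 0}


def body():
    counter["i"] += 1
    i = counter["i"]
    if i == 2:
        raise Next()         # untagged: handled by the innermost wrapper
    if i == 5:
        raise Last("OUTER")  # tagged: matches this loop's label
    seen.append(i)


mywhile("OUTER", lambda: counter["i"] < 10, body)
print(seen)
```

An untagged exception is caught by whichever wrapper is innermost, while a tagged one keeps propagating until it reaches the loop with the matching label.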
How a block gets a label in general is an interesting question.
It's all very well to say that the keyword is the label, but that
doesn't help if you have two nested constructs with the same name.
In Perl 5, labels are restricted to being at the beginning of the
statement, but then how do you label a grep? Should there be some
way of specifying a label on a keyword rather than on a statement?
We could end up with something like this:
my $found = grep:NUM { $_ == 1 and last NUM: $_ } (1..1_000_000);
On the other hand, considering how often this feature is (not) going to be used, I think we can stick with the tried-and-true statement label:
my $found = do { NUM: grep { $_ == 1 and last NUM: $_ } (1..1_000_000) };
This has the advantage of matching the label syntax with a colon on the end in both places. I like that.
I don't think every block should implicitly have a way to return, or we'll have difficulty optimizing away blocks that don't do anything blockish. That's because setting up a try environment is always a bit blockish, and does in fact impose some overhead that we'd just as soon avoid when it's unnecessary.
However, it's probably okay if certain constructs that would know how to deal with a label are implicitly labelled by their keyword name when they don't happen to have an explicit label. So I think we can allow something like:
last grep: $_
Despite its appearance, that is not a method call, because grep
is not a predefined class. What we have is a unary operator last
that is taking an adverbial modifier specifying what to return from
the loop.
The interesting policy question as we go on will be whether a given
construct responds to a given exception or not. Some exceptions
will have to be restricted in their use. For instance, we should
probably say that only explicit sub declarations may respond to a
return. People will expect return to exit the subroutine they
think they're in, even if there are blocks floating around that are
actually closures being interpreted elsewhere. It might be considered
antisocial for closure interpreters like grep or map or sort
to trap X::Control::return sooner than the user expects.
As for using numbers instead of labels to indicate how many levels to break out of, that would be fine, except that I don't believe in breaking out by levels. If the problem is complex enough that you need to break out more than one level, you need a name, not a number. Then it doesn't matter if you refactor your code to have more block levels or fewer. I find I frequently have to refactor my code that way.
It's possible to get carried away and retrofit grep and map
with every conceivable variety of abort, retry, accept, reject, reduce,
reuse, recycle, or whatever exception. I don't think that's necessary.
There has to be some reason for writing your own code occasionally.
If we get rid of all the reasons for writing user-defined subroutines,
we might as well pack our bags and go home. But it's okay at minimum
to treat a looping construct like a loop.
RFC 006: Lexical variables made default
This RFC proposes that strict vars should be on by default. This is
motivated by the desire that Perl better support (or cajole, in this
case) the disciplines that enable successful programming in the large.
This goal is laudable.
However, the programming-in-the-small advocates also have a valid point: they don't want to have to go to all the trouble of turning off strictures merely to write a succinct one-liner, since keystrokes are at a premium in such programming, and in fact the very strictures that increase clarity in large programs tend to decrease clarity in small programs.
So this is one of those areas where we desire to have it both ways,
and in fact, we pretty much can. The only question is where to draw
the line. Some discussion suggested that only programs specified on
the command line via the -e switch should be exempt from stricture.
But I don't want to force every little file-based script into the
large model of programming. And we don't need to.
Large programming requires the definition of modules and classes.
The typical large program will (or should) consist mostly of modules
and classes. So modules and classes will assume strict vars.
Small programming does not generally require the definition of modules
and classes, though it may depend on existing modules and classes.
But even small programs that use a lot of external modules and classes
may be considered throw-away code. The very fact that the main code
of a program is not typically reused (in the sense that modules and
classes are reused) means that this is where we should draw the
line. So in Perl 6, the main program will not assume strict vars,
unless you explicitly do something to turn it on, such as to declare
"class Main".
RFC 330: Global dynamic variables should remain the default
This is fine for the main program, but modules and classes should be held
to the higher standard of use strict.
RFC 083: Make constants look like variables
It's important to keep in mind the distinction between variables and values. In a pure OO environment, variables are merely references to values, and have no properties of their own--only the value itself would be able to say whether it is constant. Some values are naturally constant, such as a literal string, while other values could be marked constant, or created without methods that can modify the object, or some such mechanism. In such an environment, there is little use for properties on variables. Any time you put a property on a variable, it's potentially lying about its value.
However, Perl does not aspire to be a pure OO environment. In Perl-think, a variable is not merely a container for a value. Rather, a variable provides a "view" of a value. Sometimes that view could even be construed as a lie. That's okay. Lying to yourself is a useful survival skill (except when it's not). We find it necessary to repeat "I think I can" to ourselves precisely when we think we can't. Conversely, it's often valuable psychologically to treat possible activities as forbidden. Abstinence is easier to practice if you don't have to decide anew every time there's a possible assignation, er, I mean, assignment.
Constant declarations on variables fall into this category. The value itself may or may not naturally be constant, but we will pretend that it is. We could in theory go farther than that. We could check the associated object to make sure that it is constant, and blow up if it's not, but that's not necessary in this case for consistent semantics. Other properties may be stricter about this. If you have a variable property that asserts a particular shape of multidimensional array, for instance, the object in question had better be able to supply semantics consistent with that view, and it's probably a good idea to blow up sooner rather than later if it can't. This is something like strong typing, except that it's optional, because the variable property itself is optional.
Nevertheless, the purpose of these variable properties is to allow the compiler to deduce things about the program that it could not otherwise deduce, and based on those deductions, produce both a more robust and more efficient compile-time interpretation of the semantics of the program. That is to say, you can do more optimizations without compromising safety. This is obviously true in the case of inlining constants, but the principle extends to other variable properties as well.
The proposed syntax is fine, except that we'll be using is instead of
: for properties, as discussed in Apocalypse 2. (And it's constant,
not const.)
RFC 337: Common attribute system to allow user-defined, extensible attributes
As already revealed in Apocalypse 2, attributes will be known as
"properties" in Perl 6, to avoid confusion with existing OO
nomenclature for instance variables. Also, we'll use the is keyword
instead of the colon.
Setting properties on array and hash elements bothers me, particularly
when those properties have names like "public" and "private". This
seems to me to be an attempt to paper over the gap of some missing OO
functionality. So instead, I'd rather keep arrays and hashes mostly
for homogeneous data structures, and encourage people to use objects to
store data of differing types. Then public and private can be
properties of object attributes, which will look more like real
variables in how they are declared. And we won't have to worry about
the meaning of my @foo[2], because that still won't be allowed.
Again, we need to be very clear that the object representing the variable is different than any objects contained by the variable. When we say
my Dog @dogpound is loud;
we mean that the individual elements of @dogpound are of type Dog, not
that the array variable is of type Dog. But the loud property
applies to the array, not to the dogs in the array. If the array
variable needs to have a type, it can be supplied as if it were a
property:
my Dog @dogpound is DogPound is loud;
That is, if a property is the name of a known package/class, it is
taken to be a kind of tie. Given the declaration above, the
following is always true:
@dogpound.is.loud
since loud is a property of the array object, even if it
contains no dogs. It turns out that
@dogpound.is.DogPound
is also true. This does not do an isa lookup. For that, say:
@dogpound.isa(Pound)
Note that you can use:
@dogpound =~ Dog
to test the individual elements for Doghood.
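To keep the four tests above apart, here is a small model in Python. It is only a sketch under stated assumptions: the classes `Variable`, `Pound`, and `DogPound` and the methods `is_`, `isa`, and `elements_match` are hypothetical names invented for illustration, and nothing here reflects how Perl 6 actually implements properties. It shows properties living on the variable, an exact (non-inheriting) property lookup, an inheritance-aware check, and a per-element type test.

```python
class Dog:
    pass


class Pound(list):
    """Hypothetical base container class."""


class DogPound(Pound):
    """Hypothetical container implementation chosen for the variable."""


class Variable:
    """Model of a variable: an implementation object plus its own properties."""

    def __init__(self, impl, element_type, **props):
        self.impl = impl                  # the container object (the "tie")
        self.element_type = element_type  # declared type of each element
        # Properties belong to the variable, not to the values it holds.
        self.props = dict(props)
        self.props[type(impl).__name__] = True

    def is_(self, name):
        """Exact property lookup; deliberately no inheritance walk."""
        return self.props.get(name, False)

    def isa(self, cls):
        """Inheritance-aware check on the implementation class."""
        return isinstance(self.impl, cls)

    def elements_match(self, cls=None):
        """Test every element against a type (the declared one by default)."""
        wanted = cls or self.element_type
        return all(isinstance(e, wanted) for e in self.impl)


dogpound = Variable(DogPound(), Dog, loud=True)
print(dogpound.is_("loud"), dogpound.is_("DogPound"),
      dogpound.is_("Pound"), dogpound.isa(Pound))
```

In this model the variable is loud even while empty, `is_("Pound")` is false because the property lookup does not consult the class hierarchy, and `isa(Pound)` is true because the inheritance check does.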


