Apocalypse 4
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 04 for the latest information.
RFC 088: Omnibus Structured Exception/Error Handling Mechanism
This RFC posits some requirements for exception handling (all of which I agree with), but I do have some additional requirements of my own:
- The exception-catching syntax must be considered a form of switch statement.
- It should be easy to turn any kind of block into a "try" block, especially a subroutine.
- Even try-less try blocks must also be able to specify mandatory cleanup on exit.
- It should be relatively easy to determine how much cleanup is necessary regardless of how a block was exited.
- It must be possible to base the operation of return, next, and last on exception handling.
- The cleanup mechanism should mesh nicely with the notions of post condition processing under design-by-contract.
- The exception-trapping syntax must not violate encapsulation of lexical scopes.
- At the same time, the exception-trapping syntax should not force declarations out of their natural scope.
- Non-linear control flow must stand out visually, making good use of block structure, indentation and even keyword case. BEGIN and END blocks are to be considered prior art.
- Non-yet-thrown exceptions must be a useful concept.
- Compatibility with the syntax of any other language is specifically NOT a goal.
RFC 88 is massive, weighing in at more than 2400 lines. Annotating the entire RFC would make this Apocalypse far too big. ("Too late!" says Damian.) Nonetheless, I will take the approach of quoting various bits of the RFC and recasting those bits to work with my additional requirements. Hopefully this will convey my tweaks most succinctly.
Here's what the RFC gives as its first example:
exception 'Alarm';
try {
    throw Alarm "a message", tag => "ABC.1234", ... ;
}
catch Alarm => { ... }
catch Error::DB, Error::IO => { ... }
catch $@ =~ /divide by 0/ => { ... }
catch { ... }
finally { ... }
Here's how I see that being written in Perl 6:
my class X::Alarm is Exception { } # inner class syntax?
try {
    throw X::Alarm "a message", tag => "ABC.1234", ... ;
    CATCH {
        when X::Alarm             { ... }
        when Error::DB, Error::IO { ... }
        when /divide by 0/        { ... }
        default                   { ... }
    }
    POST { ... }
}
The outer block does not have to be a try block. It could be a
subroutine, a loop, or any other kind of block, including an eval
string or an entire file. We will call such an outer block a try
block, whether or not there is an explicit try keyword.
The biggest change is that the various handlers are moved inside
of the try block. In fact, the try keyword itself is mere
documentation in our example, since the presence of a CATCH or POST block
is sufficient to signal the need for trapping. Note that the POST
block is completely independent of the CATCH block. (The POST block
has a corresponding PRE block for design-by-contract programmers.) Any
of these blocks may be placed anywhere in the surrounding block--they
are independent of the surrounding control flow. (They do have to
follow any declarations they refer to, of course.) Only one CATCH is
allowed, but any number of PRE and POST blocks. (In fact, we may well
encourage ourselves to place POST blocks near the constructors to be
cleaned up after.) PRE blocks within a particular try block are
evaluated in order before anything else in the block. POST blocks will
be evaluated in reverse order, though order dependencies between POST
blocks are discouraged. POST blocks are evaluated after everything
else in the block, including any CATCH.
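The ordering rules in the paragraph above can be illustrated with a rough Python analogy. This is only a sketch and not Perl 6 semantics: `run_block`, `failing_body`, and the lambda hooks are hypothetical names invented for the illustration, and Python's `try`/`finally` stands in for the run-time system that would schedule the PRE, CATCH, and POST blocks.

```python
def run_block(pre_blocks, body, catch, post_blocks):
    """Run one 'try block' and return a trace of what executed, in order."""
    trace = []
    # PRE blocks run first, in the order they were written.
    for pre in pre_blocks:
        trace.append(pre())
    try:
        try:
            trace.append(body())
        except Exception as exc:
            # The single CATCH gets first look at the exception.
            trace.append(catch(exc))
    finally:
        # POST blocks run last, in reverse order, even after a CATCH.
        for post in reversed(post_blocks):
            trace.append(post())
    return trace


def failing_body():
    raise ValueError("divide by 0")


trace = run_block(
    pre_blocks=[lambda: "PRE 1", lambda: "PRE 2"],
    body=failing_body,
    catch=lambda exc: f"CATCH {exc}",
    post_blocks=[lambda: "POST 1", lambda: "POST 2"],
)
print(trace)
```

The trace shows both PRE entries first, then the CATCH entry, then POST 2 before POST 1.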
A try {} without a CATCH is equivalent to Perl 5's eval {}.
(In fact, eval will go back to evaluating only strings in Perl 6, and
try will evaluate only blocks.)
The CATCH and POST blocks are naturally in the lexical scope of
the try block. They may safely refer to lexically scoped variables
declared earlier in the try block, even if the exception is thrown
during the elaboration sequence. (The run-time system will guarantee
that individual variables test as undefined (and hence false) before
they are elaborated.)
The inside of the CATCH block is precisely the syntax of a switch
statement. The discriminant of the switch statement is the exception
object, $!. Since the exception object stringifies to the error
message, the when /divide by 0/ case need not be explicitly
compared against $!. Likewise, explicit mention of a declared
class implies an "isa" lookup, another built-in feature of the new
switch statement.
In fact, a CATCH of the form:
CATCH {
    when xxx { ... } # 1st case
    when yyy { ... } # 2nd case
    ... # other cases, maybe a default
}
means something vaguely like:
BEGIN {
    %MY.catcher = {
        given current_exception() -> $! {
            when xxx { ... } # 1st case from above
            when yyy { ... } # 2nd case from above
            ... # other cases, maybe a default
            die; # rethrow $! as implicit default
        }
        $!.markclean; # handled cleanly, in theory
    }
}
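For readers who think better in a running language, here is a loose Python analogy of that expansion. It is a sketch under assumptions, not the Perl 6 mechanism: `make_catcher`, the `clean` attribute, and the `Alarm` class are hypothetical stand-ins. Each case is tried in order against the exception (class cases by "isa", regex cases against the stringified message), and falling off the end rethrows, which skips the step that marks the exception clean.

```python
import re


class Alarm(Exception):
    """Hypothetical exception class used only for this illustration."""


def make_catcher(cases):
    """Build a catcher from (matcher, handler) pairs, tried in order."""
    def catcher(exc):
        for matcher, handler in cases:
            if isinstance(matcher, type):
                matched = isinstance(exc, matcher)        # class case: "isa" test
            elif isinstance(matcher, re.Pattern):
                matched = matcher.search(str(exc)) is not None  # regex vs. message
            else:
                matched = bool(matcher(exc))              # closure case
            if matched:
                handler(exc)
                exc.clean = True   # reached only when a case handled it
                return
        raise exc                  # implicit default: rethrow, never marked clean
    return catcher


handled = []
catcher = make_catcher([
    (Alarm, lambda e: handled.append("alarm")),
    (re.compile(r"divide by 0"), lambda e: handled.append("zero")),
])

catcher(Alarm("a message"))
catcher(ValueError("divide by 0"))
try:
    catcher(KeyError("unmatched"))
except KeyError as exc:
    handled.append("rethrown")
    unmatched_is_clean = getattr(exc, "clean", False)
```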
The unified "current exception" is $!. Everywhere this RFC uses
$@, it should be read as $! instead. (And the too-precious @@
goes away entirely in favor of an array stored internally to the $!
object that can be accessed as @$! or $![-1].) (For the legacy
Perl 5 parser, $@ and $? will be emulated, but that will
not be available to the Perl 6 parser.)
Also note that the CATCH block implicitly supplies a rethrow (the
die above) after the cases of the switch statement. This will not
be reached if the user has supplied an explicit default case, since
the break of that default case will always bypass the implicit
die. And if the switch rethrows the exception (either explicitly or
implicitly), $! is not marked as clean, since the die will bypass
the code that marks the exception as "cleanly caught". It should be
considered an invariant that any $! in the normal control flow outside
of a CATCH is considered "cleanly caught", according to the definition
in the RFC. Unclean exceptions should only be seen inside CATCH
blocks, or inside any POST blocks that have to execute while an
exception is propagating to an outer block because the current try
block didn't handle it. (If the current try block does successfully
handle the exception in its CATCH, any POST blocks at the same level
see a $! that is already marked clean.)
RFC:
- eval {die "Can't foo."}; print $@; continues to work as before.
That will instead look like
try { die "Can't foo" }; print $!;
in Perl 6. A try with no CATCH:
try { ... }
is equivalent to:
try { ... CATCH { default { } } }
(And that's another reason I didn't want to use else for the default
case of a switch statement--an else without an if looks really
bizarre...)
Just as an aside, what I'm trying to do here is untangle the exception
trapping semantics of eval from its code parsing and running semantics.
In Perl 6, there is no eval {}. And eval $string really
means something like this:
try { $string.parse.run }
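The same separation of concerns can be sketched in Python, where parsing (`compile`) and running (`eval`) are distinct steps and the trapping is an ordinary `try` around both. The helper name `eval_string` and its `(value, error)` return convention are assumptions made for this illustration only.

```python
def eval_string(source):
    """Parse and run an expression string, trapping any failure."""
    try:
        code = compile(source, "<string>", "eval")  # the "parse" step
        return eval(code, {}), None                 # the "run" step
    except Exception as exc:                        # the "try" part
        return None, exc


value, error = eval_string("6 * 7")
_, parse_error = eval_string("6 *")
_, run_error = eval_string("1 / 0")
```

A parse failure and a run-time failure are both trapped by the same wrapper, which is the point of treating the trapping as separate from the parsing and running.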
RFC:
- This RFC does not require core Perl functions to use exceptions for signalling errors.
However, Perl core functions will by default signal failure using
unthrown proto-exceptions (that is, interesting values of undef)
that can easily be turned into thrown exceptions via die.
By "interesting values of undef", I don't mean undef with properties.
I mean full-fledged exception objects that just happen to return false
from their .defined and .true methods. However, the .str
method successfully returns the error message, and the .int method
returns the error code (if any). That is, they do stringify and numify
like $! ought to. An exception becomes defined and true when it
is thrown. (Control exceptions become false when cleanly caught,
to avoid spoofing old-style exception handlers.)
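A small Python sketch may make the idea of an unthrown proto-exception more concrete. Everything here is hypothetical (`ProtoException`, `failing_open`, the `thrown` flag), and Python has no direct counterpart to .defined, so only truthiness, stringification, and numification are modeled.

```python
class ProtoException(Exception):
    """An exception object that tests false until it is actually thrown."""

    def __init__(self, message, code=0):
        super().__init__(message)
        self.code = code
        self.thrown = False

    def __bool__(self):
        return self.thrown      # false while unthrown, true once thrown

    def __int__(self):
        return self.code        # numifies to the error code

    def throw(self):
        self.thrown = True      # becomes true when it is thrown
        raise self


def failing_open(path):
    # Hypothetical core function: signals failure by returning, not raising.
    return ProtoException(f"Can't open {path}", code=2)


result = failing_open("foo")
message = str(result)           # stringifies to the error message
try:
    result or result.throw()    # same shape as: open(...) or die
except ProtoException as exc:
    thrown_is_true = bool(exc)
```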
RFC:
- This means that all exceptions propagate unless they are cleanly caught, just as in Perl 5. To prevent this, use:
try { fragile(); } catch { } # Go on no matter what.
This will simply be:
try { fragile; }
But it means the same thing, and it's still the case that all
exceptions propagate unless they are cleanly caught. In this case, the
caught exception lives on in $! as a new proto-exception that could
be rethrown by a new die, much as we used to use $@. Whether an
exception is currently considered "cleanly caught" can be reflected in
the state of the $! object itself. When $! passes through the
end of a CATCH, it is marked as clean, so that subsequent attempts to
establish a new $! know that they can clear out the old @$!
stack. (If the current $! is not clean, it should just add its
information without deleting the old information--otherwise an error in
a CATCH could delete the exception information you will soon be wanting
to print out.)
RFC:
try { ... } catch <test> => { ... } finally { ... }
Now:
{ ... CATCH { when <test> { ... } } POST { ... } }
(The angle brackets aren't really there--I'm just copying the RFC's metasyntax here.)
Note that we're assuming a test that matches the "boolean" entry from the switch dwimmery matrix. If not, you can always wrap closure curlies around the test:
{ ... CATCH { when { <test> } { ... } } POST { ... } }
That will force the test to be called as a subroutine that ignores its
argument, which happens to be $!, the exception object. (Recall
that the implied "given" of a CATCH statement sets $! as the given
value. That given value is automatically passed to any "when" cases that
look like subroutines or closures, which are free either to ignore the passed
value, or access it as $_ or $^a.)
Or you might just prefer to use the unary true operator:
{ ... CATCH { when true <test> { ... } } POST { ... } }
I personally find that more readable than the closure.
RFC:
- The test argument of the catch clause is optional, and is described below.
The test argument of a when clause is NOT optional, since it would
be impossible to distinguish a conditional closure from the following
block. Use default for the default case.
RFC:
- try, catch, and finally blocks should share the same lexical scope, in the way that while and continue do.
Actually, this is not so--the while and continue blocks
don't share the same lexical scope even in Perl 5. But we'll solve
this issue without "tunneling" in any case. (And we'll change the
continue block into a NEXT block that goes inside, so we
can refer to lexical variables from within it.)
RFC:
- Note that try is a keyword, not a function. This is so that a ; is not needed at the end of the last block. This is because a try/catch/finally now looks more like an if/elsif/else, which does not require such a ;, than like an eval, which does.
Again, this entire distinction goes away in Perl 6. Any expression
block that terminates with a right curly on its own line will be
interpreted as a statement block. And try is such an expression block.
RFC:
- $@ contains the current exception, and @@ contains the current exception stack, as defined above under die. The unshift rule guarantees that $@ == $@[0].
Why an unshift? A stack is most naturally represented in the
other direction, and I can easily imagine some kinds of handlers that
might well treat it like a stack, stripping off some entries and
pushing others.
Also, @@ is a non-starter because everything about the current
exception should all be in a single data structure. Keeping the info all
in one place makes it easy to rethrow an exception without losing data,
even if the exception was marked as cleanly caught. Furthermore I don't
think that the exception stack needs to be Huffman coded that badly.
So $! contains the current exception, and $!.stack accesses the
current exception stack. Through the magic of overloading, the $!
object can likely be used as an array even though it isn't one, in which
case @$! refers to that stack member. The push rule guarantees
that $!.id == $![-1].id.
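The push rule, together with the earlier point that a clean $! lets a new exception clear out the old stack, can be modeled in a few lines of Python. This is a sketch under assumptions: `CurrentException`, `establish`, and `mark_clean` are hypothetical names, and plain strings stand in for exception records.

```python
class CurrentException:
    """Models $! as one object that carries its own exception stack."""

    def __init__(self):
        self.stack = []
        self.clean = True

    def establish(self, record):
        if self.clean:
            self.stack.clear()      # old info was cleanly caught: safe to drop
        self.stack.append(record)   # push, so the newest entry is stack[-1]
        self.clean = False

    def mark_clean(self):
        self.clean = True           # what passing the end of a CATCH would do

    def __getitem__(self, index):
        return self.stack[index]    # lets the object be indexed like its stack


current = CurrentException()
current.establish("first failure")
current.establish("error raised inside CATCH")   # unclean: info accumulates
accumulated = list(current.stack)

current.mark_clean()
current.establish("unrelated later failure")     # clean: old stack is cleared
```

Pushing rather than unshifting means the most recent entry is always at index -1, which is the invariant the text states.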
RFC (speaking of the exception declaration):
- If the given name matches /::/, something like this happens:
  @MyError::App::DB::Foo::ISA = 'MyError::App::DB';
  and all non-existent parent classes are automatically created as inheriting from their parent, or Exception in the tail case. If a parent class is found to exist and not inherit from Exception, a run-time error exception is raised.
If I understand this, I think I disagree. A package ought to be able to contain exceptions without being an exception class itself. There certainly ought to be a shorthand for exceptions within the current package. I suspect they're inner classes of some sort, or inner classes of an inner package, or some such.
RFC:
- If the given name does not match /::/ (say it's just Alarm), this happens instead:
  @Alarm::ISA = 'Exception';
  This means that every exception class isa Exception, even if Exception:: is not used at the beginning of the class name.
Ack! This could be really bad. What if two different modules declare
an Alarm exception with different derivations?
I think we need to say that unqualified exceptions are created within the current package, or maybe within the X subpackage of the current package. If we have inner classes, they could even be lexically scoped (and hence anonymous exceptions outside the current module). That might or might not be a feature.
I also happen to think that Exception is too long a name to prefix
most common exceptions, even though they're derived from that class. I
think exceptions will be better accepted if they have pithier names
like X::Errno that are derived from Exception:
our class X::Control is Exception;
our class X::Errno is Exception;
our class X::NumericError is Exception;
our class C::NEXT is X::Control;
our class E::NOSPC is X::Errno;
our class X::FloatingUnderflow is X::NumericError;
Or maybe those could be:
c::NEXT
e::NOSPC
x::FloatingUnderflow
if we decide uppercase names are too much like user-defined package
names. But that looks strange. Maybe we just reserve single letter
top-level package names for Perl. Heck, let's just reserve all
top-level package names for Perl. Er, no, wait... :-)
RFC 80 suggests that exception objects numify to the system's errno number when those are available. That's a possibility, though by the current switch rules we might have to write
CATCH {
    when +$ENOSPC { ... }
}
to force $ENOSPC to do a numeric comparison. It may well be better
to go ahead and make the errno numbers into exception classes, even
if we have to write something like this:
CATCH {
    when X::ENOSPC { ... }
}
That's longer, but I think it's clearer. Possibly that's E::NOSPC
instead. But in any event, I can't imagine getting people to prefix
every exception with "Exception::". That's just gonna discourage
people from using exceptions. I'm quite willing to at least reserve
the X top-level class for exceptions. I think X:: is quite
sufficiently distinctive.
RFC:
try { my $f = open "foo"; ... } finally { $f and close $f; }
Now:
{
    my $f = open "foo"; ...
    POST { $f and close $f }
}
Note that $f is naturally in scope and guaranteed to have a
boolean value, even if the exception is thrown before the declaration
statement is elaborated! (An implementation need not allocate an
actual variable before the my. The code of the POST block could
always be compiled to know that $f is to be assumed undefined if
the allocating code has not yet been reached.)
We could go as far as to make
POST { close $f }
do something reasonable even without the guard. Maybe an undefined
object could "emulate" any method for you within a POST. Maybe try
is really a unary operator:
POST { try close $f }
Or some such. I dunno. This needs more thought along transactional lines...
Time passes...
Actually, now that I've thought on it, it would be pretty easy to put
wrappers around POST blocks that could do commit or rollback depending
on whether the block exits normally. I'd like to call them KEEP
and UNDO. KEEP blocks would only be executed if the block succeeded.
UNDO blocks would only be executed if the block failed. One could
even envision a syntax that ties the block to a particular variable:
UNDO $f { close $f }
After all, like the CATCH block, all of these blocks are just fancy
BEGIN blocks that attach some meaning to some predefined property of
the block.
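The commit-or-rollback idea can be sketched in Python with a context manager. This is an analogy under assumptions, not the Perl 6 design: `transactional` and its `keep`, `undo`, and `post` parameters are hypothetical, and running KEEP or UNDO before POST is a choice made for this sketch, since the text does not specify an order.

```python
from contextlib import contextmanager


@contextmanager
def transactional(keep=(), undo=(), post=()):
    """Run keep hooks on success, undo hooks on failure, post hooks always."""
    try:
        yield
    except BaseException:
        for hook in undo:            # UNDO: only if the block failed
            hook()
        raise                        # the exception keeps propagating
    else:
        for hook in keep:            # KEEP: only if the block succeeded
            hook()
    finally:
        for hook in reversed(post):  # POST: unconditional cleanup
            hook()


events = []
hooks = dict(
    keep=[lambda: events.append("KEEP")],
    undo=[lambda: events.append("UNDO")],
    post=[lambda: events.append("POST")],
)

with transactional(**hooks):
    pass                             # block exits normally

try:
    with transactional(**hooks):
        raise RuntimeError("failed") # block exits via exception
except RuntimeError:
    events.append("propagated")
```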


