
Apocalypse 4
by Larry Wall | Pages: 1, 2, 3, 4, 5, 6, 7, 8, 9

Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 04 for the latest information.

RFC 088: Omnibus Structured Exception/Error Handling Mechanism

This RFC posits some requirements for exception handling (all of which I agree with), but I do have some additional requirements of my own:

  • The exception-catching syntax must be considered a form of switch statement.
  • It should be easy to turn any kind of block into a "try" block, especially a subroutine.
  • Even try-less try blocks must also be able to specify mandatory cleanup on exit.
  • It should be relatively easy to determine how much cleanup is necessary regardless of how a block was exited.
  • It must be possible to base the operation of return, next, and last on exception handling.
  • The cleanup mechanism should mesh nicely with the notions of post condition processing under design-by-contract.
  • The exception-trapping syntax must not violate encapsulation of lexical scopes.
  • At the same time, the exception-trapping syntax should not force declarations out of their natural scope.
  • Non-linear control flow must stand out visually, making good use of block structure, indentation and even keyword case. BEGIN and END blocks are to be considered prior art.
  • Not-yet-thrown exceptions must be a useful concept.
  • Compatibility with the syntax of any other language is specifically NOT a goal.

RFC 88 is massive, weighing in at more than 2400 lines. Annotating the entire RFC would make this Apocalypse far too big. ("Too late!" says Damian.) Nonetheless, I will take the approach of quoting various bits of the RFC and recasting those bits to work with my additional requirements. Hopefully this will convey my tweaks most succinctly.

Here's what the RFC gives as its first example:

    exception 'Alarm';
    try {
        throw Alarm "a message", tag => "ABC.1234", ... ;
        }

    catch Alarm => { ... }

    catch Error::DB, Error::IO => { ... }

    catch $@ =~ /divide by 0/ => { ... }

    catch { ... }

    finally { ... }

Here's how I see that being written in Perl 6:

    my class X::Alarm is Exception { }     # inner class syntax?
    try {
        throw X::Alarm "a message", tag => "ABC.1234", ... ;
        CATCH {
            when X::Alarm             { ... }
            when Error::DB, Error::IO { ... }
            when /divide by 0/        { ... }
            default                   { ... }
        }
        POST { ... }
    }

The outer block does not have to be a try block. It could be a subroutine, a loop, or any other kind of block, including an eval string or an entire file. We will call such an outer block a try block, whether or not there is an explicit try keyword.

The biggest change is that the various handlers are moved inside of the try block. In fact, the try keyword itself is mere documentation in our example, since the presence of a CATCH or POST block is sufficient to signal the need for trapping. Note that the POST block is completely independent of the CATCH block. (The POST block has a corresponding PRE block for design-by-contract programmers.) Any of these blocks may be placed anywhere in the surrounding block--they are independent of the surrounding control flow. (They do have to follow any declarations they refer to, of course.) Only one CATCH is allowed, but any number of PRE and POST blocks. (In fact, we may well encourage ourselves to place POST blocks near the constructors to be cleaned up after.) PRE blocks within a particular try block are evaluated in order before anything else in the block. POST blocks will be evaluated in reverse order, though order dependencies between POST blocks are discouraged. POST blocks are evaluated after everything else in the block, including any CATCH.
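For readers who want to see these ordering rules concretely, here is a minimal sketch in Python rather than Perl 6, since the syntax above was still in flux. The `run_block` helper and its parameter names are hypothetical; it models only the order in which PRE, CATCH, and POST code runs, not Perl 6's lexical scoping.

```python
def run_block(body, pre=(), post=(), catch=None):
    """Model of a block with PRE, CATCH and POST parts (hypothetical helper)."""
    log = []
    for p in pre:                 # PRE blocks run first, in declaration order
        p(log)
    try:
        try:
            body(log)
        except Exception as exc:
            if catch is None:
                raise             # bare block without CATCH: keep propagating
            catch(exc, log)       # CATCH runs before any POST block
    finally:
        for p in reversed(post):  # POST blocks run last, in reverse order
            p(log)
    return log


def failing_body(log):
    log.append("body")
    raise ValueError("boom")


trace = run_block(
    failing_body,
    pre=[lambda log: log.append("pre1"), lambda log: log.append("pre2")],
    post=[lambda log: log.append("post1"), lambda log: log.append("post2")],
    catch=lambda exc, log: log.append("catch"),
)
print(trace)  # ['pre1', 'pre2', 'body', 'catch', 'post2', 'post1']
```

The `finally` clause is what lets the POST callbacks run whether or not the body or the handler raised.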

A try {} without a CATCH is equivalent to Perl 5's eval {}. (In fact, eval will go back to evaluating only strings in Perl 6, and try will evaluate only blocks.)

The CATCH and POST blocks are naturally in the lexical scope of the try block. They may safely refer to lexically scoped variables declared earlier in the try block, even if the exception is thrown during the elaboration sequence. (The run-time system will guarantee that individual variables test as undefined (and hence false) before they are elaborated.)

The inside of the CATCH block is precisely the syntax of a switch statement. The discriminant of the switch statement is the exception object, $!. Since the exception object stringifies to the error message, the when /divide by 0/ case need not be explicitly compared against $!. Likewise, explicit mention of a declared class implies an "isa" lookup, another built-in feature of the new switch statement.

In fact, a CATCH of the form:

    CATCH { 
        when xxx { ... }          # 1st case
        when yyy { ... }          # 2nd case
        ...                       # other cases, maybe a default
    }

means something vaguely like:

    BEGIN {
        %MY.catcher = {
            given current_exception() -> $! {
                when xxx { ... }          # 1st case from above
                when yyy { ... }          # 2nd case from above
                ...                       # other cases, maybe a default
                die;            # rethrow $! as implicit default
            }
            $!.markclean;       # handled cleanly, in theory
        }
    }

The unified "current exception" is $!. Everywhere this RFC uses $@, it should be read as $! instead. (And the too-precious @@ goes away entirely in favor of an array stored internally to the $! object that can be accessed as @$! or $![-1].) (For the legacy Perl 5 parser, $@ and $? will be emulated, but that will not be available to the Perl 6 parser.)

Also note that the CATCH block implicitly supplies a rethrow (the die above) after the cases of the switch statement. This will not be reached if the user has supplied an explicit default case, since the break of that default case will always bypass the implicit die. And if the switch rethrows the exception (either explicitly or implicitly), $! is not marked as clean, since the die will bypass the code that marks the exception as "cleanly caught". It should be considered an invariant that any $! in the normal control flow outside of a CATCH is considered "cleanly caught", according to the definition in the RFC. Unclean exceptions should only be seen inside CATCH blocks, or inside any POST blocks that have to execute while an exception is propagating to an outer block because the current try block didn't handle it. (If the current try block does successfully handle the exception in its CATCH, any POST blocks at the same level see a $! that is already marked clean.)
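The dispatch and rethrow rules can be modeled in a few lines. This is a hedged sketch in Python, not Perl 6: the `catcher` function, the `Alarm` class, and the `clean` attribute are all hypothetical stand-ins for the CATCH block, `X::Alarm`, and the "cleanly caught" mark.

```python
import re


class Alarm(Exception):
    """Hypothetical exception class standing in for X::Alarm."""


def catcher(exc, cases):
    """Try each (test, handler) pair in order, like the 'when' clauses of a CATCH."""
    for test, handler in cases:
        if isinstance(test, type):
            matched = isinstance(exc, test)                   # class name: "isa" test
        else:
            matched = re.search(test, str(exc)) is not None   # pattern: match the message
        if matched:
            handler(exc)
            exc.clean = True   # reached only if the handler finished normally
            return exc
    raise exc                  # implicit default: rethrow, never marked clean


handled = []
cases = [
    (Alarm, lambda e: handled.append("alarm")),
    (r"divide by 0", lambda e: handled.append("division")),
]
catcher(Alarm("a message"), cases)
catcher(RuntimeError("divide by 0"), cases)
try:
    catcher(KeyError("unrelated"), cases)
except KeyError as e:
    handled.append("rethrown, clean=%s" % getattr(e, "clean", False))
print(handled)  # ['alarm', 'division', 'rethrown, clean=False']
```

Because the clean mark is set after the handler returns, a handler that itself raises also leaves the exception unmarked, which matches the invariant described above.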

RFC:

eval {die "Can't foo."}; print $@; continues to work as before.

That will instead look like

    try { die "Can't foo" }; print $!;

in Perl 6. A try with no CATCH:

    try { ... }

is equivalent to:

    try { ... CATCH { default { } } }

(And that's another reason I didn't want to use else for the default case of a switch statement--an else without an if looks really bizarre...)

Just as an aside, what I'm trying to do here is untangle the exception trapping semantics of eval from its code parsing and running semantics. In Perl 6, there is no eval {}. And eval $string really means something like this:

    try { $string.parse.run }
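The same separation of parsing, running, and trapping can be shown in Python, used here only as an analogy. The `eval_string` name and its return convention are assumptions made for this sketch.

```python
def eval_string(source):
    """Return (value, None) on success or (None, exception) on failure."""
    try:
        code = compile(source, "<string>", "eval")   # parse step
        return eval(code), None                      # run step
    except Exception as exc:                         # trap step
        return None, exc


value, error = eval_string("6 * 7")
_, run_error = eval_string("1 / 0")
_, parse_error = eval_string("6 *")
print(value, type(run_error).__name__, type(parse_error).__name__)
# 42 ZeroDivisionError SyntaxError
```

Both parse-time and run-time failures end up in the same trap, which is the behavior the one-liner above describes.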

RFC:

This RFC does not require core Perl functions to use exceptions for signalling errors.

However, Perl core functions will by default signal failure using unthrown proto-exceptions (that is, interesting values of undef) that can easily be turned into thrown exceptions via die. By "interesting values of undef", I don't mean undef with properties. I mean full-fledged exception objects that just happen to return false from their .defined and .true methods. However, the .str method successfully returns the error message, and the .int method returns the error code (if any). That is, they do stringify and numify like $! ought to. An exception becomes defined and true when it is thrown. (Control exceptions become false when cleanly caught, to avoid spoofing old-style exception handlers.)
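As an illustration of what such a proto-exception might look like, here is a small Python model. Everything in it (`Failure`, `FailureError`, `open_file`, and the error code 2) is hypothetical; it only shows an object that tests false until thrown while still carrying its message and code.

```python
class FailureError(Exception):
    """Hypothetical carrier used when a Failure is actually thrown."""

    def __init__(self, failure):
        super().__init__(str(failure))
        self.failure = failure


class Failure:
    """Hypothetical model of an unthrown exception: false until thrown."""

    def __init__(self, message, code=0):
        self.message = message
        self.code = code
        self.thrown = False

    def __bool__(self):          # plays the role of .true
        return self.thrown

    def __str__(self):           # plays the role of .str
        return self.message

    def __int__(self):           # plays the role of .int
        return self.code

    def throw(self):             # plays the role of die
        self.thrown = True
        raise FailureError(self)


def open_file(path):
    """Hypothetical core function that returns a Failure instead of raising."""
    return Failure("No such file: " + path, code=2)


result = open_file("foo")
if not result:                   # unthrown, so it tests false
    summary = "%s (code %d)" % (result, int(result))
print(summary)  # No such file: foo (code 2)
```

A caller that prefers exceptions can call `result.throw()`, after which the object tests true.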

RFC:

This means that all exceptions propagate unless they are cleanly caught, just as in Perl 5. To prevent this, use:

    try { fragile(); } catch { } # Go on no matter what.

This will simply be:

    try { fragile; }

But it means the same thing, and it's still the case that all exceptions propagate unless they are cleanly caught. In this case, the caught exception lives on in $! as a new proto-exception that could be rethrown by a new die, much as we used to use $@. Whether an exception is currently considered "cleanly caught" can be reflected in the state of the $! object itself. When $! passes through the end of a CATCH, it is marked as clean, so that subsequent attempts to establish a new $! know that they can clear out the old @$! stack. (If the current $! is not clean, it should just add its information without deleting the old information--otherwise an error in a CATCH could delete the exception information you will soon be wanting to print out.)

RFC:

    try { ... } catch <test> => { ... } finally { ... }

Now:

    { ... CATCH { when <test> { ... } } POST { ... } }

(The angle brackets aren't really there--I'm just copying the RFC's metasyntax here.)

Note that we're assuming a test that matches the "boolean" entry from the switch dwimmery matrix. If not, you can always wrap closure curlies around the test:

    { ... CATCH { when { <test> } { ... } } POST { ... } }

That will force the test to be called as a subroutine that ignores its argument, which happens to be $!, the exception object. (Recall that the implied "given" of a CATCH statement sets $! as the given value. That given value is automatically passed to any "when" cases that look like subroutines or closures, which are free either to ignore the passed value, or access it as $_ or $^a.)

Or you might just prefer to use the unary true operator:

    { ... CATCH { when true <test> { ... } } POST { ... } }

I personally find that more readable than the closure.

RFC:

The test argument of the catch clause is optional, and is described below.

The test argument of a when clause is NOT optional, since it would be impossible to distinguish a conditional closure from the following block. Use default for the default case.

RFC:

try, catch, and finally blocks should share the same lexical scope, in the way that while and continue do.

Actually, this is not so--the while and continue blocks don't share the same lexical scope even in Perl 5. But we'll solve this issue without "tunneling" in any case. (And we'll change the continue block into a NEXT block that goes inside, so we can refer to lexical variables from within it.)

RFC:

Note that try is a keyword, not a function. This is so that a ; is not needed at the end of the last block. This is because a try/catch/finally now looks more like an if/elsif/else, which does not require such a ;, than like an eval, which does.

Again, this entire distinction goes away in Perl 6. Any expression block that terminates with a right curly on its own line will be interpreted as a statement block. And try is such an expression block.

RFC:

$@ contains the current exception, and @@ contains the current exception stack, as defined above under die. The unshift rule guarantees that $@ == $@[0].

Why an unshift? A stack is most naturally represented in the other direction, and I can easily imagine some kinds of handlers that might well treat it like a stack, stripping off some entries and pushing others.

Also, @@ is a non-starter because everything about the current exception should all be in a single data structure. Keeping the info all in one place makes it easy to rethrow an exception without losing data, even if the exception was marked as cleanly caught. Furthermore I don't think that the exception stack needs to be Huffman coded that badly.

So $! contains the current exception, and $!.stack accesses the current exception stack. Through the magic of overloading, the $! object can likely be used as an array even though it isn't one, in which case @$! refers to that stack member. The push rule guarantees that $!.id == $![-1].id.
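The push rule, together with the earlier rule that a clean $! lets the old stack be cleared, can be sketched as follows. This is a Python model under stated assumptions: the `CurrentException` class and its `record` and `mark_clean` methods are hypothetical names, not part of any Perl 6 design.

```python
class CurrentException:
    """Hypothetical model of $!: one object holding the whole exception stack."""

    def __init__(self):
        self.stack = []
        self.clean = True

    def record(self, exc):
        if self.clean:
            self.stack.clear()   # previous exception was handled: safe to discard
        self.stack.append(exc)   # push, so the newest entry is always stack[-1]
        self.clean = False

    def mark_clean(self):        # what passing the end of a CATCH would do
        self.clean = True

    def __getitem__(self, index):  # lets the object be indexed like its stack
        return self.stack[index]


current = CurrentException()
current.record(ValueError("first"))
current.record(KeyError("raised inside the handler"))
depth_while_unclean = len(current.stack)
current.mark_clean()
current.record(OSError("later, unrelated"))
depth_after_clean = len(current.stack)
print(depth_while_unclean, depth_after_clean)  # 2 1
```

The second `record` call keeps the first exception because it was not yet clean, so an error inside a handler does not erase the information being handled.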

RFC (speaking of the exception declaration):

If the given name matches /::/, something like this happens:

    @MyError::App::DB::Foo::ISA = 'MyError::App::DB';

and all non-existent parent classes are automatically created as inheriting from their parent, or Exception in the tail case. If a parent class is found to exist and not inherit from Exception, a run-time error exception is raised.

If I understand this, I think I disagree. A package ought to be able to contain exceptions without being an exception class itself. There certainly ought to be a shorthand for exceptions within the current package. I suspect they're inner classes of some sort, or inner classes of an inner package, or some such.

RFC:

If the given name does not match /::/ (say it's just Alarm), this happens instead:

    @Alarm::ISA = 'Exception';

This means that every exception class isa Exception, even if Exception:: is not used at the beginning of the class name.

Ack! This could be really bad. What if two different modules declare an Alarm exception with different derivations?

I think we need to say that unqualified exceptions are created within the current package, or maybe within the X subpackage of the current package. If we have inner classes, they could even be lexically scoped (and hence anonymous exceptions outside the current module). That might or might not be a feature.

I also happen to think that Exception is too long a name to prefix most common exceptions, even though they're derived from that class. I think exceptions will be better accepted if they have pithier names like X::Errno that are derived from Exception:

    our class X::Control is Exception;
    our class X::Errno is Exception;
    our class X::NumericError is Exception;
    our class C::NEXT is X::Control;
    our class E::NOSPC is X::Errno;
    our class X::FloatingUnderflow is X::NumericError;

Or maybe those could be:

    c::NEXT
    e::NOSPC
    x::FloatingUnderflow

if we decide uppercase names are too much like user-defined package names. But that looks strange. Maybe we just reserve single letter top-level package names for Perl. Heck, let's just reserve all top-level package names for Perl. Er, no, wait... :-)

RFC 80 suggests that exception objects numerify to the system's errno number when those are available. That's a possibility, though by the current switch rules we might have to write

    CATCH {
        when +$ENOSPC { ... }
    }

to force $ENOSPC to do a numeric comparison. It may well be better to go ahead and make the errno numbers into exception classes, even if we have to write something like this:

    CATCH {
        when X::ENOSPC { ... }
    }

That's longer, but I think it's clearer. Possibly that's E::NOSPC instead. But in any event, I can't imagine getting people to prefix every exception with "Exception::". That's just gonna discourage people from using exceptions. I'm quite willing to at least reserve the X top-level class for exceptions. I think X:: is quite sufficiently distinctive.
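To make the errno-as-class idea concrete, here is a hedged Python sketch. The class names `XErrno` and `XENOSPC`, the `ERRNO_CLASSES` table, and the `exception_for` helper are hypothetical stand-ins for X::Errno and X::ENOSPC; only `errno.ENOSPC` and `OSError` come from Python's standard library.

```python
import errno


class XErrno(OSError):
    """Hypothetical base class standing in for X::Errno."""


class XENOSPC(XErrno):
    """Hypothetical class standing in for X::ENOSPC."""


# Map errno numbers to classes; unknown numbers fall back to the base class.
ERRNO_CLASSES = {errno.ENOSPC: XENOSPC}


def exception_for(code, message):
    """Build the most specific exception class known for an errno number."""
    cls = ERRNO_CLASSES.get(code, XErrno)
    return cls(code, message)


try:
    raise exception_for(errno.ENOSPC, "No space left on device")
except XENOSPC as exc:           # matched by class, no numeric comparison needed
    outcome = "disk full"
    code_seen = exc.errno
print(outcome)  # disk full
```

The handler selects on the class name while the numeric code stays available on the object for callers that still want it.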

RFC:

    try { my $f = open "foo"; ... } finally { $f and close $f; }

Now:

    {
        my $f = open "foo"; ...
        POST { $f and close $f }
    }

Note that $f is naturally in scope and guaranteed to have a boolean value, even if the exception is thrown before the declaration statement is elaborated! (An implementation need not allocate an actual variable before the my. The code of the POST block could always be compiled to know that $f is to be assumed undefined if the allocating code has not yet been reached.)

We could go as far as to make

        POST { close $f }

do something reasonable even without the guard. Maybe an undefined object could "emulate" any method for you within a POST. Maybe try is really a unary operator:

        POST { try close $f }

Or some such. I dunno. This needs more thought along transactional lines...

Time passes...

Actually, now that I've thought on it, it would be pretty easy to put wrappers around POST blocks that could do commit or rollback depending on whether the block exits normally. I'd like to call them KEEP and UNDO. KEEP blocks would only be executed if the block succeeded. UNDO blocks would only be executed if the block failed. One could even envision a syntax that ties the block to a particular variable:

    UNDO $f { close $f }

After all, like the CATCH block, all of these blocks are just fancy BEGIN blocks that attach some meaning to some predefined property of the block.
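The commit-or-rollback behavior of KEEP and UNDO can be approximated with a Python context manager. This is only a sketch of the idea: the `transactional` name and its `keep`, `undo`, and `post` parameters are hypothetical, and running KEEP or UNDO before POST is an assumption made here rather than something the text above specifies.

```python
from contextlib import contextmanager


@contextmanager
def transactional(keep=None, undo=None, post=None):
    """Run keep on success, undo on failure, and post in either case."""
    try:
        yield
    except BaseException:
        if undo:
            undo()               # UNDO: only when the block failed
        raise
    else:
        if keep:
            keep()               # KEEP: only when the block succeeded
    finally:
        if post:
            post()               # POST: always


events = []


def note(label):
    return lambda: events.append(label)


with transactional(keep=note("keep"), undo=note("undo"), post=note("post")):
    events.append("work")

try:
    with transactional(keep=note("keep"), undo=note("undo"), post=note("post")):
        raise RuntimeError("failed")
except RuntimeError:
    pass

print(events)  # ['work', 'keep', 'post', 'undo', 'post']
```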
