Exegesis 4
by Damian Conway
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 4 for the current design information.
Other whens
The remaining cases of the data look-up are handled by subsequent when statements. The first:
when 'previous' { return %var{""} // fail NoData }
handles the special keyword "previous". The previous value is always stored
in the element of %var whose key is the empty string.
If, however, that previous value is undefined, then the defaulting operator -- // -- causes the right-hand side of the expression to be evaluated instead.
That right-hand side is a call to the fail method of class NoData
(and could equally have been written NoData.fail()).
The standard fail method inherited from the Exception class constructs an instance of the appropriate class (i.e. an exception object) and then either throws that exception (if the use fatal pragma is in effect) or else returns an undef value from the scope in which the fail was invoked. That is, the fail acts like a die SomeExceptionClass or a return undef, depending on the state of the use fatal pragma.
This is possible because, in Perl 6, all flow-of-control -- including the normal subroutine return -- is exception-based. So, when it is supposed to act like a return, the Exception::fail method simply throws the special Ctl::Return exception, which get_data's caller will (automagically) catch and treat as a normal return.
So then why not just write the usual:
return undef;
instead?
The advantage of using fail is that it allows the
callers of get_data to decide how that subroutine should signal failure. As explained above, normally fail fails by returning undef. But if a use fatal pragma is in effect, any invocation of fail instead throws the corresponding exception.
What's the advantage in that? Well, some people feel that certain types of failures ought to be taken deadly seriously (i.e. they should kill you unless you explicitly catch and handle them). Others feel that the same errors really aren't all that serious and you should be allowed to, like, chill man and just groove with the heavy consequences, dude.
The fail method allows you, the coder, to stay well out of that kind of fruitless religious debate.
When you use fail to signal failure, not only is the code nicely
documented at that point, but the mode of failure becomes caller-selectable.
Fanatics can use fatal and make each failure punishable by death;
hippies can say no fatal and make each failure just return undef.
You no longer have to get caught up in endless debate as to whether the exception-catching:
try { $data = get_data($str) }
// warn "Couldn't get data";
is inherently better or worse than the undef-sensing:
do { $data = get_data($str) }
// warn "Couldn't get data";
Instead, you can just write get_data such that There's More Than One Way To Fail It.
By the way, fail can fail in other ways, too: in different contexts or under different pragmas. The most obvious example would be inside a regex, where it would initiate back-tracking. More on that in Apocalypse 5.
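For readers who think better in running code, here is a minimal sketch of the caller-selectable failure idea, written in Python purely as an analogy. The names SETTINGS, fail and get_previous are invented for this illustration, and a shared dictionary flag is only a crude stand-in for the lexically scoped use fatal pragma. Note also that the Perl 6 fail returns from the enclosing subroutine by itself, whereas here the caller has to pass the None along:

```python
class NoData(Exception):
    """Stand-in for the article's NoData exception class."""


# Hypothetical switch playing the role of the `use fatal` pragma.
SETTINGS = {"fatal": False}


def fail(exc_class, *args):
    """Raise exc_class when 'fatal' is on; otherwise return None (think: undef)."""
    if SETTINGS["fatal"]:
        raise exc_class(*args)
    return None


def get_previous(var):
    """Return the value cached under the empty-string key, or fail."""
    value = var.get("")
    # Mimics `%var{""} // fail NoData`: only an undefined value triggers fail.
    return value if value is not None else fail(NoData, "no previous value")
```

With the flag off, get_previous({}) quietly hands back None; with it on, the same call raises NoData. The subroutine itself never has to take sides.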
Still Other Whens
Meanwhile, if $data isn't a number or the "previous" keyword, then maybe it's
the name of one of the calculator's variables. The third when statement of the switch tests for that:
when %var { return %var{""} = %var{$_} }
If a when is given a hash, then it uses the current topic as a key in the hash and looks up the corresponding entry. If that value is true, then it executes its block. In this case, that block caches the value that was looked up (i.e. %var{$_}) in the "previous" slot and returns it.
"Aha!" you say, "that's a bug! What if the value of %var{$_} is false?!"
Well, if it were possible for that to ever happen, then it certainly would be a bug, and we'd have to write something ugly:
when defined %var{$_} { return %var{""} = %var{$_} }
But, of course, it's much easier just to redefine Truth, so that any
literal zero value stored in %var is no longer false. See below.
Finally, if $data isn't a literal, a "previous", or a variable name,
it must be an invalid token, so the default alternative in the switch statement throws an Err::BadData exception:
default { die Err::BadData : msg=>"Don't understand $_" }
Note that, here again, we are actually executing a method call to:
Err::BadData.die(msg=>"Don't understand $_");
as indicated by the use of the colon after the classname.
Of course, by using die instead of fail here, we're giving clients of the get_data subroutine no choice but to deal with Err::BadData exceptions.
An Aside: the "Smart Match" Operator
The rules governing how the argument of a when is matched against the
current topic are designed to be as DWIMish as possible. Which means that
they are actually quite complex. They're listed in Apocalypse 4, so we won't review them here.
Collectively, the rules are designed to provide a generic "best attempt at matching" behavior. That is, given two values (the current topic and the when's first argument), they try to determine whether those values can be combined to produce a "smart match" -- for some reasonable definitions of "smart" and "match."
That means that one possible use of a Perl 6 switch statement is simply to test whether two values match without worrying about how those two values match:
sub hey_just_see_if_dey_match_willya ($val1, $val2) {
    given $val1 {
        when $val2 { return 1 }
        default    { return 0 }
    }
}
That behavior is sufficiently useful that Larry wanted to make it much easier to use. Specifically, he wanted to provide a generic "smart match" operator.
So he did. It's called =~.
Yes, the humble Perl 5 "match a string against a regex" operator is promoted in Perl 6 to a "smart-match an anything against an anything" operator. So now:
if ($val1 =~ $val2) {...}
works out the most appropriate way to compare its two scalar operands. The result might be a numeric comparison ($val1 == $val2) or a string comparison ($val1 eq $val2) or a subroutine call ($val1.($val2))
or a pattern match ($val1 =~ /$val2/) or whatever else makes the most
sense for the actual run-time types of the two operands.
This new turbo-charged "smart match" operator will also work on arrays, hashes and lists:
if @array =~ $elem {...}        # true if @array contains $elem
if $key =~ %hash {...}          # true if %hash{$key}
if $value =~ (1..10) {...}      # true if $value is in the list
if $value =~ ('a',/\s/,7) {...} # true if $value is eq to 'a'
                                #   or if $value contains whitespace
                                #   or if $value is == to 7
That final example illustrates some of the extra intelligence that Perl 6's =~ has: When one of its arguments is a list (not an array), the "smart match" operator recursively "smart matches" each element
and ORs the results together, short-circuiting if possible.
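The recursive, short-circuiting behavior can be approximated in a few lines. What follows is only a rough Python analogy, not the actual Perl 6 rules (those live in Apocalypse 4 and cover many more cases). The function name smart_match is invented here, a Python tuple stands in for a Perl list, and Python's == stands in for the == versus eq choice:

```python
import re


def smart_match(topic, pattern):
    """Very rough approximation of a 'smart match' between two values."""
    if isinstance(pattern, tuple):
        # A list of alternatives: match each one recursively and OR the
        # results. any() stops at the first success, i.e. it short-circuits.
        return any(smart_match(topic, alt) for alt in pattern)
    if isinstance(pattern, re.Pattern):
        # Pattern match (only strings are searched in this sketch).
        return isinstance(topic, str) and pattern.search(topic) is not None
    if isinstance(pattern, dict):
        # Hash: true if the entry for that key is true.
        return bool(pattern.get(topic))
    if callable(pattern):
        # Subroutine: call it with the topic.
        return bool(pattern(topic))
    # Fallback: plain equality.
    return topic == pattern
```

So smart_match("x y", ("a", re.compile(r"\s"), 7)) succeeds on the second alternative and never looks at the third.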
Being Calculating
The next component of the program is the subroutine that computes the actual results of each expression that the user enters. It takes a string to be evaluated and an integer indicating the current iteration number of the main input loop (for debugging purposes):
sub calc (str $expr, int $count) {
Give us a little privacy, please
Perl 5 has a really ugly idiom for creating "durable" lexical variables: variables that are lexically scoped but stick around from call to call.
If you write:
sub whatever {
    my $count if 0;
    $count++;
    print "whatever called $count times\n";
}
then the compile-time aspect of a my $count declaration causes $count to be declared as a lexical in the subroutine block. However, at run-time
-- when the variable would normally be (re-)allocated -- the if 0
prevents that process. So the original lexical variable is not
replaced on each invocation, and is instead shared by them all.
This awful if 0 idiom works under most versions of Perl 5,
but it's really just a freakish accident of Perl's evolution,
not a carefully designed and lovingly crafted feature. So
just say "No!".
Perl 6 allows us to do the same thing, but without feeling the need to wash afterward.
To understand how Perl 6 cleans up this idiom, notice that the durable variable is really much more like a package variable that just happens to be accessible only in a particular lexical scope. That kind of restricted-access package variable is going to be quite common in Perl 6 -- as an attribute of a class.
So the way we create such a variable is to declare it as a package variable, but with the is private property:
module Wherever;

sub whatever {
    our $count is private;
    $count++;
    print "whatever called $count times\n";
}
Adding is private causes Perl to recognize the existence of the variable $count within the Wherever module, but then to restrict its accessibility to the lexical scope in which it is first declared. In the above example, any attempt to refer to $Wherever::count outside the &Wherever::whatever subroutine produces a compile-time error. It's still a package variable, but now you can't use it anywhere but in the nominated lexical scope.
Apart from the benefit of replacing an ugly hack with a clean explicit marker on the variable, the real advantage is that Perl 6 private variables can also be initialized:
sub whatever {
    our $count is private //= 1;
    print "whatever called $count times\n";
    $count++;
}
That initialization is performed the first time the variable declaration
is encountered during execution (because that's the only time its value is undef, so that's the only time the //= operator has any effect).
In our example program we use that facility to do a one-time-only initialization of a private package hash. That hash will then be used as a (lexically restricted) look-up table to provide the implementations for a set of operator symbols:
our %operator is private //= (
    '*' => { $^a * $^b },
    '/' => { $^a / $^b },
    '~' => { ($^a + $^b) / 2 },
);
Each key of the hash is an operator symbol and the corresponding value
is an anonymous subroutine that implements the appropriate operation.
Note the use of the "place-holder" variables ($^a and $^b) to
implicitly specify the parameters of the closures.
Since all the data for the %operator hash is constant, we could have
achieved a similar effect with:
my %operator is constant = (
    '*' => { $^a * $^b },
    '/' => { $^a / $^b },
    '~' => { ($^a + $^b) / 2 },
);
Notionally this is quite different from the is private version, in
that -- theoretically -- the lexical constant would be reconstructed and reinitialized on each invocation of the calc subroutine. In practice, though, we would expect the compiler to notice the constant initializer and optimize the initialization out to compile-time.
If the initializer had been a run-time expression, then the is private
and is constant versions would behave very differently:
our %operator is private //= todays_ops();  # Initialize once, the first
                                            # time statement is reached.
                                            # Thereafter may be changed
                                            # at will within subroutine.

my %operator is constant = todays_ops();    # Re-initialize every time
                                            # statement is reached.
                                            # Thereafter constant
                                            # within subroutine.
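The difference in timing is easy to see if we count how often the initializer actually runs. Here is a small sketch in Python, offered only as an analogy: Python has neither lexically restricted package variables nor an is constant property, so a module-level dictionary (_private, a name invented here) plays the durable variable and an ordinary local plays the lexical one. The body of todays_ops is likewise made up for the illustration:

```python
calls = []  # one entry is appended every time the initializer runs


def todays_ops():
    """Hypothetical run-time initializer for the operator table."""
    calls.append(1)
    return {"*": lambda a, b: a * b}


# Plays the role of the durable, privately scoped package variable.
_private = {"operator": None}


def calc_once():
    # Like `//=`: assign only while the variable is still undefined.
    if _private["operator"] is None:
        _private["operator"] = todays_ops()
    return _private["operator"]


def calc_every_time():
    # Like `my ... =`: the table is rebuilt on each invocation.
    operator = todays_ops()
    return operator
```

Calling calc_once twice runs todays_ops a single time; calling calc_every_time twice runs it twice more.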
Let's Split!
We then have to split the input expression into (whitespace-delimited) tokens, in order to parse and execute it. Since the calculator language we're implementing is RPN, we need a stack to store data and interim calculations:
my @stack;
We also need a counter to track the current token number (for error messages):
my $toknum = 1;
Then we just use the standard split built-in to break up the expression
string, and iterate through each of the resulting tokens using a for loop:
for split /\s+/, $expr -> $token {
There are several important features to note in this for loop. To begin with,
there are no parentheses around the list. In Perl 6, they are not required
(they're not needed for any control structure), though they are certainly
still permissible:
for (split /\s+/, $expr) -> $token {
More importantly, the declaration of the iterator variable ($token) is no
longer to the left of the list:
# Perl 5 code
for my $token (split /\s+/, $expr) {
Instead, it is specified via a topical arrow to the right of the list.
By the way, somewhat surprisingly, the Perl 6 arrow operator isn't a binary operator. (Actually, neither is the Perl 5 arrow operator, but that's not important right now.)
Even more surprisingly, what the Perl 6 arrow operator is, is a synonym
for the declarator sub. That's right, in Perl 6 you can declare an
anonymous subroutine like so:
$product_plus_one = -> $x, $y { $x*$y + 1 };
The arrow behaves like an anonymous sub declarator:
$product_plus_one = sub($x, $y) { $x*$y + 1 };
except that its parameter list doesn't require parentheses. That implies:
- The Perl 6 for, while, if, and given statements each take two arguments: an expression that controls them and a subroutine/closure that they execute. Normally, that closure is just a block (in Perl 6 all blocks are really closures):

  for 1..10 {         # no comma needed before opening brace
      print
  }

  but you can also be explicit:

  for 1..10, sub {    # needs comma if a regular anonymous sub
      print
  }

  or you can be pointed:

  for 1..10 -> {      # no comma needed with arrow notation
      print
  }

  or referential:

  for 1..10,          # needs comma if a regular sub reference
      &some_sub;

- The variable after the arrow is effectively a lexical variable confined to the scope of the following block (just as a subroutine parameter is a lexical variable confined to the scope of the subroutine block). Within the block, that lexical becomes an alias for the topic (just as a subroutine parameter becomes an alias for the corresponding argument).

- Topic variables created with the arrow notation are, by default, read-only aliases (because Perl 6 subroutine parameters are, by default, read-only aliases):

  for @list -> $i {
      if ($cmd =~ 'incr') {
          $i++;   # Error: $i is read-only
      }
  }

  Note that the rule doesn't apply to the default topic ($_), which is given special dispensation to be a modifiable alias (as in Perl 5).

- If you want a named topic to be modifiable through its alias, then you have to say so explicitly:

  for @list -> $i is rw {
      if ($cmd =~ 'incr') {
          $i++;   # Okay: $i is read-write
      }
  }

- Just as a subroutine can have more than one parameter, so too we can specify more than one named iterator variable at a time:

  for %phonebook.kv -> $name, $number {
      print "$name: $number\n"
  }

  Note that in Perl 6, a hash in a list context returns a list of pairs, not the Perl 5-ish "key, value, key, value, ..." sequence. To get the hash contents in that format, we have to call the hash's kv method explicitly.

  What actually happens in this iteration (and, in fact, in all such instances) is that the for loop looks at the number of arguments its closure takes and iterates that many elements at a time.

  Note that map and reduce can do that too in Perl 6:

  # process @xs_and_ys two-at-a-time...
  @list_of_powers = map { $^x ** $^y } @xs_and_ys;

  # reduce list three-at-a-time
  $sum_of_powers = reduce { $^partial_sum + $^x ** $^y } 0, @xs_and_ys;

  And, of course, since map and reduce take a subroutine reference as their first argument -- instead of using the higher-order placeholder notation -- we could use the arrow notation here too:

  @list_of_powers = map -> $x, $y { $x ** $y } @xs_and_ys;

  or even an old-fashioned anonymous subroutine:

  @list_of_powers = map sub($x,$y){ $x ** $y }, @xs_and_ys;

Phew. If that all makes your head hurt, then don't worry. All you really need to remember is this: If you don't want to use $_ as the name of the current topic, then you can change it by putting an arrow and a variable name before the block of most control statements.
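The "iterate as many elements at a time as the closure has parameters" trick can be mimicked in other languages too. The sketch below is a Python analogy rather than a description of how Perl 6 implements it. The helper name for_each is invented, it collects the block's results in the manner of map, and it simply assumes the list length is a multiple of the parameter count:

```python
import inspect


def for_each(items, body):
    """Call body on successive groups of items, sized by body's parameter count."""
    arity = len(inspect.signature(body).parameters)
    if arity == 0:
        raise ValueError("body must take at least one parameter")
    results = []
    for start in range(0, len(items), arity):
        chunk = items[start:start + arity]
        # A short final chunk would raise TypeError here; this sketch
        # does not try to guess what should happen to leftovers.
        results.append(body(*chunk))
    return results
```

So for_each(xs_and_ys, lambda x, y: x ** y) walks the list two at a time, while a one-parameter lambda walks it element by element.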

