Apocalypse 5
by Larry Wall
Editor's Note: this Apocalypse is out of date and remains here for historic reasons. See Synopsis 05 for the latest information.
Variable Interpretation
As we mentioned earlier, bare scalars match their contents literally.
(Use <$var> instead to match a regex defined in $var.)
Subscripted arrays and hashes behave just like a scalar as long
as the subscripts aren't slices.
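To make the distinction concrete, here is a rough Python analogy rather than Perl 6 itself. It assumes that "match literally" corresponds to escaping every metacharacter before compiling, and that <$var> corresponds to compiling the string as a pattern. The helper names bare_scalar and angled_scalar are invented for this sketch.

```python
import re

def bare_scalar(var):
    # Analogue of / $var /: the contents are matched character for character.
    return re.compile(re.escape(var))

def angled_scalar(var):
    # Analogue of / <$var> /: the contents are interpreted as a pattern.
    return re.compile(var)

var = "a.c"
literal = bare_scalar(var)
as_regex = angled_scalar(var)

# The literal form only accepts the exact text "a.c".
print(literal.fullmatch("a.c") is not None)
print(literal.fullmatch("abc") is not None)
# The pattern form treats "." as "any character", so "abc" is accepted too.
print(as_regex.fullmatch("abc") is not None)
```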
If you use a bare array (unsubscripted), it will match if any element of the array matches literally at that point. (A slice of an array or hash also behaves this way.) If you say
@array = ("^", "$", ".");
/ @array /
it's as if you said
/ \^ | \$ | \. /
But if you slice it like this:
/ @array[0..1] /
it won't match the dot.
If you want the array to be considered as a set of regex alternatives, enclose it in angles:
@array = ("^foo$", "^bar$", "^baz$");
/ <@array> /
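The three array forms can be imitated in Python as a rough sketch. It assumes a bare array behaves like an alternation of escaped elements and an angled array like an alternation of the elements taken as patterns. The function names are made up for illustration, and Python's ^ and $ are only an approximation of the Perl 6 anchors.

```python
import re

def literal_alternatives(items):
    # Analogue of / @array /: each element is escaped, then joined with "|".
    return re.compile("|".join(re.escape(s) for s in items))

def regex_alternatives(items):
    # Analogue of / <@array> /: each element is used as a pattern in its own group.
    return re.compile("|".join(f"(?:{s})" for s in items))

array = ["^", "$", "."]
bare = literal_alternatives(array)          # like / \^ | \$ | \. /
sliced = literal_alternatives(array[0:2])   # like / @array[0..1] /

print(bare.match(".") is not None)    # the dot is one of the literal alternatives
print(bare.match("x") is not None)    # "." is literal here, not a wildcard
print(sliced.match(".") is not None)  # the slice leaves the dot out

words = regex_alternatives(["^foo$", "^bar$", "^baz$"])
print(words.match("bar") is not None)
print(words.match("barn") is not None)
```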
Bare hashes in a regex provide a sophisticated match-via-lookup mechanism. Bare hashes are matched as follows:
1. Match a key at the current point in the string.
   1a. If the hash has its keymatch property set to some regex, use that regex to match the key.
   1b. Otherwise, use /\w+:/ to match the key.
2. If a key isn't found at the current position in the string, the match fails.
3. Otherwise, get the value in the hash corresponding to the matched key.
4. If there is no entry for that key, the match fails.
5. If the hash doesn't have a valuematch property, the match succeeds immediately.
6. Otherwise use the hash's valuematch property (typically itself a regex) to extract the value at the current point in the string.
7. If no value can be extracted, matching of the hash fails.
8. If the extracted value string is eq to the key's actual value, matching of the original hash immediately succeeds.
9. Otherwise, matching of the original hash fails.
So matching a bare hash is equivalent to:
rule {
    $key := <{ %hash.prop{keymatch} // /\w+:/ }>    # find key
    <( exists %hash{$key} )>                        # if exists
    [ <( not defined %hash.prop{valuematch} )> ::   # done?
        <null>                                      # succeed
    |                                               # else
        $val := <%hash.prop{valuematch}>            # find value
        <( $val eq %hash{$key} )>                   # assert eq
    ]
}
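The lookup procedure can also be traced in ordinary code. The following Python sketch is an illustration under stated assumptions, not the Perl 6 implementation. The keymatch and valuematch properties are modelled as optional arguments, the extracted value is assumed to live in a named group called value, and the default key pattern is written as \w+ on the reading that the trailing colon in /\w+:/ only forbids backtracking, which makes no difference for a single greedy atom.

```python
import re

DEFAULT_KEYMATCH = re.compile(r"\w+")  # stands in for /\w+:/ (assumption, see above)

def match_bare_hash(table, text, pos=0, keymatch=None, valuematch=None):
    """Return the end position of a successful match, or None on failure."""
    # Find a key at the current position.
    key_m = (keymatch or DEFAULT_KEYMATCH).match(text, pos)
    if key_m is None:
        return None
    key = key_m.group(0)
    # The key must have an entry in the hash.
    if key not in table:
        return None
    # Without a valuematch, matching the key is enough.
    if valuematch is None:
        return key_m.end()
    # Otherwise extract a value from the text and compare it
    # with the value stored under the matched key.
    val_m = valuematch.match(text, key_m.end())
    if val_m is None:
        return None
    if val_m.group("value") != table[key]:
        return None
    return val_m.end()

colors = {"color": "red"}
arrow_value = re.compile(r"\s*=>\s*'(?P<value>[^']*)'")  # hypothetical valuematch

print(match_bare_hash(colors, "color => 'red'"))
print(match_bare_hash(colors, "color => 'red'", valuematch=arrow_value))
print(match_bare_hash(colors, "color => 'blue'", valuematch=arrow_value))
print(match_bare_hash(colors, "size => 'red'", valuematch=arrow_value))
```

Returning the end position instead of a boolean lets a caller continue matching from where the hash match stopped.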
A typical valuematch might look like:
rule {
    \s* =\> \s*                 # match =>
    $q:=(<["']>)                # match initial quote
    $0:=( [ \\. | . ]*? )       # return matched value
    $q                          # match trailing quote
}
In essence, the presence or absence of the valuematch property
controls whether the hash tries to match only keys, or both keys
and values.
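For comparison, a valuematch of that shape could be approximated with Python's re module. This is a loose translation and not an exact equivalent: the named groups q and value are choices made for this sketch, and the backreference (?P=q) plays the role of the second $q.

```python
import re

VALUEMATCH = re.compile(
    r"""
    \s* => \s*                 # match =>
    (?P<q>["'])                # match initial quote
    (?P<value>(?:\\.|.)*?)     # shortest run of escaped or plain characters
    (?P=q)                     # match the same quote again
    """,
    re.VERBOSE,
)

def extract_value(text, pos=0):
    """Return the quoted value found at pos, or None if nothing can be extracted."""
    m = VALUEMATCH.match(text, pos)
    return None if m is None else m.group("value")

print(extract_value(" => 'red'"))
print(extract_value(' => "dark red"'))
print(extract_value(" => 'unterminated"))
```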
A hash may be used inside angles as well. In that case, it finds the key by the same method (steps 1 and 2 above), but always treats the corresponding hash value as a regex (regardless of any properties the hash might have). The parse then continues according to the rule found in the hash. For example, we could parse a set of control structures with:
rule { <%controls> }
The %controls hash can have keys like "if" and "while" in it. The
corresponding entry says how to parse the rest of an if or a
while statement. For example:
%controls = (
    if     => / <condition> <closure> /,
    unless => / <condition> <closure> /,
    while  => / <condition> <closure> /,
    until  => / <condition> <closure> /,
    for    => / <list_expr> <closure> /,
    loop   => / <loop_controls>? <closure> /,
);
So saying:
<%controls>
is really much as if we'd said:
[ if     \b <%controls{if}>
| unless \b <%controls{unless}>
| while  \b <%controls{while}>
| until  \b <%controls{until}>
| for    \b <%controls{for}>
| loop   \b <%controls{loop}>
]
Only it actually works more like
/ $k := <{ %controls.prop{keymatch} // /\w+:/ }> <%controls{$k}> /
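The dispatch behaviour of a hash in angles can be sketched in Python as a table of sub-patterns keyed by keyword. Everything here is illustrative. CONDITION and CLOSURE are toy stand-ins for the <condition> and <closure> rules, which the text does not define, and the default key pattern \w+ is assumed to behave like /\w+:/ at a fixed starting position.

```python
import re

KEYMATCH = re.compile(r"\w+")      # stands in for the default /\w+:/

# Hypothetical, heavily simplified stand-ins for the real grammar rules.
CONDITION = r"\s*\([^()]*\)"
CLOSURE = r"\s*\{[^{}]*\}"

CONTROLS = {
    "if": re.compile(CONDITION + CLOSURE),
    "while": re.compile(CONDITION + CLOSURE),
    "loop": re.compile(CLOSURE),
}

def match_hash_rule(rules, text, pos=0, keymatch=KEYMATCH):
    """Match a key, then continue with the rule stored under that key."""
    key_m = keymatch.match(text, pos)
    if key_m is None:
        return None                      # no key at this position
    rule = rules.get(key_m.group(0))
    if rule is None:
        return None                      # no entry for that key
    rest = rule.match(text, key_m.end())
    return None if rest is None else rest.end()

print(match_hash_rule(CONTROLS, "if (x) { y }"))
print(match_hash_rule(CONTROLS, "loop { }"))
print(match_hash_rule(CONTROLS, "iffy (x) { y }"))
```

Because the key is matched once and then looked up, adding a new control structure only means adding a table entry, which is the practical difference from spelling out the alternation by hand.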
Note that in Perl 6 it's perfectly valid to use // inside an expression embedded in a regex
delimited by slashes. That's because a regex is no longer considered
a string, so we don't have to find the end of it before we parse it.
Since we can parse it in one pass, the expression parser can handle
the // when it gets to it without worrying about the outer slash, and
the final slash is recognized as the terminator by the regex parser
without having to worry about anything the expression parser saw.
A bare subroutine call may be used in a regex, provided it starts with
& and uses parentheses around the arguments. The return value of the
subroutine is matched literally. The subroutine may have side effects,
and may throw an exception to fail.
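As a rough Python analogy, the behaviour described for a bare subroutine call might look like the sketch below. The helper match_sub_call and the two sample subroutines are invented for illustration. The sketch assumes "matched literally" means a plain prefix comparison at the current position and that any exception simply makes the match fail.

```python
def match_sub_call(sub, args, text, pos=0):
    """Call sub(*args) and match its return value literally at pos.

    Returns the end position on success, or None on failure.
    """
    try:
        literal = str(sub(*args))
    except Exception:
        return None                      # an exception means the match fails
    if text.startswith(literal, pos):
        return pos + len(literal)
    return None

def greeting(name):
    # Hypothetical subroutine whose return value is matched literally.
    return "hi " + name

def refuse(name):
    # Hypothetical subroutine that fails by raising.
    raise ValueError("no match for " + name)

print(match_sub_call(greeting, ("bob",), "hi bob!"))
print(match_sub_call(greeting, ("bob",), "hi b.b!"))
print(match_sub_call(refuse, ("bob",), "hi bob!"))
```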

