Exegesis 7
Editor's note: this document is out of date and remains here for historic interest. See Synopsis 7 for the current design information.
What a piece of work is Perl 6!
How noble in reason!
How infinite in faculty!
In form how express and admirable!
– W. Shakespeare, "Hamlet" (Perl 6 revision)
Formats are Perl 5's mechanism for creating text templates with fixed-width fields. Those fields are then filled in using values from prespecified package variables. They're a useful tool for generating many types of plaintext reports – the r in Perl, if you will.
Unlike Perl 5, Perl 6 doesn't have a format keyword. Or the
associated built-in formatting mechanism. Instead it has a Form.pm
module. And a form function.
Like a Perl 5 format statement, the form function takes a series
of format (or "picture") strings, each of which is immediately
followed by a suitable set of replacement values. It interpolates
those values into the placeholders specified within each picture string,
and returns the result.
The general idea is the same as for Perl's two other built-in string
formatting functions: sprintf and pack. The first argument
represents a template with N placeholders to be filled in, and the
next N arguments are the data that is to be formatted and
interpolated into those placeholders:
$text = sprintf $format_s, $datum1, $datum2, $datum3;
$text = pack $format_p, $datum1, $datum2, $datum3;
$text = form $format_f, $datum1, $datum2, $datum3;
Of course, these three functions use quite different mini-languages to specify the templates they fill in, and all three fill in those templates in quite distinct ways.
Apart from those differences in semantics, form has a syntactic
difference too. With form, after the first N data arguments we're
allowed to put a second format string and its corresponding data, then a
third format and data, and so on:
$text = form $format_f1, $datum1, $datum2, $datum3, $format_f2, $datum4, $format_f3, $datum5;
And if we prettify that function call a little, it becomes obvious that it has
the same basic structure as a Perl 5 format:
form
    $format_f1,
        $datum1, $datum2, $datum3,
    $format_f2,
        $datum4,
    $format_f3,
        $datum5;
But the Perl 6 version is implemented as a vanilla Perl 6 subroutine,
rather than hard-coded into the language with a special keyword and
declaration syntax. In this respect it's rather like Perl 5's
little-known formline function – only much, much better.
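The "picture, then its data, then the next picture" calling convention can be modelled in a few lines. The following is only a rough Python sketch, not how Form.pm is implemented: the name group_form_args and the rule "a field is anything between a pair of braces" are assumptions made for illustration.

```python
import re

# Assumption: a field is any {...} group in a picture string.
FIELD = re.compile(r"\{[^{}]*\}")

def group_form_args(args):
    """Split a flat argument list into (picture, data) groups."""
    groups = []
    i = 0
    while i < len(args):
        picture = args[i]
        n_fields = len(FIELD.findall(picture))
        data = list(args[i + 1 : i + 1 + n_fields])
        if len(data) < n_fields:
            raise ValueError(f"picture {picture!r} needs {n_fields} data arguments")
        groups.append((picture, data))
        i += 1 + n_fields          # skip past this picture and its data
    return groups

groups = group_form_args([
    "| {<<<<<<} | {||||||||} | {>>>>>>>} |", "Richard", 33, "000003",
    "| COMMENTS |",
    "| {[[[[[[[[[[} |", "Talks to self.",
])
for picture, data in groups:
    print(picture, "<-", data)
```

Each picture claims exactly as many of the following arguments as it has fields, which is why a picture with no fields (such as a heading line) is simply followed by the next picture.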
So, whereas in Perl 5 we might write:
# Perl 5 code...
our ($name, $age, $ID, $comments);
format STDOUT =
 ===================================
| NAME     |    AGE     | ID NUMBER |
|----------+------------+-----------|
| @<<<<<<< | @||||||||| | @>>>>>>>> |
  $name,     $age,        $ID,
|===================================|
| COMMENTS                          |
|-----------------------------------|
| ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |~~
  $comments,
 ===================================
.
write STDOUT;
in Perl 6 we could write:
print form
    " =================================== ",
    "| NAME     |    AGE     | ID NUMBER |",
    "|----------+------------+-----------|",
    "| {<<<<<<} | {||||||||} | {>>>>>>>} |",
       $name,     $age,        $ID,
    "|===================================|",
    "| COMMENTS                          |",
    "|-----------------------------------|",
    "| {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[} |",
       $comments,
    " =================================== ";
And both of them would print something like:
 ===================================
| NAME     |    AGE     | ID NUMBER |
|----------+------------+-----------|
| Richard  |     33     |    000003 |
|===================================|
| COMMENTS                          |
|-----------------------------------|
| Talks to self. Seems to be        |
| overcompensating for inferiority  |
| complex rooted in post-natal      |
| materal rejection due to physical |
| handicap (congenital or perhaps   |
| the result of premature birth).   |
| Shows numerous indications of     |
| psychotic (esp. nepocidal)        |
| tendencies. Naturally, subject    |
| gravitated to career in politics. |
 ===================================
At first glance the Perl 6 version may seem like something of a backwards step – all those extra quotation marks and commas that the Perl 5 format didn't require. But the new formatting interface does have several distinct advantages:
[…] '@', '^', '~', and '.' in formats, leaving only '{' as special;
[…] eval);
[…] form to be nested;
[…] form;
[…] write function – and hence frees up write to be used as the true opposite of read, should Larry so desire.
Of course, this is Perl, not Puritanism. So those folks who happen to like package variables, global accumulators, and mysterious writes, can still have them. And, if they're particularly nostalgic, they can also get rid of all the quotation marks and commas, and even retain the dot as a format terminator. For example:
sub myster_rite {
our ($name, $age, $ID, $comments);
print form :interleave, <<'.'
 ===================================
| NAME     |    AGE     | ID NUMBER |
|----------+------------+-----------|
| {<<<<<<} | {||||||||} | {>>>>>>>} |
|===================================|
| COMMENTS                          |
|-----------------------------------|
| {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[} |
 ===================================
.
$name, $age, $ID,
$comments;
}
# and elsewhere in the same package...
($name, $age, $ID, $comments) = get_data();
myster_rite();
($name, $age, $ID, $comments) = get_more_data();
myster_rite();
Let's take a look...
But before we do, here's a quick run-down of some of the highly arcane technical jargon we'll be using as we talk about formatting:
form returns.
by Damian Conway
Unlike sprintf and pack, the form subroutine isn't built into Perl 6.
It's just a regular subroutine, defined in the Form.pm module:
module Form
{
    type FormArgs ::= Str|Array|Pair;

    sub form (FormArgs *@args is context(Scalar)) returns Str
        is exported
    {
        ...
    }
    ...
}
That means that if we want to use form we need to be sure we:
use Form;
first.
Note that the above definition of form specifies that the subroutine takes
a list of arguments (*@args), each of which must be a string, array
or pair (type FormArgs ::= Str|Array|Pair). And the
is context(Scalar) trait specifies that each of those arguments will be
evaluated in a scalar context.
That last bit is important, because normally a "slurpy" array parameter like
*@args would impose a list context on the corresponding arguments. We don't
want that here, mainly because we're going to want to be able to pass arrays to form without
having them flattened.
Like all Perl subroutines, form can be called in a variety of contexts.
When called in a scalar or list context, form returns a string
containing the complete formatted text:
my $formatted_text = form $format, *@data;
@texts = ( form($format, *@data1), form($format, *@data2) ); # 2 elems
When called in a void context, form waxes lyrical about human
frailty, betrayal of trust, and the pointlessness of calling out
when nobody's there to heed the reply, before dying in a highly
theatrical manner.
The format strings passed to form determine what the resulting
formatted text looks like. Each format consists of a series
of field specifiers, which are usually separated by literal characters.
form understands a far larger number of field specifiers than format did,
but they're easy to remember because they obey a small number of conventions:
Angle brackets (< or >), bars
(|), and single-quotes (') indicate various types of single-line fields.
Square brackets ([ or ]), I's (I), and double-quotes
(") indicate block fields of various types.
The direction of the brackets indicates the direction in which the text is justified:
{<<<<<<<<<<<} Justify the text to the left
{>>>>>>>>>>>} Justify the text to the right
{>>>>>><<<<<} Centre the text
{<<<<<<>>>>>} Fully justify the text to both margins
This is even true for numeric fields, which look like:
{>>>>>.<<}. The whole digits are right-justified before
the dot and the decimals are left-justified after it.
An = at either end of a field (or both ends) indicates the data
interpolated into the field is to be vertically "middled" within the
resulting block. That is, the text is to be centred vertically on the
middle of all the lines produced by the complete format.
An _ at the start and/or end of a field indicates the interpolated data
is to be vertically "bottomed" within the resulting block. That is, the
text is to be pushed to the bottom of the lines produced by the format.
That may still seem like quite a lot to remember, but the rules have
been chosen so that the resulting fields are visually mnemonic. In other
words, they're supposed to look like what they do. The intention is that
we simply draw a (stylized) picture of how we want the finished text to
look, using fields that look something like the finished product
– left or right brackets showing horizontal alignments,
a middlish = or bottomed-out _ indicating middled or bottomed vertical
alignment, etc., etc. Then form fits our data into the fields so it
looks right.
The typical field specifications used in a form format look like this:
                              Field specifier
Field type                One-line          Block
==========                ==========        ==========
left justified            {<<<<<<<<}        {[[[[[[[[}
right justified           {>>>>>>>>}        {]]]]]]]]}
centred                   {>>>><<<<}        {]]]][[[[}
centred (alternative)     {||||||||}        {IIIIIIII}
fully justified           {<<<<>>>>}        {[[[[]]]]}
verbatim                  {''''''''}        {""""""""}
numeric                   {>>>>>.<<}        {]]]]].[[}
euronumeric               {>>>>>,<<}        {]]]]],[[}
comma'd                   {>,>>>,>>>.<<}    {],]]],]]].[[}
space'd                   {> >>> >>>.<<}    {] ]]] ]]].[[}
eurocomma'd               {>.>>>.>>>,<<}    {].]]].]]],[[}
Swiss Army comma'd        {>'>>>'>>>,<<}    {]']]]']]],[[}
subcontinental            {>>,>>,>>>.<<}    {]],]],]]].[[}
signed numeric            {->>>.<<<}        {-]]].[[[}
post-signed numeric       {>>>>.<<-}        {]]]].[[-}
paren-signed numeric      {(>>>.<<)}        {(]]].[[)}
prefix currency           {$>>>.<<<}        {$]]].[[[}
postfix currency          {>>>.<<<DM}       {]]].[[[DM}
infix currency            {>>>$<< Esc}      {]]]$[[ Esc}
left/middled              {=<<<<<<=}        {=[[[[[[=}
right/middled             {=>>>>>>=}        {=]]]]]]=}
infix currency/middled    {=>>$<< Esc}      {=]]$[[ Esc}
eurocomma'd/middled       {>.>>>.>>>,<<=}   {].]]].]]],[[=}
etc.
left/bottomed             {_<<<<<<_}        {_[[[[[[_}
right/bottomed            {_>>>>>>_}        {_]]]]]]_}
etc.
When data is interpolated into a line field, the field grabs as much of the data as will fit on a single line, formats that data appropriately, and interpolates it into the format.
That means that if we use a one-line field, it only shows as much of the data as will fit on one line. For example:
my $data1 = 'By the pricking of my thumbs, something wicked this way comes';
my $data2 = 'A horse! A horse! My kingdom for a horse!';
print form
"...{<<<<<<<<<<<<<<<<<}...{>>>>>>>}...",
$data1, $data2;
prints:
...By the pricking of ... A horse!...
On the other hand, if our format string used block fields instead, the fields would extract one line of data at a time, repeating that process as many times as necessary to display all the available data. So:
print form
"...{[[[[[[[[[[[[[[[[[}...{]]]]]]]}...",
$data1, $data2;
would produce:
...By the pricking of ... A horse!...
...my thumbs,         ... A horse!...
...something wicked   ...       My...
...this way comes     ...  kingdom...
...                   ...    for a...
...                   ...   horse!...
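The difference between the two kinds of field comes down to how many times a line's worth of text is extracted. Here is a minimal Python model of that idea, using the standard textwrap module as a stand-in for form's own line-breaking. The names line_field and block_field, and the "<" / ">" alignment flags, are invented for this sketch.

```python
import textwrap

def line_field(text, width, align="<"):
    """One-line field: keep only the words that fit on a single line."""
    lines = textwrap.wrap(text, width)
    first = lines[0] if lines else ""
    return first.ljust(width) if align == "<" else first.rjust(width)

def block_field(text, width, align="<"):
    """Block field: keep extracting lines until the data is used up."""
    lines = textwrap.wrap(text, width)
    return [l.ljust(width) if align == "<" else l.rjust(width) for l in lines]

data1 = "By the pricking of my thumbs, something wicked this way comes"
data2 = "A horse! A horse! My kingdom for a horse!"

print(repr(line_field(data1, 19)))           # a single padded line
for line in block_field(data2, 9, ">"):      # one padded line per extraction
    print(repr(line))
```

The widths 19 and 9 are the full widths of the two fields in the example above, braces included.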
We can mix line fields and block fields in the same format and form will
extract and interpolate only as much data as each field requires. For example:
print form
"...{<<<<<<<<<<<<<<<<<}...{]]]]]]]}...",
$data1, $data2;
which produces:
...By the pricking of ... A horse!...
...                   ... A horse!...
...                   ...       My...
...                   ...  kingdom...
...                   ...    for a...
...                   ...   horse!...
Notice that, after the first line, the single-line
{<<<<<<} field is simply replaced by
the appropriate number of space
characters, to keep the columns correctly aligned.
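One way to picture the mixing rule: every field produces a column of lines, a line field's column is cut off after its first line, and short columns are padded with blanks so the literal text still lines up. The Python sketch below models just that. The render function and its (text, width, kind, align) tuples are assumptions for illustration, and the "..." separators are hard-coded to match the example.

```python
from itertools import zip_longest
import textwrap

def render(columns):
    """columns: list of (text, width, kind, align); kind is 'line' or 'block'."""
    cells = []
    for text, width, kind, align in columns:
        lines = textwrap.wrap(text, width) or [""]
        if kind == "line":
            lines = lines[:1]                  # a line field stops after one line
        pad = str.ljust if align == "<" else str.rjust
        cells.append([pad(l, width) for l in lines])
    widths = [w for _, w, _, _ in columns]
    rows = []
    for row in zip_longest(*cells):            # exhausted columns yield None
        filled = [c if c is not None else " " * w for c, w in zip(row, widths)]
        rows.append("..." + "...".join(filled) + "...")
    return rows

rows = render([
    ("By the pricking of my thumbs, something wicked this way comes", 19, "line", "<"),
    ("A horse! A horse! My kingdom for a horse!", 9, "block", ">"),
])
print("\n".join(rows))
```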
The usual reason for mixing line and block fields in this way is to allow numbered or bulleted points:
print "I couldn't do my English Lit homework because...\n\n";
for @reasons.kv -> $index, $reason {
my $n = @reasons - $index ~ '.';
print form " {>} {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[}",
$n, $reason,
"";
}
which might produce:
I couldn't do my English Lit homework because...
 10. Three witches told me I was going to be
     king.

  9. I was busy explaining wherefore am I Romeo.

  8. I was busy scrubbing the blood off my
     hands.

  7. Some dear friends had to charge once more
     unto the breach.

  6. My so-called best friend tricked me into
     killing my wife.

  5. My so-called best friend tricked me into
     killing Caesar.

  4. My so-called best friend tricked me into
     taming a shrew.

  3. My uncle killed my father and married my
     mother.

  2. I fell in love with my manservant, who was
     actually the disguised twin sister of the
     man that my former love secretly married,
     having mistaken him for my manservant who
     was wooing her on my behalf whilst secretly
     in love with me.

  1. I was abducted by fairies.
Obviously, as a call to form builds up each line of its output
– extracting data from one or more data arguments and
formatting it into the corresponding fields – it needs to keep
track of where it's up to in each datum. It does this by progressively
updating the .pos of each datum, in exactly the same way as a
pattern match does.
And as with a pattern match, by default that updated .pos is only
used internally and not preserved after the call to form is
finished. So passing a string to form doesn't interfere with any
other pattern matching or text formatting that we might
subsequently do with that data.
However, sometimes we do want to know how much of our data a call to form
managed to extract and format. Or we may want to split a formatting task
into several stages, with separate calls to form for each stage.
So we need a way of telling form to preserve the .pos information
in our data.
But, if we want to apply a series of form calls to the same data we also
need to be able to tell form to respect the .pos information
of that data – to start extracting from the previously preserved
.pos position, rather than from the start of the string.
To achieve both those goals, we use a follow-on field. That is we use
an ordinary field but mark it as .pos-sensitive with a special
notation: Unicode ellipses or ASCII colons at either end. So instead of
{<<<<>>>>}, we'd write
{…<<<>>>…}
or {:<<<>>>:}.
Note that each ellipsis is a single, one-column wide Unicode HORIZONTAL
ELLIPSIS character (\c[2026]), not three separate dots. The
connotation of the ellipses is "...then keep on formatting from where
you previously left off, remembering there's probably still more to
come...". And the colons are the ASCII symbol most like a single
character ellipsis (try tilting your head and squinting).
Follow-on fields are most useful when we want to split a formatting task into distinct stages – or iterations – but still allow the contents of the follow-on field to flow uninterrupted from line to line. For example:
print "The best Shakespearean roles are:\n\n";
for @roles -> $role {
print form " * {<<<<<<<<<<<<<<<<<<<<<<<<<<<<} *{…<<<<<<<>>>>>>>…}*",
$role, $disclaimer;
}
which produces:
The best Shakespearean roles are:
* Macbeth *WARNING: *
* King Lear *This list of roles*
* Juliet *constitutes a*
* Othello *personal opinion*
* Hippolyta *only and is in no*
* Don John *way endorsed by*
* Katerina *Shakespeare'R'Us. *
* Richard *It may contain*
* Malvolio *nuts. *
* Bottom * *
The multiple calls to form manage to produce a coherent disclaimer
because the ellipses in the second field tell each call to start
extracting data from $disclaimer at the offset indicated by
$disclaimer.pos, and then to update $disclaimer.pos with
the final position at which the field extracted data. So the next time
form is called, the follow-on field starts extracting from
where it left off in the previous call.
Follow-on fields are similar to ^<<<<< fields in a Perl 5 format,
except they don't destroy the contents of a data source; they merely change that
data source's .pos marker.
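The bookkeeping behind a follow-on field is easy to model: each call starts at the remembered position, extracts one line's worth of words, and stores the new position for next time. The Python sketch below assumes a small Source class standing in for a string with a .pos, and a follow_on_line helper. Both names are invented here, and the sketch left-justifies rather than fully justifying, to keep it short.

```python
class Source:
    """A string plus a remembered position (a stand-in for the datum's .pos)."""
    def __init__(self, text):
        self.text = text
        self.pos = 0

def follow_on_line(src, width):
    """Extract one line of whole words starting at src.pos, then advance src.pos."""
    text, pos, words, used = src.text, src.pos, [], 0
    while True:
        start = pos
        while start < len(text) and text[start].isspace():
            start += 1                          # skip leading whitespace
        end = start
        while end < len(text) and not text[end].isspace():
            end += 1                            # find the end of the next word
        if start == end:                        # nothing left but whitespace
            pos = len(text)
            break
        word = text[start:end]
        needed = used + (1 if words else 0) + len(word)
        if needed > width:
            if not words:                       # a single over-long word: cut it
                words.append(word[:width])
                pos = start + width
            break
        words.append(word)
        used = needed
        pos = end
    src.pos = pos                               # preserved for the next call
    return " ".join(words).ljust(width)

disclaimer = Source("This list of roles constitutes a personal opinion only")
first = follow_on_line(disclaimer, 18)
second = follow_on_line(disclaimer, 18)
print(repr(first), repr(second), disclaimer.pos)
```

Because the position lives with the data rather than with the call, the original string is never modified, which is the contrast with Perl 5's ^<<<<< fields drawn above.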
Data, especially numeric data, is often stored in arrays.
So form also accepts arrays as data arguments. It can
do so because its parameter list is defined as:
sub form (Str|Array|Pair *@args is context(Scalar)) {...}
which means that although its arguments may include one or more arrays, each such array argument is nevertheless evaluated in a scalar context. Which, in Perl 6, produces an array reference.
In other words, array arguments don't get flattened automatically, so
form doesn't lose track of where in
the argument list one array finishes and the next begins.
Once inside form, each array that was specified as the data source
for a field is internally converted to a single string by joining it
together with a newline between each element.
The upshot is that, instead of:
print "The best Shakespearean roles are:\n\n";
for @roles -> $role {
print form " * {<<<<<<<<<<<<<<<<<<<<<<<<<<<<} *{…<<<<<<<>>>>>>>…}*",
$role, $disclaimer;
}
we could just write:
print "The best Shakespearean roles are:\n\n";
print form " * {[[[[[[[[[[[[[[[[[[[[[[[[[[[[} *{[[[[[[[[]]]]]]]]}*",
@roles, $disclaimer;
And the array of roles would be internally converted to a single string, with one role per line. Note that we also changed the disclaimer field to a regular block field, so that the entire disclaimer would be formatted. And there was no longer any need for the disclaimer field to be a follow-on field, since the block field would extract and format the entire disclaimer anyway.
Note, however, that this block-based approach wouldn't work so well if
one of the elements of @roles was too big to fit on a single line. In
that case we might end up with something like the following:
The best Shakespearean roles are:
* Either of the 'two foolish *WARNING: *
* officers': Dogberry and Verges *This list of roles*
* That dour Scot, the Laird *constitutes a*
* Macbeth *personal opinion*
* The tragic Moor of Venice, *only and is in no*
* Othello *way endorsed by*
* Rosencrantz's good buddy *Shakespeare'R'Us. *
* Guildenstern *It may contain*
* The hideous and malevolent *nuts. *
* Richard III * *
rather than:
The best Shakespearean roles are:
* Either of the 'two foolish *WARNING: *
officers': Dogberry and Verges *This list of roles*
* That dour Scot, the Laird *constitutes a*
Macbeth *personal opinion*
* The tragic Moor of Venice, *only and is in no*
Othello *way endorsed by*
* Rosencrantz's good buddy *Shakespeare'R'Us. *
Guildenstern *It may contain*
* The hideous and malevolent *nuts. *
Richard III * *
That's because the "*" that's being used as a bullet for the first
column is a literal (i.e. mere decoration),
and so it will be repeated on every line that
is formatted, regardless of whether that line is the start of a new
element of @roles or merely the broken-and-wrapped remains of the
previous element. Happily, as we shall see later, this particular
problem has a simple solution.
Despite these minor complications, array data sources are particularly useful when formatting, especially if the data is known to fit within the specified width. For example:
print form
    '-------------------------------------------',
    'Name             Score   Time  | Normalized',
    '-------------------------------------------',
    '{[[[[[[[[[[[[}   {III}   {II}  |  {]]].[[} ',
    @name, @score, @time, [@score »/« @time];
is a very easy way to produce the table:
-------------------------------------------
Name             Score   Time  | Normalized
-------------------------------------------
Thomas Mowbray    88      15   |     5.867
Richard Scroop    54      13   |     4.154
Harry Percy       99      18   |     5.5
Note the use of the Perl6-ish listwise division (»/«)
to produce the array of data for the "Normalized" column.
The most commonly used fields are those that justify their contents: to the left, to the right, to the left and right, or towards the centre.
Left-justified and right-justified fields extract from their data source the largest substring that will fit inside them, push that string to the left or right as appropriate, and then pad the string out to the required field width with spaces (or the nominated fill character).
Centred fields ({>>>><<<<} and {]]]][[[[}) likewise
extract as much data as possible, and then pad both sides of it with
(near) equal numbers of spaces. If the amount of padding required is not
evenly divisible by 2, the one extra space is added after the data.
There is a second syntax for centred fields – a tip-o'-the-hat to
Perl 5 formats: {|||||||||} and {IIIIIIII}. This variant also
makes it easier to specify centering fields that are only three columns
wide: {|} and {I}.
Note, however, that the behaviour of centering fields specified this way is exactly the same in every respect as the bracket-based versions, so we're free to use whichever we prefer.
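The "extra space goes after the data" rule is worth spelling out, because not every language's built-in centring agrees with it (Python's own str.center sometimes puts the odd space on the left). A small Python sketch of the rule as described, with centre as an invented name:

```python
def centre(text, width, fill=" "):
    """Centre text in width columns; any odd leftover column goes on the right."""
    pad = max(width - len(text), 0)
    left = pad // 2
    right = pad - left
    return fill * left + text + fill * right

print(repr(centre("33", 10)))   # even padding: four columns each side
print(repr(centre("88", 5)))    # odd padding: one before, two after
```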
Fully justified fields ({<<<<>>>>} and {[[[[]]]]})
extract a maximal substring and then distribute any padding as evenly as
possible into the existing whitespace gaps in that data. For example:
print form '({<<<<<<<<<>>>>>>>>>>>})',
"A fellow of infinite jest, of most excellent fancy";
would print:
(A fellow of infinite)
A fully-justified block field ({[[[[]]]]}) does the same across
multiple lines, except that the very last line is always left-justified.
Hence, this:
print form '({[[[[[[[[]]]]]]]})',
"All the world's a stage, And all the men and women merely players."
would print:
(All the world's a)
(stage, And all)
(the men and women)
(merely players. )
By the way, with both centred fields ({>>>><<<}) and fully
justified fields ({<<<>>>>}), the actual number of
left vs right arrows is irrelevant, so long as there is at least
one of each.
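Full justification amounts to sharing the spare columns among the gaps between words. The sketch below is one possible scheme in Python, assuming the text already fits within the width. Handing the leftover spaces to the leftmost gaps is an assumption of this sketch, and form itself may distribute them differently.

```python
def justify(text, width):
    """Spread spare columns across the gaps between words (leftmost gaps first)."""
    words = text.split()
    if len(words) < 2:
        return text.ljust(width)
    gaps = len(words) - 1
    spaces = width - sum(len(w) for w in words)
    base, extra = divmod(spaces, gaps)
    out = ""
    for i, word in enumerate(words[:-1]):
        out += word + " " * (base + (1 if i < extra else 0))
    return out + words[-1]

line = justify("A fellow of infinite", 22)
print("(" + line + ")")
```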
One special case we need to consider is an empty set of field delimiters:
form 'ID number: {}'
This specification is treated as a two-column-wide, left-justified block field (since that seems to be the type of two-column-wide field most often required).
Other kinds of two-column (and single-column) fields can also be created using imperative field widths and user-defined fields.
A field specifier of the form {>>>>.<<} or {]]]].[[}
represents a decimal-aligned numeric field. The decimal marker always
appears in exactly the position indicated and the rest of the number is
aligned around it. The decimal places are rounded to the specific number
of places indicated, but only "significant" digits are shown. For example:
@nums = (1, 1.2, 1.23, 11.234, 111.235, 1.0001);
print form "Thy score be: {]]]].[[}",
@nums;
prints:
Thy score be:     1.0
Thy score be:     1.2
Thy score be:     1.23
Thy score be:    11.234
Thy score be:   111.235
Thy score be:     1.000
The points are all aligned, the minimal number of decimal places are
shown, and the decimals are rounded (using the same rounding protocol that
printf employs). Note in particular that, even though both 1 and
1.0001 would normally convert to the same 3-decimal-place value
(1.000), a form call only shows all three zeros in the second case
since only in the second case are they "significant".
In other words, unless we tell it otherwise,
form tries to avoid displaying a
number with more accuracy than it actually possesses (within the
constraint that it must always show at least one decimal place).
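As a rough model of the rules just described, the Python sketch below rounds to the field's precision, keeps all the places only when rounding actually discarded something, and otherwise trims padding zeros down to a single decimal place. The function name numeric_field and its int_cols / dec_cols parameters are assumptions for illustration, and the "was anything discarded?" test is this sketch's reading of "significant", not a documented algorithm.

```python
def numeric_field(value, int_cols, dec_cols, point="."):
    """Align value on the decimal marker within int_cols + 1 + dec_cols columns."""
    fixed = f"{value:.{dec_cols}f}"            # round to the field's precision
    whole, frac = fixed.split(".")
    if float(fixed) == value:                  # nothing was lost by rounding,
        frac = frac.rstrip("0") or "0"         # so drop padding zeros (keep one place)
    return whole.rjust(int_cols) + point + frac.ljust(dec_cols)

for n in (1, 1.2, 1.23, 11.234, 111.235, 1.0001):
    print("Thy score be: " + numeric_field(n, 5, 3))
```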
You're probably wondering what happens if we try to format a number that's too
large for the available places (as 123456.78 would be in the above format).
Whereas sprintf would extend a numeric field to accommodate the number,
form insists on preserving the specified layout; in particular, the
position of the decimal point. But it obviously can't just cut off the
extra high-order digits; that would change the value:
Thy score be: 23456.78
So, instead, it indicates that the number doesn't fit by filling the field with octothorpes (the way many spreadsheets do):
Thy score be: #####.###
Note, however, that it is possible to change this behaviour should we need to.
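The overflow rule can be sketched the same way: if the whole-number part (sign included) needs more columns than the field provides, fill the field with octothorpes instead of truncating or widening. This is a simplified Python illustration with an invented name. It does not trim trailing zeros.

```python
def numeric_or_overflow(value, int_cols, dec_cols):
    """Format value, or fill the field with '#' if the integer part won't fit."""
    whole, frac = f"{value:.{dec_cols}f}".split(".")
    if len(whole) > int_cols:                  # sign counts as a column too
        return "#" * int_cols + "." + "#" * dec_cols
    return whole.rjust(int_cols) + "." + frac

print(numeric_or_overflow(123456.78, 5, 3))
print(numeric_or_overflow(23456.78, 5, 3))
```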
It's also possible that someone (not you, of course!) might attempt to pass a numeric field some data that isn't numeric at all:
my @mixed_data = (1, 2, "three", {4=>5}, "6", "7-Up");
print form 'Thy score be: {]]]].[[}',
@mixed_data;
Unlike Perl itself, form doesn't autoconvert non-numeric values.
Instead it marks them with another special string, by filling the field with
question-marks:
Thy score be:     1.0
Thy score be:     2.0
Thy score be: ?????.???
Thy score be: ?????.???
Thy score be:     6.0
Thy score be: ?????.???
Note that strings per se aren't a problem – form will happily
convert strings that contain valid numbers, such as "6" in the above
example. But it does reject strings that contain anything else besides
a number (even when Perl itself would successfully convert the number
– as it would for "7-Up" above).
Those who'd prefer Perl's usual, more laissez-faire attitude to
numerical conversion can just pre-numerify the values
themselves using the unary numerification operator (shown here in its
list form – +« – since we have an array of
values to be numerified):
print form 'Thy score be: {]]]].[[}',
+« @mixed_data;
This version would print:
Thy score be:     1.0
Thy score be:     2.0
Thy score be:     0.0
Thy score be:     1.0
Thy score be:     6.0
Thy score be:     7.0
(The 1.0 on the fourth line appears because Perl 6 hashes numerify to the
number of entries they contain).
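The two attitudes to conversion can be contrasted directly. In this Python sketch, strict_number plays the role of form's own check (the whole string must be a number) and lax_number imitates the laissez-faire numerification described above (leading digits count, anything else is zero, and a hash counts its entries). Both function names and the regular expressions are assumptions made for the illustration.

```python
import re

NUMBER = r"[-+]?(\d+(\.\d*)?|\.\d+)"
STRICT = re.compile(r"^\s*" + NUMBER + r"\s*$")   # nothing but a number
LAX = re.compile(r"^\s*" + NUMBER)                # a number, then anything

def strict_number(datum):
    """Return a float, or None if the datum is not purely a number."""
    if isinstance(datum, (int, float)):
        return float(datum)
    if isinstance(datum, str) and STRICT.match(datum):
        return float(datum)
    return None

def lax_number(datum):
    """Mimic the permissive conversion: take what digits there are."""
    if isinstance(datum, (int, float)):
        return float(datum)
    if isinstance(datum, dict):
        return float(len(datum))                  # a hash numerifies to its size
    m = LAX.match(str(datum))
    return float(m.group(0)) if m else 0.0

mixed_data = [1, 2, "three", {4: 5}, "6", "7-Up"]
print([strict_number(d) for d in mixed_data])
print([lax_number(d) for d in mixed_data])
```

Wherever strict_number returns None, the field would be filled with question marks.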
Of course, not everyone uses a dot for their decimal point. The other main
contender is the comma, and naturally form supports that as well. If
we specify a numeric field with a comma between the brackets:
@les_nums = (1, 1.2, 1.23, 11.234, 111.235, 1.0001);
print form 'Votre score est: {]]]],[[}',
@les_nums;
the call prints:
Votre score est:     1,0
Votre score est:     1,2
Votre score est:     1,23
Votre score est:    11,234
Votre score est:   111,235
Votre score est:     1,000
In fact, form is extremely flexible about the characters
we're allowed to use as
a decimal marker: anything except an angle- or square bracket or
a plus sign is acceptable.
As a bonus, form allows us to use the specified decimal marker in
the data as well as in the format. So this works too:
@les_nums = ("1", "1,2", "1,23", "11,234", "111,235", "1,0001");
print form 'Votre score est: {]]]],[[}',
@les_nums;
Negative numbers work as expected, with the minus sign taking up one column of the field's allotted span:
@nums = ( 1, -1.2, 1.23, -11.234, 111.235, -12345.67);
print form 'Thy score be: {]]]].[[}',
@nums;
This would print:
Thy score be:     1.0
Thy score be:    -1.2
Thy score be:     1.23
Thy score be:   -11.234
Thy score be:   111.235
Thy score be: #####.###
However, form can also format numbers so that the minus sign trails the
number. To do that we simply put an explicit minus sign inside the field
specification, at the end:
print form 'Thy score be: {]]]].[[-}',
@nums;
which would then print:
Thy score be:     1.0
Thy score be:     1.2-
Thy score be:     1.23
Thy score be:    11.234-
Thy score be:   111.235
Thy score be: 12345.67-
form also understands the common financial usage where negative
numbers are represented as positive numbers in parentheses. Once again,
we draw an abstract picture of what we want (by putting parens at either
end of the field specification):
print form 'Thy dividend be: {(]]]].[[)}',
@nums;
and form obliges:
Thy dividend be:      1.0
Thy dividend be:     (1.2)
Thy dividend be:      1.23
Thy dividend be:    (11.234)
Thy dividend be:    111.235
Thy dividend be: (12345.67)
Note that the parens have to go inside the field's braces. Otherwise, they're just literal parts of the format string:
print form 'Thy dividend be: ({]]]].[[})',
@nums;
and we'd get:
Thy dividend be: (    1.0  )
Thy dividend be: (   -1.2  )
Thy dividend be: (    1.23 )
Thy dividend be: (  -11.234)
Thy dividend be: (  111.235)
Thy dividend be: (#####.###)
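Setting alignment aside, the three negation styles differ only in where the marker goes. A tiny Python sketch, in which the signed function and its style names are invented for illustration:

```python
def signed(value, style="leading"):
    """Render a number with a leading minus, a trailing minus, or parentheses."""
    digits = str(abs(value))
    if value >= 0:
        return digits                   # positive numbers carry no marker
    if style == "leading":
        return "-" + digits
    if style == "trailing":
        return digits + "-"
    if style == "parens":
        return "(" + digits + ")"
    raise ValueError(f"unknown style: {style}")

for n in (1, -1.2, 1.23, -11.234, 111.235, -12345.67):
    print(signed(n), signed(n, "trailing"), signed(n, "parens"))
```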
If we add so-called "thousands separators" inside a numeric field at the
usual places, form includes them appropriately in its output. It can
handle the five major formatting conventions:
my @nums = (0, 1, 1.1, 1.23, 4567.89, 34567.89, 234567.89, 1234567.89);
print form
    "Britannic       Continental     Subcontinental  Tyrolean        Asiatic",
    "_____________   _____________   ______________  _____________   _____________",
    "{],]]],]]].[}   {].]]].]]],[}   {]],]],]]].[}   {]']]]']]],[}   {]]]],]]]].[}",
    @nums, @nums, @nums, @nums, @nums;
to produce:
Britannic       Continental     Subcontinental  Tyrolean        Asiatic
_____________   _____________   ______________  _____________   _____________
         0.0             0,0             0.0             0,0             0.0
         1.0             1,0             1.0             1,0             1.0
         1.1             1,1             1.1             1,1             1.1
         1.23            1,23            1.23            1,23            1.23
     4,567.89        4.567,89        4,567.89        4'567,89         4567.89
    34,567.89       34.567,89       34,567.89       34'567,89       3,4567.89
   234,567.89      234.567,89     2,34,567.89      234'567,89      23,4567.89
 1,234,567.89    1.234.567,89    12,34,567.89    1'234'567,89     123,4567.89
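What distinguishes these conventions is just the separator character and the sizes of the digit groups, counted from the right. The following Python sketch captures that. group_digits and its "the last size repeats" rule are assumptions for illustration, not part of form's interface.

```python
def group_digits(digits, sizes, sep):
    """Group a digit string from the right; the last size in sizes repeats."""
    groups = []
    i = 0
    while digits:
        size = sizes[min(i, len(sizes) - 1)]
        groups.append(digits[-size:])          # peel one group off the right
        digits = digits[:-size]
        i += 1
    return sep.join(reversed(groups))

print(group_digits("1234567", [3], ","))       # groups of three
print(group_digits("1234567", [3, 2], ","))    # three, then twos
print(group_digits("1234567", [4], ","))       # groups of four
```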
It also accepts a space character as a "thousands separator" (with, of course, any decimal marker we might like):
print form
"Hyperspatial",
"_____________",
"{] ]]] ]]]:[}",
@nums;
to produce:
Hyperspatial
_____________
         0:0
         1:0
         1:1
         1:23
     4 567:89
    34 567:89
   234 567:89
 1 234 567:89
Of course, sometimes we don't know ahead of time just where in the world our
formatted numbers will be displayed. Locales were invented to address that
very problem, and form supports them.
If we use the :locale option, form detects the current locale and
converts any numerical formats it finds to the appropriate layout. For
example, if we wrote:
@nums = ( 1, -1.2, 1.23, -11.234, 111.235, -12345.67);
print form
"{],]]],]]].[[}",
@nums;
then we'd get:
         1.0
        -1.2
         1.23
       -11.234
       111.235
   -12,345.67
wherever the program was run. But if we had written:
print form
:locale,
"{],]]],]]].[[}",
@nums;
then we'd get:
         1.0
        -1.2
         1.23
       -11.234
       111.235
   -12,345.67
or:
         1,0
         1,2-
         1,23
        11,23-
       111,235
    12.345,67-
or:
         1,0
        (1,2)
         1,23
       (11,23)
       111,235
   (12'345,67)
or whatever else the current locale indicated was the correct local layout for numbers.
That is, when the :locale option is specified, form ignores the actual
decimal point, thousands separator, and negation sign we specified in the call,
and instead uses the values for these markers that are returned by the
POSIX localeconv function. That means that we can specify our numerical
formatting in a style that seems natural to us, and at the same time
allow the numbers to be formatted in a style that seems natural to the user.
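Python's standard locale module exposes the same POSIX localeconv data, so the substitution idea can be sketched there. The localise helper below is hypothetical: it assumes the number was drawn with "." as the decimal point and "," as the thousands separator, and it swaps in only those two markers, leaving the sign where it is. The continental dictionary is made-up sample data in the same shape as the real localeconv result.

```python
import locale

def localise(number_text, conv):
    """Rewrite a number drawn with '.' and ',' using the given conventions."""
    negative = number_text.startswith("-")
    body = number_text.lstrip("-")
    whole, _, frac = body.partition(".")
    whole = whole.replace(",", conv["thousands_sep"])
    out = whole + (conv["decimal_point"] + frac if frac else "")
    return "-" + out if negative else out

# Made-up conventions, in the same shape as locale.localeconv() returns.
continental = {"decimal_point": ",", "thousands_sep": "."}
print(localise("-12,345.67", continental))

# The conventions of whatever locale the process is currently running in.
print(localise("-12,345.67", locale.localeconv()))
```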
Wait a minute...
Where exactly did we conjure that :locale syntax from?
And what, exactly, did it create? What is an "option"?
Well, we're passing :locale as an argument to form, and form's
signature guarantees us
that it can only accept a Str, or an Array, or a Pair as an
argument. So an "option" must be one of those three types, and that
funky :identifier syntax must be a constructor for the equivalent data
structure.
And indeed, that's the case. An "option" is just a pair, and the
funky :identifier syntax is just another way of writing a pair
constructor.
The standard "option" syntax is:
:key( "value" )
which is identical in effect to:
key => "value"
Both specify an autoquoted key; both associate that key with a value; both evaluate to a pair object that contains the key and value. So why have a second syntax for pairs?
Because it allows us to optimize the pair constructor syntax in two different ways. The now-familiar "fat arrow" pair constructor takes a key and a value, each of which can be of any type. In contrast, the key of an "option" pair constructor can only be an identifier, which is always autoquoted...at compile-time. So, if we use the "option" syntax we're guaranteed that the key of the resulting pair is a string, that the string contains a valid identifier, and that the compiler can check that validity before the program starts.
Moreover, whereas the "fat arrow" has only one syntax, "options" have several highly useful syntactic variations. For example, "fat arrow" pairs can be especially annoying when we want to use them to pass named boolean arguments to a subroutine. For example:
duel( $person1, $person2, to_death=>1, no_quarter=>1, left_handed=>1, bonetti=>1, capoferro=>1 );
In contrast, "options" have a special default behaviour. If we leave off their
parenthesized value entirely, the implied value is 1. So we could rewrite
the preceding function call as:
duel( $person1, $person2, :to_death, :no_quarter, :left_handed, :bonetti, :capoferro );
Better still, when we have a series of options, we don't have to put commas between them:
duel( $person1, $person2, :to_death :no_quarter :left_handed :bonetti :capoferro );
That makes them even more concise and uncluttered, especially in
use statements:
use POSIX :errno_h :fcntl_h :time_h;
There are other handy "option" variants as well, all of which simply substitute the parentheses following their key for some other kind of bracket (and hence some other kind of value). The full list of "option"...err...options is:
Option syntax           Is equivalent to
==================      =============================
:key("some value")      key => "some value"
:key                    key => 1
:key{ a=>1, b=>2 }      key => { a=>1, b=>2 }
:key{ $^arg * 2; }      key => { $^arg * 2; }
:key[ 1, 2, 3, 4 ]      key => [ 1, 2, 3, 4 ]
:key«eat at Joe's»      key => ["eat", "at", "Joe's"]
Despite the deliberate differences in conciseness and flexibility, we can use "options" and "fat arrows" interchangeably in almost every situation where we need to construct a pair (except, of course, where the key needs to be something other than an identifier string, in which case the "fat arrow" is the only alternative). To illustrate that interchangeability, we'll use the "option" syntax throughout most of the rest of this discussion, except where using a "fat arrow" is clearly preferable for code readability.
Meanwhile, back in the fields...
Formatting numbers gets even trickier when those numbers represent money.
But form simply lets us specify how the local currency looks –
including leading, trailing, or infix currency markers; leading, trailing, or
circumfix negation markers; thousands separators; etc. – and then it
formats it that way. For example:
my @amounts = (0, 1, 1.2345, 1234.56, -1234.56, 1234567.89);
my %format = (
"Canadian (English)" => q/ {-$],]]],]]].[}/,
"Canadian (French)" => q/ {-] ]]] ]]],[ $}/,
"Dutch" => q/ {],]]],]]].[-EUR}/,
"German (pre-euro)" => q/ {-].]]].]]],[DM}/,
"Indian" => q/ {-]],]],]]].[ Rs}/,
"Norwegian" => q/ {kr -].]]].]]],[}/,
"Portuguese (pre-euro)" => q/ {-].]]].]]]$[ Esc}/,
"Swiss" => q/{Sfr -]']]]']]].[}/,
);
for %format.kv -> $nationality, $layout {
print form "$nationality:",
" $layout",
@amounts,
"\n";
}
produces:
Swiss:
Sfr 0.0
Sfr 1.0
Sfr 1.23
Sfr 1'234.56
Sfr -1'234.56
Sfr 1'234'567.89
Canadian (French):
0,0 $
1,0 $
1,23 $
1 234,56 $
-1 234,56 $
1 234 567,89 $
Dutch:
0.0EUR
1.0EUR
1.23EUR
1,234.56EUR
1,234.56-EUR
1,234,567.89EUR
Norwegian:
kr 0,0
kr 1,0
kr 1,23
kr 1.234,56
kr -1.234,56
kr 1.234.567,89
German (pre-euro):
0,0DM
1,0DM
1,23DM
1.234,56DM
-1.234,56DM
1.234.567,89DM
Indian:
0.0 Rs
1.0 Rs
1.23 Rs
1,234.56 Rs
-1,234.56 Rs
12,34,567.89 Rs
Portuguese (pre-euro):
0$0 Esc
1$0 Esc
1$23 Esc
1.234$56 Esc
-1.234$56 Esc
1.234.567$89 Esc
Canadian (English):
$0.0
$1.0
$1.23
$1,234.56
-$1,234.56
$1,234,567.89
Nice, eh?
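To get a feel for what those currency pictures are asking form to do, here is a minimal Python sketch of the same idea. It is not Form.pm: the function name format_money and all of its parameters are hypothetical, it always shows two decimal places, it always groups digits in threes (so it can't do the Indian layout), and it always puts the minus sign first.

```python
def format_money(amount, thousands=",", decimal=".", prefix="", suffix=""):
    """Render a number with locale-style separators and currency markers.

    Hypothetical helper for illustration only; form's picture strings
    express the same choices declaratively.
    """
    sign = "-" if amount < 0 else ""
    whole, cents = f"{abs(amount):.2f}".split(".")

    # Peel off groups of three digits from the right-hand end.
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])
        whole = whole[:-3]
    groups.insert(0, whole)

    return f"{sign}{prefix}{thousands.join(groups)}{decimal}{cents}{suffix}"


print(format_money(-1234.56, prefix="$"))
print(format_money(1234567.89, thousands=".", decimal=",", suffix="DM"))
```

Every variation that a picture string captures in one line (where the sign goes, which separators to use, where the currency marker sits) needs its own parameter here, which is a fair measure of how much work the picture is doing for us.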
But sometimes too nice. Sometimes all we want is an existing block of data laid out into columns – without any fancy reformatting or rejustification. For example, suppose we have an interesting string like this:
$diagram = <<EODNA;
G==C
A==T
T=A
A=T
T==A
G===C
T==A
C=G
TA
AT
A=T
T==A
G===C
T==A
EODNA
and we'd like to put it beside some other text. Because it's already carefully formatted, we really don't want to interpolate it into a left-justified field:
print form
'{[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]} {[[[[[[[[[[[[[[[}',
$diatribe, $diagram;
Because that would squash our lovely helix:
Men at some time are masters of their G==C
fates: / the fault, dear Brutus, is not in A==T
our genes, / but in ourselves, that we are T=A
underlings. / Brutus and Caesar: what A=T
should be in that 'Caesar'? / Why should T==A
that DNA be sequenced more than yours? / G===C
Extract them together, yours is as fair a T==A
genome; / transcribe them, it doth become C=G
mRNA as well; / recombine them, it is as TA
long; clone with 'em, / Brutus will start a AT
twin as soon as Caesar. / Now, in the names A=T
of all the gods at once, / upon what T==A
proteins doth our Caesar feed, / that he is G===C
grown so great? T==A
Nor would right-, full-, centre- or numeric- justification help in this instance. What we really need is "leave-it-the-hell-alone" justification – a field specifier that lays out the data exactly as it is, leading whitespace included.
And that's the purpose of a verbatim field. A verbatim single-line field
({'''''''''}) grabs the next line of data it's offered and inserts as
much of it as will fit in the field's width, preserving whitespace "as
is". Likewise a verbatim block field ({"""""""""}) grabs every line
of the data it's offered and interpolates it into the text without any
reformatting or justification.
And that's precisely what we needed for our diagram:
print form
'{[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]} {"""""""""""""""}',
$diatribe, $diagram;
to produce:
Men at some time are masters of their G==C
fates: / the fault, dear Brutus, is not in A==T
our genes, / but in ourselves, that we are T=A
underlings. / Brutus and Caesar: what A=T
should be in that 'Caesar'? / Why should T==A
that DNA be sequenced more than yours? / G===C
Extract them together, yours is as fair a T==A
genome; / transcribe them, it doth become C=G
mRNA as well; / recombine them, it is as TA
long; clone with 'em, / Brutus will start a AT
twin as soon as Caesar. / Now, in the names A=T
of all the gods at once, / upon what T==A
proteins doth our Caesar feed, / that he is G===C
grown so great? T==A
Note that, unlike other types of fields, verbatim fields don't break and wrap their data if that data doesn't fit on a single line. Instead, they truncate each line to the appropriate field width. So a too-short verbatim field:
print form
'{[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]} {""""""}',
$diatribe, $diagram;
results in gene slicing:
Men at some time are masters of their G==C
fates: / the fault, dear Brutus, is not in A==
our genes, / but in ourselves, that we are T
underlings. / Brutus and Caesar: what A
should be in that 'Caesar'? / Why should T==
that DNA be sequenced more than yours? / G===C
Extract them together, yours is as fair a T==A
genome; / transcribe them, it doth become C=G
mRNA as well; / recombine them, it is as TA
long; clone with 'em, / Brutus will start a AT
twin as soon as Caesar. / Now, in the names A=T
of all the gods at once, / upon what T==A
proteins doth our Caesar feed, / that he is G===
grown so great? T=
rather than teratogenesis:
Men at some time are masters of their G==C
fates: / the fault, dear Brutus, is not in A=-
our genes, / but in ourselves, that we are =T
underlings. / Brutus and Caesar: what -
should be in that 'Caesar'? / Why should T=A
that DNA be sequenced more than yours? / -
Extract them together, yours is as fair a A=T
genome; / transcribe them, it doth become T=-
mRNA as well; / recombine them, it is as =A
long; clone with 'em, / Brutus will start a G===C
twin as soon as Caesar. / Now, in the names T==A
of all the gods at once, / upon what C=G
proteins doth our Caesar feed, / that he is TA
grown so great? AT
A=T
T==A
G==-
=C
T-
==A
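The truncate-rather-than-wrap rule is easy to model. The following Python sketch is only an illustration of that rule, not the Form.pm implementation; the name verbatim_block is made up for the example, and it assumes every character is one column wide.

```python
def verbatim_block(text, width):
    """Lay out text line by line, preserving leading whitespace.

    Each line is cut off at `width` columns (never wrapped) and padded
    on the right so that every output line is exactly `width` wide.
    """
    return [line[:width].ljust(width) for line in text.splitlines()]


diagram = "   G==C\n     A==T\n       T=A\n"
for row in verbatim_block(diagram, 8):
    print("|" + row + "|")
```

Because each input line maps to exactly one output line, the diagram keeps its shape and its line count no matter how narrow the field is; only the right-hand edge is lost.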
It's not uncommon for a report to need a series of data fields in one column and then a second column with only a single field, perhaps containing a summary or discussion of the other data. For example, we might want to produce recipes of the form:
=================[ Hecate's Broth of Ambition ]=================
Preparation time: Method:
66.6 minutes Remove the legs from the
lizard, the wings from the
Serves: owlet, and the tongue of the
2 doomed souls adder. Set them aside.
Refrigerate the remains (they
Ingredients: can be used to make a lovely
2 snakes (1 fenny, 1 white-meat stock). Drain the
adder) newts' eyes if using pickled.
2 lizards (1 legless, Wrap the toad toes in the
1 regular) bat's wool and immerse in half
3 eyes of newt (fresh a pint of vegan stock in
or pickled) bottom of a preheated
2 toad toes (canned cauldron. (If you can't get a
are fine) fresh vegan for the stock, a
2 cups of bat's wool cup of boiling water poured
1 dog tongue over a vegetarian holding a
1 common or spotted sprouted onion will do). Toss
owlet in the fenny snake, then the
legless lizard. Puree the
tongues together and fold
gradually into the mixture,
stirring widdershins at all
times. Allow to bubble for 45
minutes then decant into two
tarnished copper chalices.
Garnish each with an owlet
wing, and serve immediately.
There are several ways to achieve that effect. The most obvious is to format each column separately and then lay them out side-by-side with a pair of verbatim fields:
my $prep = form 'Preparation time: ',
' {<<<<<<<<<<<<<<<<<<<<}', $prep_time,
' ',
'Serves: ',
' {<<<<<<<<<<<<<<<<<<<<}', $serves,
' ',
'Ingredients: ',
' {[[[[[[[[[[[[[[[[[[[[}', $ingredients;
my $make = form 'Method: ',
' {[[[[[[[[[[[[[[[[[[[[[[[[[[[[}',
$method;
print form
'=================[ {||||||||||||||||||||||||||} ]=================',
$recipe,
' ',
' {"""""""""""""""""""""""} {"""""""""""""""""""""""""""""""} ',
$prep, $make;
We could even chain the calls to form to eliminate the interim variables:
print form
'=================[ {||||||||||||||||||||||||||} ]=================',
$recipe,
' ',
' {"""""""""""""""""""""""} {"""""""""""""""""""""""""""""""} ',
form('Preparation time: ',
' {<<<<<<<<<<<<<<<<<<<<}', $prep_time,
' ',
'Serves: ',
' {<<<<<<<<<<<<<<<<<<<<}', $serves,
' ',
'Ingredients: ',
' {[[[[[[[[[[[[[[[[[[[[}', $ingredients,
),
form('Method: ',
' {[[[[[[[[[[[[[[[[[[[[[[[[[[[[}',
$method,
);
While it's impressive to be able to do that kind of nested formatting (and highly useful in extreme formatting scenarios), it's also far too ungainly for regular use. A cleaner, more maintainable solution is to use a single format and just build the method column up piecemeal, like so:
print form
'=================[ {||||||||||||||||||||||||||} ]=================',
$recipe,
' ',
'Preparation time: Method: ',
' {<<<<<<<<<<<<<<<<<<<<} {<<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$prep_time, $method,
' {…<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$method,
'Serves: {…<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$method,
' {<<<<<<<<<<<<<<<<<<<<} {…<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$serves, $method,
' {…<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$method,
'Ingredients: {…<<<<<<<<<<<<<<<<<<<<<<<<<<…} ',
$method,
' {[[[[[[[[[[[[[[[[[[[[} {…[[[[[[[[[[[[[[[[[[[[[[[[[[[} ',
$ingredients, $method;
That produces exactly the same result as the previous versions, because
each follow-on {…<<<<<<<…} field in the
"Method" column grabs one extra line from $method, and then the final
follow-on {…[[[[[[…} field grabs as many more as are required
to lay out the rest of the contents of the variable. The only down-side is
that the resulting code is still downright ugly. With all those tedious
repetitions of the same variable, there's far too much $method
in our madness.
Having a series of follow-on fields like this – vertically
continuing a single column across subsequent format lines – is so
common that form provides a special shortcut: the {VVVVVVVVV}
overflow field.
An overflow field automagically duplicates the field specification immediately above it. The important point being that, because that duplication includes copying the preceding field's data source, overflow fields don't require a separate data source of their own.
Using overflow fields, we could rewrite our recipe generator like this:
print form
'=================[ {||||||||||||||||||||||||||} ]=================',
$recipe,
' ',
'Preparation time: Method: ',
' {<<<<<<<<<<<<<<<<<<<<} {<<<<<<<<<<<<<<<<<<<<<<<<<<<<} ',
$prep_time, $method,
' {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
'Serves: {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
' {<<<<<<<<<<<<<<<<<<<<} {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
$serves,
' {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
'Ingredients: {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
' {[[[[[[[[[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ',
$ingredients,
' {VVVVVVVVVVVVVVVVVVVVVVVVVVVV} ';
Which would once again produce the recipe shown earlier.
Note that the overflow fields interact equally well in formats with single-line and block fields. That's because block overflow fields have one other special feature: they're non-greedy. Unless we specify otherwise, all types of block fields will consume their entire data source. For example, if we wrote:
print form :layout«across»,
'{<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>…}',
$speech,
'{…<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>…}',
$speech,
'{…[[[[[]]]]]…} {="""""""""""""""""""=} {…[[[[[]]]]]]…}',
$speech, $advert, $speech,
'{…[[[[[[[[[[[[[[[[[[[[[[[[[]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]}',
$speech;
we'd get:
Now is the winter of our discontent / Made glorious summer
by this sun of York; / And all the clouds that lour'd upon
our house / In the deep bosom
of the ocean buried. / Now
are our brows bound with
victorious wreaths; / Our
bruised arms hung up for
monuments; / Our stern
alarums +---------------------+ changed to
merry | | meetings, / Our
dreadful | Eat at Mrs Miggins! | marches to
delightful | | measures. Grim-
visaged war +---------------------+ hath smooth'd
his wrinkled front; / And
now, instead of mounting
barded steeds / To fright the
souls of fearful
adversaries, / He capers
nimbly in a lady's chamber.
That's because the two {…[[[[[]]]]]…} block fields
on either side of the verbatim advertisement field will eat all the
data in $speech, leaving nothing for the final format. Then
the advertisement will be centred on the two resulting columns of text.
But block overflow fields are different. They only take as many lines as are required to fill the lines generated by the non-overflow fields in their format. So, if we changed our code to use overflows:
print form :layout«across»,
'{<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>}', $speech,
'{VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}',
'{VVVVVVVVVVVV} {="""""""""""""""""""=} {VVVVVVVVVVVVV}', $advert,
'{VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}';
we get both a cleaner specification and a more elegant result:
Now is the winter of our discontent / Made glorious summer
by this sun of York; / And all the clouds that lour'd upon
our house / In the deep bosom
of the ocean +---------------------+ buried. / Now
are our brows | | bound with
victorious | Eat at Mrs Miggins! | wreaths; / Our
bruised arms | | hung up for
monuments; / +---------------------+ Our stern
alarums changed to
merry meetings, / Our dreadful marches to delightful
measures. Grim-visaged war hath smooth'd his wrinkled
front; / And now, instead of mounting barded steeds / To
fright the souls of fearful adversaries, / He capers
nimbly in a lady's chamber.
Notice that, in the third format line of the previous example, the two
overflow fields on either side of the advertisement are each overflowing
from the single field that's above both of them. This kind of multiple
overflow is fine, but it does require that we specify how the various
fields overflow (i.e. as two separate columns of text, or – as in
this case – as a single, broken column across the page). That's
the purpose of the :layout«across» option on the
first line. This option is explained in detail below.
The {VVVVVVVV} fields only consumed as much data from $speech as
was required to sandwich the output lines created by the verbatim
advertisement. This feature is important, because it means we can lay
out a series of block fields in one column and a single overflowed field
in another column without introducing ugly gaps. For example, because
the {VVVVVVVVV} fields in:
print form
"Name: ",
" {[[[[[[[[[[[[} ", $name,
" Biography: ",
"Status: {<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<}", $bio,
" {[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}", $status,
" {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
"Comments: {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
" {[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}", $comments;
only consume as much of the overflowing $bio field as necessary,
the result is something like:
Name:
William
Shakespeare
Biography:
Status: William Shakespeare was born on
Deceased (1564 April 23, 1564 in Strathford-upon-
-1616) Avon, England; he was third of
eight children from Father John
Comments: Shakespeare and Mother Mary Arden.
Theories Shakespeare began his education at
abound as to the age of seven when he probably
the true attended the Strathford grammar
author of his school. The school provided
plays. The Shakespeare with his formal
prime education. The students chiefly
alternative studied Latin rhetoric, logic, and
candidates literature. His knowledge and
being Sir imagination may have come from his
Francis reading of ancient authors and
Bacon, poetry. In November 1582,
Christopher Shakespeare received a license to
Marlowe, or marry Anne Hathaway. At the time of
Edward de their marriage, Shakespeare was 18
Vere years old and Anne was 26. They had
three children, the oldest Susanna,
and twins- a boy, Hamneth, and a
girl, Judith. Before his death on
April 23 1616, William Shakespeare
had written thirty-seven plays. He
is generally considered the
greatest playwright the world has
ever known and has always been the
world's most popular author.
If {VVVVVVVVVVV} fields ate their entire data – the way
{[[[[[[[[[} or {IIIIIIIIII} fields do – then the output would be
much less satisfactory. The first block overflow field for $bio would
have to consume the entire biography, before the comments field was even
reached. So our output would be something like:
Name:
William
Shakespeare
Biography:
Status: William Shakespeare was born on
Deceased (1564 April 23, 1564 in Strathford-upon-
-1616) Avon, England; he was third of
eight children from Father John
Shakespeare and Mother Mary Arden.
Shakespeare began his education at
the age of seven when he probably
attended the Strathford grammar
school. The school provided
Shakespeare with his formal
education. The students chiefly
studied Latin rhetoric, logic, and
literature. His knowledge and
imagination may have come from his
reading of ancient authors and
poetry. In November 1582,
Shakespeare received a license to
marry Anne Hathaway. At the time of
their marriage, Shakespeare was 18
years old and Anne was 26. They had
three children, the oldest Susanna,
and twins- a boy, Hamneth, and a
girl, Judith. Before his death on
April 23 1616, William Shakespeare
had written thirty-seven plays. He
is generally considered the
greatest playwright the world has
ever known and has always been the
world's most popular author.
Comments:
Theories
abound as to
the true
author of his
plays. The
prime
alternative
candidates
being Sir
Francis
Bacon,
Christopher
Marlowe, or
Edward de
Vere
Which is precisely why {VVVVVVVVVVV} fields don't work that way.
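The difference between the two behaviours can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Form.pm works internally: the function two_columns and its greedy flag are hypothetical, the left-hand sections are assumed to be already broken into lines, and the last section is allowed to take whatever overflow text remains.

```python
from itertools import islice


def two_columns(sections, overflow, left_width, greedy=False):
    """Pair stacked left-hand sections with one overflowing right-hand column.

    sections : list of lists of already-formatted left-column lines
    overflow : list of already-broken lines for the right-hand column
    greedy   : if true, the first section eats the whole overflow source
    """
    source = iter(overflow)
    out = []
    for n, left in enumerate(sections):
        last = n == len(sections) - 1
        if greedy or last:
            right = list(source)                      # take everything left
        else:
            right = list(islice(source, len(left)))   # take only what is needed
        for i in range(max(len(left), len(right))):
            l = left[i] if i < len(left) else ""
            r = right[i] if i < len(right) else ""
            out.append(f"{l:<{left_width}} {r}".rstrip())
    return out


sections = [["Name:", "  Will"], ["Status:", "  Dead"]]
bio = ["line 1", "line 2", "line 3", "line 4", "line 5"]
print("\n".join(two_columns(sections, bio, 8)))
print("\n".join(two_columns(sections, bio, 8, greedy=True)))
```

With greedy left off, each section draws only as many biography lines as it has rows of its own, so the sections stay stacked without gaps. With greedy=True the first section drains the biography and the later sections are pushed below it, which is the unsatisfactory layout shown above.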
When it comes to specifying the data source for each field in a format,
form offers several alternatives as to where that data is placed,
several alternatives as to the order in which that data is extracted, and
an option that lets us control how the data is fitted into each field.
Whenever a field is passed more data than it can
accommodate in a single line, form is forced to "break" that data somewhere.
If the field in question is W
columns wide, form first squeezes any whitespace (as specified by
the user's :ws option) and then looks at the next W columns of the string.
(Of course, that might actually correspond to less than W characters
if the string contains wide characters. However, for the sake of exposition
we'll pretend that all characters are one column wide here.)
form's breaking algorithm then searches for a newline, a carriage
return, any other whitespace character, or a hyphen. If it
finds a newline or carriage return within the first W columns, it
immediately breaks the data string at that point. Otherwise it locates
the last whitespace or hyphen in the first W columns and breaks
the string immediately after that space or hyphen. If it can't find
anywhere suitable to break the string, it breaks it at the (W-1)th
column and appends a hyphen.
So, for example:
$data = "You can play no part but Pyramus;\nfor Pyramus is a sweet-faced man";
print form "|{[[[[[}|",
$data;
prints:
|You can|
|play no|
|part |
|but |
|Pyramu-|
|s; |
|for |
|Pyramus|
|is a |
|sweet- |
|faced |
|man |
Note the line-breaks after can (at a whitespace), part (after a whitespace), sweet- (after a hyphen), and s; (at a newline). Note too that Pyramus; doesn't fit in the field, so it has to be chopped in two and a hyphen inserted.
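The default algorithm just described can be sketched in Python as follows. This is an approximation for illustration, not the Form.pm source: the names break_default and fill_field are hypothetical, whitespace squeezing is left out, every character is assumed to be one column wide, and the sketch peeks one column past the field width so that a word ending exactly at the field's edge (such as "You can") is kept whole.

```python
def break_default(data, pos, width):
    """Return (chunk, new_pos) for the next line of at most `width` columns."""
    # Skip blanks left over from the previous break.
    while pos < len(data) and data[pos] in " \t":
        pos += 1
    window = data[pos:pos + width + 1]    # one extra column of look-ahead

    # 1. A newline or carriage return forces a break right there.
    for i, ch in enumerate(window):
        if ch in "\r\n":
            return window[:i], pos + i + 1

    # 2. Everything that remains fits in the field.
    if len(window) <= width:
        return window.rstrip(), len(data)

    # 3. Break at the last whitespace, or just after the last hyphen,
    #    whichever comes later while still fitting in the field.
    space = max(window.rfind(" "), window.rfind("\t"))
    hyphen = window.rfind("-", 0, width)
    if space > hyphen:
        return window[:space], pos + space + 1
    if hyphen >= 0:
        return window[:hyphen + 1], pos + hyphen + 1

    # 4. Nowhere suitable: chop at column width-1 and append a hyphen.
    return window[:width - 1] + "-", pos + width - 1


def fill_field(data, width):
    """Break `data` repeatedly until it has all been laid out."""
    lines, pos = [], 0
    while pos < len(data):
        chunk, pos = break_default(data, pos, width)
        lines.append(chunk)
    return lines


data = "You can play no part but Pyramus;\nfor Pyramus is a sweet-faced man"
for line in fill_field(data, 7):
    print("|" + line.ljust(7) + "|")
```

Run on the Pyramus string with a seven-column field, the sketch yields the same twelve lines as the form output above, including the forced "Pyramu-" break.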
Of course, this particular style of line-breaking may not be suitable to all
applications, and we might prefer that form use some other algorithm. For
example, if form used the TeX breaking algorithm it would have broken
Pyramus; less clumsily, yielding:
|You can|
|play no|
|part |
|but |
|Pyra- |
|mus; |
|for |
|Pyramus|
|is a |
|sweet- |
|faced |
|man |
To support different line-breaking strategies form provides
the :break option. The :break option's value must be
a closure/subroutine, which will then be called whenever a data string
needs to be broken to fit a particular field width.
That subroutine is passed three arguments: the data
string itself, an integer specifying how wide the field is, and a regex
indicating which (if any) characters are to be squeezed.
It is expected to return a list of two values: a string which is taken
as the "broken" text for the field, and a boolean value indicating
whether or not any data remains after the break (so form knows when
to stop breaking the data string). The subroutine is also expected to
update the .pos of the data string to point immediately after the
break it has imposed.
For example, if we always wanted to break at the exact width of the field (with no hyphens), we could do that with:
sub break_width ($data is rw, $width, $ws) {
given $data {
# Treat any squeezed or vertical whitespace as a single character
# (since they'll subsequently be squeezed to a single space)
my rule single_char { <$ws> | \v+ | . }
# Give up if there are no more characters to grab...
return ("", 0) unless m:cont/ (<single_char><1,$width>) /;
# Squeeze the resultant substring...
(my $result = $1) ~~ s:each/ <$ws> | \v+ /\c[SPACE]/;
# Check for any more data still to come...
my bool $more = m:cont/ <before: .* \S> /;
# Return the squeezed substring and the "more" indicator...
return ($result, $more);
}
}
print form
:break(&break_width),
"|{[[[[[}|",
$data;
producing:
|You can|
|play no|
|part bu|
|t Pyram|
|us; for|
|Pyramus|
|is a sw|
|eet-fac|
|ed man |
Or we might prefer to break on every single whitespace-separated word:
sub break_word ($data is rw, $width, $ws) {
given $data {
# Locate the next word (no longer than $width cols)
my $found = m:cont/ \s* $?word:=(\S<1,$width>) /;
# Fail if no more words...
return ("", 0) unless $found{word};
# Check for any more data still to come...
my bool $more = m:cont/ <before: .* \S> /;
# Otherwise, return broken text and "more" flag...
return ($found{word}, $more);
}
}
print form
:break(&break_word),
"|{[[[[[}|",
$data;
producing:
|You |
|can |
|play |
|no |
|part |
|but |
|Pyramus|
|; |
|for |
|Pyramus|
|is |
|a |
|sweet-f|
|aced |
|man |
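The same pluggable-breaker idea can be sketched in Python. What follows mirrors the two subroutines above only loosely, and its conventions are assumptions of the sketch rather than form's interface: each breaker takes the data, a starting position and a width, and returns the broken text, the new position and a "more" flag (instead of updating .pos in place), and there is no $ws argument, so all whitespace is simply treated as squeezable.

```python
import re


def break_width(data, pos, width):
    """Break at exactly `width` characters, with no hyphenation."""
    pattern = re.compile(r"\s*(\S.{0,%d})" % (width - 1), re.S)
    m = pattern.match(data, pos)
    if not m:
        return "", len(data), False
    text = re.sub(r"\s", " ", m.group(1))      # newlines become spaces
    more = re.search(r"\S", data[m.end():]) is not None
    return text, m.end(), more


def break_word(data, pos, width):
    """Break after every whitespace-separated word (at most `width` long)."""
    pattern = re.compile(r"\s*(\S{1,%d})" % width)
    m = pattern.match(data, pos)
    if not m:
        return "", len(data), False
    more = re.search(r"\S", data[m.end():]) is not None
    return m.group(1), m.end(), more


def fill_field(data, width, breaker):
    """Call `breaker` repeatedly, as form would for a block field."""
    lines, pos, more = [], 0, True
    while more:
        text, pos, more = breaker(data, pos, width)
        if text:
            lines.append(text)
    return lines


DATA = "You can play no part but Pyramus;\nfor Pyramus is a sweet-faced man"
for breaker in (break_width, break_word):
    for line in fill_field(DATA, 7, breaker):
        print("|" + line.ljust(7) + "|")
    print()
```

The driver never needs to know which strategy is in use; it only relies on the "more" flag to decide when to stop, which is the role the boolean return value plays for form.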
We'll see yet another application of user-defined breaking when we discuss user-defined fields.
There are (at least) three schools of thought when it comes to setting
out a call to form that uses more than one format. The
"traditional" way (i.e. the way Perl 5 formats do it) is to interleave
each format string with a line containing the data it is to
interpolate, with each datum aligned directly under the field into
which it is to be fitted. Like so:
print form
"Name: ",
" {[[[[[[[[[[[[} ",
$name,
" Biography: ",
"Status: {<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<}",
$bio,
" {[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
$status,
" {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
"Comments: {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
" {[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
$comments;
This approach has the advantage that it self-documents: to know what a particular field is supposed to contain, we merely need to look down one line.
It does, however, break up the "abstract picture" that the formats portray, which can make it more difficult to envisage what the final formatted text will look like. So some people prefer to put all the data to the right of the formats:
print form
"Name: ",
" {[[[[[[[[[[[[} ", $name,
" Biography: ",
"Status: {<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<}", $bio,
" {[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}", $status,
" {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
"Comments: {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
" {[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}", $comments;
And that's perfectly acceptable too.
Sometimes, however, the data to be interpolated doesn't come neatly
pre-packaged in separate variables that are easy to intersperse between the
formats. For example, the data might be a list returned by a
subroutine call (get_info()) or might be stored in a hash
(%person{« name biog stat comm »}). In such
cases it's a nuisance to have to tease that data out into separate
variables (or hash accesses) and then sprinkle them through the formats:
print form
"Name: ",
" {[[[[[[[[[[[[} ",%person{name},
" Biography: ",
"Status: {<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<}",%person{biog},
" {[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",%person{stat},
" {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
"Comments: {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",
" {[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}",%person{comm};
So form has an option that lets us put a single, multi-line format
at the start of the argument list, place all the data together
after it, and have that data automatically interleaved as necessary.
Not surprisingly, that option is: :interleave. It's normally used in
conjunction with a heredoc, since that's the easiest way to specify a
multi-line string in Perl:
print form :interleave, <<'EOFORMAT',
Name:
{[[[[[[[[[[[[}
Biography:
Status: {<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<}
{[[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}
{VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}
Comments: {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}
{[[[[[[[[[[[} {VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV}
EOFORMAT
%person{« name biog stat comm »}
When :interleave is in effect, form grabs the first string
argument it's passed and breaks that argument up into individual lines.
It treats those individual lines as a series of distinct formats
and grabs as many of the remaining arguments as are required to
provide data for each format.
Of course, in this example we're also taking advantage of the new indenting behaviour of heredocs. The "Name:", "Status:", and "Comments:" titles are actually at the very beginning of their respective lines, because the start of a Perl 6 heredoc terminator marks the left margin of the entire heredoc string.
It's important to point out that, even when we're using form's
default non-interleaving behaviour, it's still okay to use a format
that spans multiple lines. There is however a significant (and useful)
difference in behaviour between the two alternatives.
The normal behaviour of form is to take each format string,
fill in each field in the format with a substring from the
corresponding data source, and then repeat that process until all the
data sources have been exhausted. Which means that a multi-line format
like this:
print form
<<'EOFORMAT',
Name: {[[[[[[[[[[[[[[[} Role: {[[[[[[[[[[}
Address: {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[}
_______________________________________________
EOFORMAT
@names, @roles, @addresses;
would normally produce this:
Name: King Lear Role: Protagonist
Address: The Cliffs, Dover
_______________________________________________
Name: The Three Witches Role: Plot devices
Address: Dismal Forest, Scotland
_______________________________________________
Name: Iago Role: Villain
Address: Casa d'Otello, Venezia
_______________________________________________
because the entire three-line format is repeatedly filled in as a single unit, line-by-line and datum-by-datum.
On the other hand, if we tell form that it's supposed to automatically
interleave the data coming after the format, like so:
print form :interleave,
<<'EOFORMAT',
Name: {[[[[[[[[[[[[[[[} Role: {[[[[[[[[[[}
Address: {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[}
_______________________________________________
EOFORMAT
@names, @roles, @addresses;
then the call produces:
Name: King Lear Role: Protagonist
Name: The Three Witches Role: Plot devices
Name: Iago Role: Villain
Address: The Cliffs, Dover
Address: Dismal Forest, Scotland
Address: Casa d'Otello, Venezia
_______________________________________________
because that second version is really equivalent to:
print form
"Name: {[[[[[[[[[[[[[[[} Role: {[[[[[[[[[[}",
@names, @roles,
"Address: {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[}",
@addresses,
"_______________________________________________";
That's not much use in this particular example, but it was exactly what was needed for the biography example earlier. It's just a matter of choosing the right type of data placement to achieve the particular effect we want.
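The rearrangement that :interleave performs can be sketched in Python. This is a simplified model, not form's implementation: the interleave function is hypothetical, a field is taken to be anything between a pair of braces, and overflow fields (those made up entirely of V's) are assumed to need no datum of their own.

```python
import re

FIELD = re.compile(r"\{[^{}]*\}")


def interleave(template, data):
    """Split a multi-line format into lines and pair each with its data.

    Returns a flat argument list in form's default order: a format
    line, then one datum per data-consuming field on that line, and so on.
    """
    remaining = list(data)
    args = []
    for line in template.splitlines():
        # Overflow fields reuse the data source of the field above them.
        needed = sum(1 for f in FIELD.findall(line) if set(f[1:-1]) != {"V"})
        args.append(line)
        args.extend(remaining[:needed])
        del remaining[:needed]
    return args


template = (
    "Name:    {[[[[[[[[}   Role: {[[[[[[}\n"
    "Address: {[[[[[[[[[[[[[[[[[[[[[[}\n"
    "_________________________________"
)
for arg in interleave(template, ["@names", "@roles", "@addresses"]):
    print(arg)
```

The printed list alternates format lines and data in exactly the order of the explicit, non-interleaved call shown above.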
As we saw earlier, with follow-on fields and
overflow fields, form
is perfectly happy to have several fields in a single format that
are all fed by the same data source. For example:
print form
"{[[[[[[[[]]]]]]]]]]…} {…[[[[[[[]]]]]]]]]]…} {…[[[[[[[[]]]]]]]]]]}",
$soliloquy, $soliloquy, $soliloquy;
In fact, that kind of format is particularly useful for creating multi-column outputs (like newspaper columns, for example).
But a small quandary arises. In what order should form fill in these
fields? Should the data be formatted down the page, filling each column
completely before starting the next (and therefore potentially leaving
the last column "short"):
Now is the winter of torious wreaths; / front; / And now, in-
our discontent / Made Our bruised arms hung stead of mounting ba-
glorious summer by up for monuments; / rded steeds / To fri-
this sun of York; / Our stern alarums ch- ght the souls of fea-
And all the clouds anged to merry meeti- rful adversaries, /
that lour'd upon our ngs, / Our dreadful He capers nimbly in a
house / In the deep marches to delightful lady's chamber.
bosom of the ocean measures. / Grim-
buried. / Now are our visaged war hath smo-
brows bound with vic- oth'd his wrinkled
Or should the data be run line-by-line across all three columns (the
way a Perl 5 format does it), filling one line completely before
starting the next:
Now is the winter of our discontent / Made glorious summer by
this sun of York; / And all the clouds that lour'd upon our
house / In the deep bosom of the ocean buried. / Now are our
brows bound with vic- torious wreaths; / Our bruised arms hung
up for monuments; / Our stern alarums ch- anged to merry meeti-
ngs, / Our dreadful marches to delightful measures. / Grim-
visaged war hath smo- oth'd his wrinkled front; / And now, in-
stead of mounting ba- rded steeds / To fri- ght the souls of fea-
rful adversaries, / He capers nimbly in a lady's chamber.
Or should the text run down the columns, but in such a way as to leave those columns as evenly balanced in length as possible:
Now is the winter of  brows bound with vic- visaged war hath smo-
our discontent / Made torious wreaths; /    oth'd his wrinkled
glorious summer by    Our bruised arms hung front; / And now, in-
this sun of York; /   up for monuments; /   stead of mounting ba-
And all the clouds    Our stern alarums ch- rded steeds / To fri-
that lour'd upon our  anged to merry meeti- ght the souls of fea-
house / In the deep   ngs, / Our dreadful   rful adversaries, /
bosom of the ocean    marches to delightful He capers nimbly in a
buried. / Now are our measures. / Grim-     lady's chamber.
Well, of course, there's no "right" answer to that; it depends entirely on what kind of effect we're trying to achieve.
The first approach (i.e. lay out the text down each column first) works
well if we're formatting a news-column, or a report, or a description of
some kind. The second (i.e. lay out the text across each line first), is
excellent for putting diagrams or call-outs in the middle of a piece of
text (as we did for Mrs Miggins). The third approach (i.e. lay out the data downwards but
balance the columns) is best for presenting a single list of data in
multiple columns – like ls does.
So we need an option with which to tell form which of these useful
alternatives we want for a particular format. That option is named
:layout and can take one of three string values: "down", "across",
or "balanced". So, for example, to produce three versions of Richard III's
famous monologue in the order shown above, we'd use:
print form :layout«down»,
"{[[[[[[[[]]]]]]]]]]…} {…[[[[[[[]]]]]]]]]]…} {…[[[[[[[[]]]]]]]]]]}",
$soliloquy, $soliloquy, $soliloquy;
then:
print form :layout«across»,
"{[[[[[[[[]]]]]]]]]]…} {…[[[[[[[]]]]]]]]]]…} {…[[[[[[[[]]]]]]]]]]}",
$soliloquy, $soliloquy, $soliloquy;
then:
print form :layout«balanced»,
"{[[[[[[[[]]]]]]]]]]…} {…[[[[[[[]]]]]]]]]]…} {…[[[[[[[[]]]]]]]]]]}",
$soliloquy, $soliloquy, $soliloquy;
By the way, the default value for the :layout option is "balanced"
since formatting regular columns of data is more common than formatting
news or advertising inserts.
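To make the three strategies concrete, here is a minimal Python sketch (not Perl 6, and not how Form.pm is implemented) that distributes a list of already-wrapped lines over several columns. The name lay_out and the page_len parameter are assumptions made purely for illustration, and the "down" case simply fills each column to page_len lines without modelling page breaks.

```python
import math

def lay_out(lines, ncols, layout="balanced", page_len=10):
    """Return the rows produced by spreading `lines` over columns."""
    if not lines:
        return []
    if layout == "across":
        # Fill each row completely before starting the next.
        return [lines[i:i + ncols] for i in range(0, len(lines), ncols)]
    if layout == "down":
        # Fill each column to the page length; the last column may be short.
        height = page_len
    elif layout == "balanced":
        # Use the smallest height that fits, so columns end up even.
        height = math.ceil(len(lines) / ncols)
    else:
        raise ValueError(f"unknown layout: {layout!r}")
    columns = [lines[i:i + height] for i in range(0, len(lines), height)]
    rows = max(len(col) for col in columns)
    return [[col[r] for col in columns if r < len(col)] for r in range(rows)]

lines = list(range(1, 28))            # stand-ins for 27 wrapped lines of text
down = lay_out(lines, 3, "down")      # columns of 10, 10 and 7 lines
across = lay_out(lines, 3, "across")  # 9 rows of 3
balanced = lay_out(lines, 3)          # columns of 9, 9 and 9 lines
```

The 27 stand-in lines mirror the soliloquy outputs shown earlier: ten, ten and seven lines for "down", and three even columns of nine for "balanced".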
The :layout option controls one other form of inter-column formatting:
tabular layout.
So far, all the examples of tables we've created (for example, our normalized scores) lined up nicely. But that was only because each item in each row happened to take the same number of lines (typically just one). So, a table generator like this:
my @play = map {"$_\r"} ( "Othello", "Richard III", "Hamlet" );
my @name = map {"$_\r"} ( "Iago", "Henry", "Claudius" );
print form
    "Character    Appears in  ",
    "____________ ____________",
    "{[[[[[[[[[[} {[[[[[[[[[[}",
    @name, @play;
correctly produces:
Character    Appears in
____________ ____________
Iago         Othello

Henry        Richard III

Claudius     Hamlet
Note that we appended "\r" to each element to add an extra
newline after each entry in the table. We can't use "\n" to specify a
line-break within an array element, because form uses "\n" as an
"end-of-element" marker.
So, to allow line breaks within a single element of an array datum,
form treats "\r" as "end-of-line-but-not-end-of-element"
(somewhat like Perl 5's format does).
However, if we were to use the full titles for each character and each play:
my @play = map {"$_\r"} ( "Othello, The Moor of Venice",
"The Life and Death of King Richard III",
"Hamlet, Prince of Denmark",
);
my @name = map {"$_\r"} ( "Iago",
"Henry,\rEarl of Richmond",
"Claudius,\rKing of Denmark",
);
the same formatter would produce:
Character    Appears in
____________ ____________
Iago         Othello, The
             Moor of
Henry,       Venice
Earl of
Richmond     The Life and
             Death of
Claudius,    King Richard
King of      III
Denmark
             Hamlet,
             Prince of
             Denmark
The problem is that the two block fields we're using just grab all the data from each array and format it independently into each column. Usually that's fine because the columns are independent (as we've previously seen).
But in a table, the data in each column specifically relates to data
in other columns, so corresponding elements from the column's data
arrays ought to remain vertically aligned. To achieve this, we simply
tell form that the data in the various columns should be laid out
like a table:
print form :layout«tabular»,
    "Character    Appears in  ",
    "____________ ____________",
    "{[[[[[[[[[[} {[[[[[[[[[[}",
    @name, @play;
which then produces the desired result:
Character    Appears in
____________ ____________
Iago         Othello, The
             Moor of
             Venice

Henry,       The Life and
Earl of      Death of
Richmond     King Richard
             III

Claudius,    Hamlet,
King of      Prince of
Denmark      Denmark
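The idea behind tabular layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names wrap_cell and tabular are hypothetical, and the wrapping is done by Python's standard textwrap module, whose line-breaking rules are not necessarily those of form. Each row's cells are wrapped separately and then padded to a common height before the next row begins, which is what keeps corresponding entries aligned.

```python
import textwrap
from itertools import zip_longest

def wrap_cell(text, width):
    """Wrap one cell; a "\r" forces a line break inside the cell."""
    out = []
    for part in text.split("\r"):
        out.extend(textwrap.wrap(part, width) or [""])
    return out

def tabular(columns, width=12):
    """Format parallel columns row by row so related cells stay aligned."""
    rows = []
    for cells in zip(*columns):
        wrapped = [wrap_cell(cell, width) for cell in cells]
        # Pad the shorter cells of this row with empty lines.
        for parts in zip_longest(*wrapped, fillvalue=""):
            rows.append(" ".join(p.ljust(width) for p in parts).rstrip())
    return rows

names = ["Iago\r", "Henry,\rEarl of Richmond\r", "Claudius,\rKing of Denmark\r"]
plays = ["Othello, The Moor of Venice\r",
         "The Life and Death of King Richard III\r",
         "Hamlet, Prince of Denmark\r"]
table = tabular([names, plays])
print("\n".join(table))
```

Formatting each column independently instead (wrapping all of names, then all of plays) reproduces the misaligned table shown earlier.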
Sometimes we want to use a particular option or combination of options
in every call we make to form. Or, more likely, in every call we make
within a specific scope. For example, we might wish to default to
a different line-breaking algorithm everywhere, or we might want to make
repeated use of a new type of field specifier, or we might want to reset
the standard page length from a printable 60 to a screenable 24.
Normally in Perl 6, if we wanted to preset a particular optional argument we'd simply make an assumption:
my &down_form := &form.assuming(:layout«down»);
But, of course, form collects all of its arguments in a single slurpy array, so it
doesn't actually have a $layout parameter that we can prebind.
Fortunately, the .assuming method is smart enough to recognize when it
is being applied to a subroutine whose arguments are slurped. In such cases,
it just prepends any prebound arguments to the resulting subroutine's argument
list. That is, the binding of down_form shown above is equivalent to:
my &down_form :=
sub (FormArgs *@args is context(Scalar)) returns Str {
return form( :layout«down», *@args );
};
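For readers more familiar with Python, the same "prepend the prebound arguments" behaviour is what functools.partial does for positional arguments. The sketch below is only an analogy: the form function here is a hypothetical stand-in that treats dictionaries in its argument list as options and everything else as formats and data, which is an assumption of this example rather than anything Form.pm specifies.

```python
from functools import partial

def form(*args):
    """Stand-in for form: separate option dicts from formats and data."""
    options, rest = {}, []
    for arg in args:
        if isinstance(arg, dict):
            options.update(arg)
        else:
            rest.append(arg)
    return options, rest

# Prebinding simply places the option at the front of the slurped list.
down_form = partial(form, {"layout": "down"})

result = down_form("{<<<<<<}", "some data")
```

Here result holds the prebound option together with the format and data passed in the later call, just as if the option had been written first in the argument list.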
form provides one other mechanism by which options can be prebound.
To use it, we (re-)load the Form module with an explicit argument list:
use Form :layout«down», :locale, :interleave;
This causes the module to export a modified version of form in which the
specified options are prebound. That modified version of form is exported
lexically, and so form only has the specified defaults preset for the
scope in which the use Form statement appears.
These default options are handy if we have a series of calls
to form that all need some consistent non-standard behaviour.
For example:
use Form :layout«across»,
:interleave,
:page{ :header("Draft $(localtime)\n\n") };
print form $introduction_format, *@introduction_data;
for @sections -> $format, @data {
print form $format, *@data;
}
print form $conclusion_format, *@conclusion_data;
Another use is to set up a fixed formatting string into which different data
is to be interpolated (much in the way Perl 5 formats are typically used).
For example, we might want a standard format for errors in a CATCH block:
CATCH {
use Form :interleave, <<EOFORMAT;
Error {<<<<<<<}: {[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[}
___________________________________________________
EOFORMAT
when /Missing datum/ { warn form "EMISSDAT", $_.msg }
when /too large/ { warn form "ETOOBIG", $_.msg }
when .core { warn form "EINTERN", "Internal error" }
default { warn form "EUNKNOWN", "Seek help" }
}
by Damian Conway
All the fields we've seen so far have been exactly as wide as their specifications. That's the whole point of having fields – they allow us to lay out formats "by eye".
But form also allows us to specify field widths in other ways. And better
yet, to avoid specifying them at all and let form work out how big
they should be.
When specific field widths are required (perhaps by some design document or data formatting protocol) laying out wide fields can be error-prone. For example, most people can't visually distinguish between a 52-column field and a 53-column field and are therefore forced to manually verify the width of the corresponding field specifier in some way.
When such fields are part of a larger format, errors like that can
easily result in a call to form producing, say, 81-column lines. That
would merely be messy if the extra characters wrapped, but could
be disastrous if they happened to be chopped instead. Suppose,
for example, that the last 4 columns of output contain nuclear reactor
core temperatures and then consider the difference between an
apparently normal reading of 567 Celsius and what might actually be
happening if the reading were in fact a truncated 5678 Celsius.
To catch mistakes of this kind, fields can be specified with an embedded integer in parentheses (with optional whitespace inside the parens). For example:
print form '{[[[( 15 )[[[[} {<<<<<(17)<<<<<<} {]]](14)]]].[[}',
*@data;
The integer in the parentheses acts like a checksum. Its value must be identical to the actual width of the field (including the delimiting braces and the embedded integer itself). Otherwise an exception is thrown. For instance, running the above example produces the error message:
Inconsistent width for field 3.
Specified as '{]]](14)]]].[[}' but actual width is 15
in call to &form at demo.pl line 1
Numeric fields can be given a decimal checksum, which then also specifies their number of decimal places.
print form
'{[[[( 15 )[[[[} {<<<<<(17)<<<<<<} {]](14.2)]].[}',
*@data;
Note that the digits before the decimal still indicate
the total width of the field. So the {]](14.2)]].[} field
in the above example means "must be 14 columns wide, including
2 decimal places", in exactly the same way as a "%14.2f"
specifier would in a sprintf.
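The checksum rule is simple enough to sketch in Python. This is a rough illustration, not Form.pm's parser: the function name check_fields and the regular expressions are assumptions of this example, and the error text is merely modelled on the message shown above.

```python
import re

# An embedded checksum such as (17), ( 15 ) or (14.2).
CHECK = re.compile(r"\(\s*(\d+)(?:\.(\d+))?\s*\)")

def check_fields(picture):
    """Raise ValueError if a field's embedded width differs from its real width."""
    for number, match in enumerate(re.finditer(r"\{[^{}]*\}", picture), start=1):
        field = match.group(0)
        check = CHECK.search(field)
        # The real width includes the braces and the checksum itself.
        if check and int(check.group(1)) != len(field):
            raise ValueError(
                f"Inconsistent width for field {number}. "
                f"Specified as '{field}' but actual width is {len(field)}"
            )

good = '{[[[( 15 )[[[[} {<<<<<(17)<<<<<<} {]](14.2)]].[}'
bad = '{[[[( 15 )[[[[} {<<<<<(17)<<<<<<} {]]](14)]]].[[}'

check_fields(good)          # passes silently
try:
    check_fields(bad)
    message = None
except ValueError as err:
    message = str(err)      # reports field 3, whose real width is 15
```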
Of course, in some instances it would be much more convenient if we
could simply tell form that we want a particular field to be
a particular width, instead of having to explicitly show it.
So there's another type of integer field annotation that, instead of
acting like a checksum, acts like an...err..."tellsum". That is, we
can tell form to ignore a field's physical width and instead
insist that it be magically expanded (or shrunk) to a nominated width. Such
a field is said to have an imperative width. The integer specifying
the imperative width is placed in curly braces instead of parens.
For example, the format in the previous example could be specified imperatively as:
print form
'{[{15}[} {<{17}<<} {]]]]{14.2}]]]].[[}',
*@data;
Note that the actual width of any field becomes irrelevant if it contains an imperative width. The field will be condensed or expanded to the specified width, with subsequent fields pushed left or right accordingly.
Imperative fields disrupt the WYSIWYG layout of a format, so they're generally
only used when the format itself is being generated programmatically. For
example, when we were counting down the top ten reasons not to do one's English Lit homework, we used a fixed-width
{>} field to format each number:
for @reasons.kv -> $index, $reason {
    my $n = @reasons - $index ~ '.';
    print form " {>} {[[[[[[[[[[[[[[[[[[[[[[[[[[[[}",
               $n, $reason,
               "";
}
But, of course, there's no reason (theoretically, at least) why we couldn't
find more than 99 reasons not to do our homework, in which case we'd
overflow the {>} field.
So instead of limiting ourselves that way, we could just tell form to make
the first field wide enough to enumerate however many reasons we come up with,
like so:
my $width = length(+@reasons)+1;
for @reasons.kv -> $index, $reason {
    my $n = @reasons - $index ~ '.';
    print form " {>>{$width}>>} {[[[[[[[[[[[[[[[[[[[[[[[[[[[[}",
               $n, $reason,
               "";
}
By evaluating @reasons in a numeric context (+@reasons) we determine the
number of reasons we have, and hence the largest number that need ever fit
into the first field. Taking the length of that number (length(+@reasons))
gives us the number of digits in that largest number and hence the width of a
field that can format that number. We add one extra column (for the dot
we're appending to each number) and that's our required width. Then we just
tell form to make the first field that wide ({>>{$width}>>}).
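The width calculation itself is language-neutral. As a hedged Python sketch (the function name numbered is hypothetical, and plain string padding stands in for form's right-justified field):

```python
def numbered(reasons):
    """Count the reasons down, right-justifying each 'N.' label."""
    # Digits in the largest number, plus one column for the dot.
    width = len(str(len(reasons))) + 1
    lines = []
    for index, reason in enumerate(reasons):
        label = f"{len(reasons) - index}."
        lines.append(label.rjust(width) + " " + reason)
    return lines

countdown = numbered(["a reason"] * 100)
# countdown[0] starts with "100." and countdown[-1] starts with "  1."
```

With 100 reasons the label column is four characters wide, so "100." fits exactly and "1." is padded with two leading spaces.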
A special form of imperative width field is the starred field. A starred field is one that contains an imperative width specification in which the number is replaced by a single asterisk.
The width of a starred field is not fixed, but rather is computed during formatting. That width is whatever is required to cause the entire format to fill the current page width of the format (by default, 78 columns). Consider, for example:
print form
'{]]]]]]]]]]]]]]} {]]].[[} {[[{*}[[} ',
@names, @scores, @comments;
The width of the starred comment field in this case is 49 columns – the default page width of 78 columns minus the 29 columns consumed by the fixed-width portions of the format (including the other two fields).
If a format contains two or more starred fields, the available space is shared equally between them. So, for example, to create two equal columns (say, to compare the contents of two files), we might use:
print form
"{[[[[{*}[[[[} {[[[[{*}[[[[}",
slurp($file1), slurp($file2);
And, yes, Perl 6 does have a built-in slurp function that takes a filename,
opens the file, reads in the entire contents, and returns them as a single
string. For more details see the Perl6::Slurp module (now on the CPAN).
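The arithmetic behind starred fields can be sketched in Python as follows. This is an illustration only: the function name starred_width and the regular expression are assumptions of this example, it handles only starred fields (not other imperative widths), and rounding down when the space does not divide evenly is a guess, since the text above does not say what happens to leftover columns.

```python
import re

# A field that contains a starred width, e.g. {[[{*}[[}
STAR = re.compile(r"\{[^{}]*\{\*\}[^{}]*\}")

def starred_width(picture, page_width=78):
    """Width given to each starred field, or None if there are none."""
    stars = STAR.findall(picture)
    if not stars:
        return None
    # Everything that is not a starred field consumes a fixed number of columns.
    fixed = len(STAR.sub("", picture))
    return (page_width - fixed) // len(stars)

two_columns = starred_width("{[[[[{*}[[[[} {[[[[{*}[[[[}")
one_column = starred_width("x" * 29 + "{[[{*}[[}")
```

With 29 fixed columns and one starred field this gives the 49 columns worked out above; two starred fields separated by a single space get 38 columns each.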
There is one special case for starred fields, the starred verbatim field:
{""""{*}""""}
It acts like any other starred field, growing according to the available space, except that it will never grow any wider than the widest line of the data it is formatting. For example, whereas a regular starred field:
print form
'| {[[{*}[[} |',
$monologue;
expands to the full page width:
| Now is the winter of our discontent                                        |
| Made glorious summer by this sun of York;                                  |
| And all the clouds that lour'd upon our house                              |
| In the deep bosom of the ocean buried.                                     |
| Now are our brows bound with victorious wreaths;                           |
| Our bruised arms hung up for monuments;                                    |
| Our stern alarums changed to merry meetings,                               |
| Our dreadful marches to delightful measures.                               |
| Grim-visaged war hath smooth'd his wrinkled front;                         |
| And now, instead of mounting barded steeds                                 |
| To fright the souls of fearful adversaries,                                |
| He capers nimbly in a lady's chamber.                                      |
a starred verbatim field:
print form
'| {""{*}""} |',
$monologue;
only expands as much as is strictly necessary to accommodate the data:
| Now is the winter of our discontent                |
| Made glorious summer by this sun of York;          |
| And all the clouds that lour'd upon our house      |
| In the deep bosom of the ocean buried.             |
| Now are our brows bound with victorious wreaths;   |
| Our bruised arms hung up for monuments;            |
| Our stern alarums changed to merry meetings,       |
| Our dreadful marches to delightful measures.       |
| Grim-visaged war hath smooth'd his wrinkled front; |
| And now, instead of mounting barded steeds         |
| To fright the souls of fearful adversaries,        |
| He capers nimbly in a lady's chamber.              |
By now you've probably noticed that there is quite a large overlap between the
functionality of form and that of (s)printf. For example, the call:
for @procs {
print form
"{>>>} {<<<<<<<(20)<<<<<<<} {>>>>>>} {>>.}%",
.{pid}, .{cmd}, .{time}, .{cpu};
}