May 2004 Archives

Return of Quiz of the Week

Recently, Perl trainer and former perl.com editor Mark-Jason Dominus revived his Quiz of the Week mailing list. Every week, subscribers are sent a Perl task at either "regular" or "expert" level. There are no prizes, but the submitted solutions are collated, discussed, and analyzed. In a way, the prize is the knowledge you gain from seeing different techniques and approaches to the same problem. Each week, we're going to bring you the analysis from the previous week and the new question to think about for the current week; if you want to join in and submit some solutions, see the Quiz of the Week page above.

This Week's Quiz

The regular quiz this week was submitted by Marco Baringer:

When I was in elementary school I wasted many an hour playing Hangman with my friends.

The goal of the game is to guess a word with a certain (limited) number of wrong guesses. If we fail, the "man" gets "hanged"; if we succeed, he is set free. (We're not going to discuss the lessons of life or the justice this game teaches to the 8-year-olds who play it regularly.)

The game starts out with one person (not the player) choosing a "mystery" word at random and telling the player how many letters the mystery word contains. The player then guesses letters, one at a time, and the mystery word's letters are filled in until a) the entire word is filled in, or b) the maximum number of wrong guesses is reached and the player loses (the man is hanged).

Write a Perl program that lets the user play Hangman. The program should take the following arguments:

  • The dictionary file to use
  • The maximum number of wrong guesses to give the player

The program must then choose a mystery word from the dictionary file and print out as many underscores ("_") as there are letters in the mystery word. The program will then read letters from the user one at a time. After each guess the program must print the word with properly guessed letters filled in. If the word has been guessed (all the letters making up the word have been guessed) then the program must print "LIFE!" and exit. If the word is not guessed before the maximum number of guesses is reached then the program must print "DEATH!" and exit.

Some additional requirements:

  1. The dictionary file will contain one word per line and use only 7-bit ASCII characters. It may contain randomly generated words. The dictionary will contain only words longer than 1 character. The size of the dictionary may be very large.
  2. The dictionary file used for the test (or the program for generating it) will be made available along with the write-up.
  3. If a letter appears more than once in the mystery word, all occurrences of that letter must be filled in. So, if the word is 'bokonon' and the player guesses 'o' the output must be '_o_o_o_'.
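
Two of these requirements lend themselves to a short illustration: choosing a word from a dictionary that may be very large, and filling in every occurrence of a guessed letter. The following is only a sketch under assumptions, not a quiz solution; the names pick_word and mask_word are hypothetical. It picks the word in a single pass (reservoir sampling) so the whole dictionary never has to sit in memory.

```perl
use strict;
use warnings;

# Pick one line uniformly at random in a single pass (reservoir sampling),
# so a very large dictionary is never held in memory all at once.
sub pick_word {
    my ($fh) = @_;
    my $choice;
    my $seen = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;              # ignore blank lines
        $seen++;
        $choice = $line if rand($seen) < 1;    # keep this line with probability 1/$seen
    }
    return $choice;
}

# Show the letters that have been guessed and hide the rest as underscores.
# $guessed is a hash reference whose keys are lowercase guessed letters.
sub mask_word {
    my ($word, $guessed) = @_;
    return join '', map { $guessed->{lc $_} ? $_ : '_' } split //, $word;
}

print mask_word('bokonon', { o => 1 }), "\n";   # prints _o_o_o_
```

A caller would open the dictionary file named on the command line, pass the filehandle to pick_word, and call mask_word after every guess.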

The consensus on the discussion list seems to be that minor alterations and improvements to the user interface are OK, and that the expert quiz will be to write a program that efficiently implements the other side of the interface: to play the game of Hangman against the server.

And now, onto the discussion of last week's quiz...

Last Week's Regular Quiz

Geoffrey Rommel sent the following question:

The usual way to look for a character string in files in Unix is to use grep. For instance, let's say you want to search for the word 'summary' without regard to case in all files in a certain directory. You might say:

grep -i summary *

But if there is a very large number of files in your directory, you will get something like this:

ksh: /usr/bin/grep: arg list too long

Now, you could just issue multiple commands, like this:

grep -i summary [A-B]*
grep -i summary [C-E]*
etc.

... but that's so tedious.

Write a Perl program that allows you to search all files in such a directory with one command.

And Geoffrey's solution:

This quiz was suggested to me by a directory on one of my servers where all of our executable scripts are stored. This directory now has over 4,200 scripts and has gotten too big to search.

The solution shown here works for my purposes, but I do not wish to depreciate the ingenious solutions found on the discussion list. I will try to evaluate and discuss them in a separate message.

As MJD mentioned, Perl regex matching is clearly superior to the alternatives. Since the original purpose was to search a directory of scripts, the search is not case-sensitive; that option could be added easily enough. We search only files (-f) in the specified directory, not in lower directories. I also test for "text" files (-T) because my Telnet client gets hopelessly confused if you start displaying non-ASCII characters.


#!/usr/bin/perl
# The bin directory is too large to search all at once, so this does
# it in pieces.
($PAT, $DIR) = @ARGV[0,1];
$DIR ||= "";
die "Syntax:  q16 pattern directory\n" unless $PAT;

open(LS, "ls -1 $DIR |") or die "Could not ls: $!";

@list = ();
while (<LS>) {
   chomp;
   push @list , (($DIR eq "") ? $_ : "$DIR/$_");
   if (@list >= 800) {
      greptext($PAT, @list);
      @list = ();
   }
}
greptext($PAT, @list);

close LS;
exit;

sub greptext {
 my ($pattern, @files) = @_;

 foreach $fname (@files) {
    next unless -f $fname && -T _;
    open(FI, '<', $fname) or next;   # three-argument open reads the name literally; skip unreadable files
    while (<FI>) {
       chomp;
       print "$fname [$.]: $_\n" if m/$pattern/oi;
    }
    close FI;
 }
}
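
For comparison, here is a hedged variant of the same idea that reads the directory with opendir and readdir instead of piping from ls. It is a sketch, not the submitted solution: grep_dir is a hypothetical name, and it collects matches in a list rather than printing as it goes. Because no file names ever pass through a shell, there is no argument list to overflow and no need to work in batches of 800.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original solution): returns
# "file [line]: text" strings for every matching line in the plain
# text files directly inside $dir.
sub grep_dir {
    my ($pattern, $dir) = @_;
    $dir = '.' unless defined $dir && length $dir;
    my $re = qr/$pattern/i;                 # compile once, case-insensitive
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @hits;
    while (defined(my $name = readdir $dh)) {
        my $path = "$dir/$name";
        next unless -f $path && -T _;       # text files only; skips pipes and subdirectories
        open(my $fh, '<', $path) or next;   # skip anything unreadable
        while (my $line = <$fh>) {
            chomp $line;
            push @hits, "$path [$.]: $line" if $line =~ $re;
        }
        close $fh;
    }
    closedir $dh;
    return @hits;
}

# Command-line use: pattern first, optional directory second.
if (@ARGV) {
    print "$_\n" for grep_dir(@ARGV[0, 1]);
}
```

The -f test is what keeps a named pipe in the directory from blocking the search, which is what test 1 in the table below checks.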

For what it's worth, here is my take on the solutions offered for the week's quiz. This quiz was a humbling reminder that we should specify our requirements exactly. Although I deliberately left some of the design open, I apparently did not make the requirements clear. Three of the submissions failed on the very task they were intended to solve--namely, running through all files in a large directory of files. Instead of specifying a directory name, you had to specify a list of file names on the command line, which of course resulted in the dreaded "arg list too long" message. As someone pointed out, this is really a shell limitation, but there it is. (For the record, I am running MP-RAS, an svr4 variant, with the Korn shell.)


           1   2   3   4   5   F    CPU                                         
brent      1   1   1   1   1   1  65.01                                         
dunham     1   1   1   0   2   2   2.92                                         
fish       1   1   1   0   2   2   3.32                                         
mjd        0   0   0   0   2   2     --                                         
rommel     1   1   1   1   2   0   2.66                                         
scott      1   0   1   1   2   2     --                                         
sims       0   0   0   0   2   2     --                                         
wett       0   0   0   0   2   0     --  

1. Did it work when there was a named pipe in the directory?                    
2. Did it work as desired on the large directory?                               
3. Did it work when only a directory name was specified?                        
4. Did it work (as ls does) when no directory name was specified?               
5. Aesthetics of output (error messages, line numbers, etc.)                    
F. Flexibility of program                                                       
CPU: to search large directory; in seconds as reported by timex                 

Tests 1-2 were requirements. Tests 3-4 were nice to have. Peter Scott's program worked nicely on a small directory but produced unexpected output on the large directory: it seemed to be executing some scripts rather than searching them, so I deemed it to have failed. Brent Royal-Gordon's script got a lower score for aesthetics because it produced a number of lines saying "UX:grep: ERROR: Cannot open --".

I was particularly impressed with the flexibility of many of the scripts; my own submission, although the fastest, was also the least flexible. I am quite in agreement with James Wetterau's defense of minimalism. A program like this would not even be necessary if the shell allowed us to pass 4,200 arguments. Since it doesn't, however, we must do some of the work ourselves.

Last Week's Expert Quiz

This was sent by Shlomi Fish:

You will write a Perl program that schedules the semester of courses at Haifa University. @courses is an array of course names, such as "Advanced Basket Weaving". @slots is an array of time slots at which courses can be scheduled, such as "Monday mornings" or "Tuesdays and Thursdays from 1:00 to 2:30". (Time slots are guaranteed not to overlap.)

You are also given a schedule that says when each course meets. $schedule[$n][$m] is true if course $n meets during time slot $m, and false if not.

Your job is to write a function, 'allocate_minimal_rooms', to allocate classrooms to courses. Each course must occupy the same room during every one of its time slots. Two courses cannot occupy the same room at the same time. Your function should produce a schedule that allocates as few rooms as possible.

The 'allocate_minimal_rooms' function will get three arguments:

  1. The number of courses
  2. The number of different time slots
  3. A reference to the @schedule array

It should return a reference to an array, say $room, which indicates the schedule. $room->[$n] will be the number of the room in which course $n will meet during all of its time slots. If courses $n and $m meet at the same time, then $room->[$n] must be different from $room->[$m], because the two courses cannot use the same room at the same time.

Well, this quiz generated several solutions from several people.

MJD sent a test suite, and I sent a test suite of my own.

Roger Burton West sent an exhaustive search solution; Ronald J. Kimball also sent an exhaustive search one, this time using string operations to represent the schedule array.

Christian Duhl identified the problem as NP-complete, transformed it into a graph coloring problem, and solved it in that form. I sent my own solution, which uses intermediate truth tables between courses and rooms.

Finally, Mark Jason Dominus posted his solution, which used recursion as well as string operations.

My solution is smarter than the brute-force method, but still recursive and may explode for certain schedules.

It works by assigning a room to a course, and then finding a course that requires a different room. It then assigns another room to this course. The algorithm maintains a truth table of which courses can be allocated to specific rooms. Once a room is allocated to a class, all of the classes that share time slots with this class are marked as being unable to use the room. If all the rooms that have been allocated so far are unusable by a certain class, then it is allocated a new room.

If the algorithm reaches a place where a room can be allocated to any of several classes, it recurses with each possibility.
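
To make the problem concrete, here is a minimal sketch of the graph coloring view mentioned above. It is not any submitter's code and not the truth-table algorithm just described; it is plain backtracking, which takes exponential time in the worst case. The helper names conflicts and place are hypothetical, and rooms are numbered from 0 as an assumption.

```perl
use strict;
use warnings;

# Two courses conflict when they share at least one time slot.
sub conflicts {
    my ($ncourses, $nslots, $schedule) = @_;
    my @conflict;
    for my $i (0 .. $ncourses - 1) {
        for my $j (0 .. $ncourses - 1) {
            next if $i == $j;
            $conflict[$i][$j] = 1
                if grep { $schedule->[$i][$_] && $schedule->[$j][$_] } 0 .. $nslots - 1;
        }
    }
    return \@conflict;
}

# Try to place course $n and every later course using at most $max rooms.
# $used counts the rooms opened so far (rooms 0 .. $used - 1).
sub place {
    my ($n, $used, $max, $ncourses, $conflict, $room) = @_;
    return 1 if $n == $ncourses;
    # Reuse any open room, or open exactly one new room if the limit allows.
    my $limit = $used < $max ? $used : $max - 1;
    ROOM: for my $r (0 .. $limit) {
        for my $m (0 .. $n - 1) {
            next ROOM if $conflict->[$n][$m] && $room->[$m] == $r;
        }
        $room->[$n] = $r;
        my $now_used = $r == $used ? $used + 1 : $used;
        return 1 if place($n + 1, $now_used, $max, $ncourses, $conflict, $room);
    }
    return 0;
}

# Try 1 room, then 2, and so on; the first limit that works is minimal.
sub allocate_minimal_rooms {
    my ($ncourses, $nslots, $schedule) = @_;
    my $conflict = conflicts($ncourses, $nslots, $schedule);
    for my $max (1 .. $ncourses) {
        my @room;
        return \@room if place(0, 0, $max, $ncourses, $conflict, \@room);
    }
    return [];    # no courses at all
}

my @schedule = (
    [1, 0],    # course 0 meets in slot 0
    [1, 1],    # course 1 meets in both slots
    [0, 1],    # course 2 meets in slot 1
);
my $room = allocate_minimal_rooms(3, 2, \@schedule);
print "@$room\n";    # prints 0 1 0
```

Courses 0 and 2 never meet at the same time, so they can share a room, while course 1 clashes with both and needs its own.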

This Week on Perl 6, Week Ending 2004-05-23

Yes. I know. This week's summary is a week late. So it's a summary of the last two weeks. So let's get straight to perl6-internals shall we?

Working on the Perl 6 Compiler

Abhijit A. Mahabal posted his first ever patch, which updated the calling conventions used in the Perl 6 Compiler. Dan applied it promptly, and Allison Randal chimed in with some comments on the workings of the perl 6 compiler. There's a new document, which you'll find at languages/perl6/doc/developer_notes.pod in a CVS parrot, which is a scratchpad for developers to make notes of any issues that arise while they're working on the compiler.

http://groups.google.com/groups?selm=Pine.LNX.4.58.0405051446080.25125@jughead.cs.indiana.edu

Pants performing ponies

The issues that cropped up in the last summary to do with poor ponie performance continued to exercise Leo Tötsch, Nicholas Clark and Jeff Clites as they homed in on the Ponie bug that had led to ludicrous numbers of PMCs being generated to no good end. It turns out that Parrot's garbage collection doesn't go that fast when there's 9 million live PMCs. Leo worked to improve things...

http://groups.google.com/groups?selm=200405080801.i4881uR13546@thu8.leo.home

MMD side effects

Leo pointed out some strangeness related to the new multiply dispatched opcodes and proposed a solution. Warnock applies.

http://groups.google.com/groups?selm=409F386C.6040700@toetsch.at

The Parrot Question Repository

Dan announced that he was trying (slowly) to get things documented and questions asked. Because questions to the list tend to get Warnocked or asked repeatedly, he's set up a new central question repository on the Parrot Wiki. He appealed for a volunteer to watch the list and add new questions to the Wiki page and (a possibly different volunteer) to watch the Wiki and post any answers back to the list. If such a volunteer should step up to the plate, I would be inordinately grateful if they could make a habit of commenting on any questions they transplant to the Wiki to let people know that this has been done.

http://groups.google.com/groups?selm=a06100507bcc5484f1123@[10.0.1.3]

http://www.vendian.org/parrot/wiki/bin/view.cgi/Main/HowDoIDo -- Central Questions Point

IO Opinions

Dan asked for opinions on how the IO system was going to work when it came to dealing with record based files. Tim Bunce and Jeff Clites had suggestions and questions.

http://groups.google.com/groups?selm=20040510093459.GA32443@dansat.data-plan.com

The "Right Way Repository"

Dan announced that he'd started an FAQ aimed at compiler writers targeting parrot in docs/compiler_faq.pod in your local CVS parrot. It's a bit sparse, but the proverb about mighty oaks and acorns applies. We hope. Brent Royal-Gordon was first in with a question and answer.

http://groups.google.com/groups?selm=a0610050dbcc55af97118@[10.0.1.3]

Towards a multiply dispatched parrot

Leo continued his work on switching over to a multi dispatched vtable architecture and asked for comments on the work so far, and the sanity of the whole concept. Dan was very positive, and the work continues.

http://groups.google.com/groups?selm=409F693A.30608@toetsch.at

Cygwin

Cygwin continues to be a fully paid up member of the awkward squad of environments. Joshua Gatcomb pointed out that JIT had broken again. He and Leo engaged in debugging by email as Leo worked to fix things.

http://groups.google.com/groups?selm=20040510142851.25802.qmail@web60806.mail.yahoo.com

Perl 6 Parser weirdness

Abhijit A. Mahabal had some problems with the Perl 6 compiler's parser and asked for help. It looks like it's an issue with the load path using relative paths, which makes it fragile when you're running in an unexpected directory.

The discussion evolved into one on the nature of the (eventual) Perl 6 parser. Dan worried that a recursive descent parser will be awfully slow compared to the current perl 5 parser. It looks like Perl 6 won't have a recursive descent parser, but the conversation got rather beyond my level of computer science (at least where parsers are concerned). Larry said that he was in favour of a hybrid parser which used both top down and bottom up parsing techniques.

Dan suggested that since Larry has been "itching to get into the internals anyway, [the parser]'d be a good place to start."

http://groups.google.com/groups?selm=Pine.LNX.4.58.0405082236060.15595@jughead.cs.indiana.edu

Event design sketch

Discussion of Dan's event design sketch continued. Gordon Henriksen was worried that the events system as sketched wouldn't work well with GUI type events. Rocco Caputo pointed out issues with Daylight Saving for timed events. Leo asked for a glossary.

And those are just the items that caught my eye as I went into speed skim mode.

http://groups.google.com/groups?selm=a06100519bcc6a4021201@[10.0.1.3]

PARROT_API, compiler and linker flags

Ron Blaschke has been working on getting Parrot's linker magic right; and he presented his findings and warned us that he intended to go ahead and implement something based on them unless there were substantive objections. There weren't any objections as such, but there were an awful lot of issues raised (in case you didn't already know, cross platform compatibility is Hard).

http://groups.google.com/groups?selm=yeamc5vwjcy4.bx90p9hnpzb$.dlg@40tude.net

Events and names

In response to the discussion of his initial sketch, Dan said that he thought that 'event' might be the wrong word for what he was working on. 'Event' fits the bill in some places, but not in others. He asked for suggestions in case someone had a better name. Tim Bunce suggested 'hap', which met with some approval. James Mastros suggested 'page'. Andy Wardley reckoned 'Parcel' might fit the bill. I don't think any of the suggestions got massive approval.

http://groups.google.com/groups?selm=a06100504bcc7f9cafcc4@[10.0.1.3]

New version of parrotbug

Jerome Quelin announced a new version of parrotbug for reporting bugs in parrot.

http://groups.google.com/groups?selm=200405171926.41993.jquelin@mongueurs.net

MMD and objects

Announcing that the basics of MMD are almost done, Dan moved to address issues with objects and inherited methods. He outlined what needed to be done and some of the choices that need to be made.

http://groups.google.com/groups?selm=a06100500bcce66bf8fcd@[10.0.1.3]

Bits and Pieces

Dan announced a few decisions that he'd made about various stuff.

http://groups.google.com/groups?selm=a06100504bccee48acc22@[10.0.1.3]

Non-flow-control logical tests

It turns out that Dan's work project would be a lot easier if Parrot had logical test ops that simply returned 0 or 1 rather than jumping off somewhere. He wondered if there was a real case for actually implementing such things. The general response was favourable so he checked in a new set of isgt, iseq, etc ops in experimental.ops.

http://groups.google.com/groups?selm=a06100500bcd27a671fbd@[172.24.18.98]

Embedding Parrot / mod_parrot

Paul Querna is looking at writing mod_parrot for apache 2.0. He had some questions about how to go about embedding parrot.

http://groups.google.com/groups?selm=1085339196.6099.53.camel@localhost

Meanwhile, in perl6-language

C style conditional statements

Pedro Larroy wondered if there was any chance that, in Perl 6, it would be possible to write conditional statements that looked like C-style if statements.

    if (condition) 
      statement;

The thread drew quite a lot of traffic, but the answer is that it's not going to happen because it's incompatible with the new Perl 6 way of not requiring parentheses around the condition. For some reason, the thread evolved into the "Perl 6 shouldn't be called Perl!" thread. But set against that, it did elicit the following from Larry:

Indentation is a wonderful form of commentary from programmer to programmer, but its symbology is largely wasted on the computer. We don't tell poets how to format their poetry.

Which certainly made me smile.

http://groups.google.com/groups?selm=20040512000042.GC18109@larroy.com

Yadda yadda yadda some more

Aaron Sherman is a big fan of the ... operator. Lots of folks seemed to like it too, and the thread became something of a love-in. Let's hear it for yadda yadda yadda.

http://groups.google.com/groups?selm=1084397404.13150.97.camel@pps

RFC eq and ==

It was Pedro Larroy's turn to ask if it'd be a good idea to make == and friends polymorphic in order to get rid of eq etc. It was chromatic's turn to say that no, it's not a good idea, and haven't we done this one countless times before? It was your summarizer's turn to have had a foul couple of days before he finally knuckled down to writing the summary.
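
For anyone who hasn't sat through the earlier rounds, a tiny Perl 5 illustration of why the two operator families exist may help. This is only a sketch with made-up values: because a Perl scalar can be read as either a number or a string, the operator is what says which comparison is meant.

```perl
use strict;
use warnings;

my ($x, $y) = ('1.0', '1');

# == converts both operands to numbers, so this compares 1 with 1.
print "numerically equal\n" if $x == $y;

# eq and ne compare the strings as written, so "1.0" differs from "1".
print "stringwise different\n" if $x ne $y;
```

Both lines are printed, which is the ambiguity a single polymorphic == would have to resolve somehow.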

http://groups.google.com/groups?selm=20040517203509.GA18989@larroy.com

Idiom for filling a counting hash

Stéphane Payrard wondered whether Perl 6 might have a neater way of populating a counting hash than

    $a{$_}++ for @a;

He proposed some alternatives which (to my biased eyes at least) looked rather more opaque than the straightforward Perl 5 idiom. John Williams suggested that %a{@a}»++ might fit the bill, but it seems that ++«%a{@a} is more likely to be the right thing.

I think this may be one of those Shibboleth things. If you like Perl 6, then ++«%a{@a} is the kind of thing that could well make you like it more. If you don't like Perl 6, it's the kind of thing that makes you double the length of the pole with which you will not touch the language.
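
For reference, here is the straightforward Perl 5 idiom from the start of this item in runnable form. The word list is made up for illustration.

```perl
use strict;
use warnings;

my @words = qw(parrot ponie parrot perl parrot ponie);

my %count;
$count{$_}++ for @words;    # a missing key starts out undefined and becomes 1

print "$_: $count{$_}\n" for sort keys %count;
# prints:
# parrot: 3
# perl: 1
# ponie: 2
```

No initialization is needed because ++ on an undefined value counts from zero without a warning.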

http://groups.google.com/groups?selm=20040518211430.GD27986@stefp.dyndns.org

Simon Cozens has RSI

Simon Cozens' auto responder informed us all that he's suffering from RSI. I'm sure I'm not alone in wishing him a speedy recovery (or at least, that he finds some way of alleviating his problem).

Announcements, Apologies, Acknowledgements

Many apologies for the staggering lateness of this week's summary (at least a week late). Hopefully I'll be back on track next week.

If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send me feedback at mailto:pdcawley@bofh.org.uk

http://donate.perl-foundation.org/ -- The Perl Foundation

http://dev.perl.org/perl6/ -- Perl 6 Development site

An Interview with Allison Randal

This week, perl.com has the pleasure of interviewing Allison Randal, one of the key figures in the Perl community. Allison has been active in the Perl 6 design process since its inception, and is the President of the Perl Foundation. Let's hear more from Allison about what all of this means to her.

perl.com: To begin with, please tell us a little about yourself.

Allison Randal: I'm a Perl programmer from Portland, Oregon. I'm the president of the Perl Foundation, project manager for Perl 6 (as well as a code and design contributor), and run a small consulting company. I lead a rather hermit-like life. In the rare moments when I'm not working on some project or another, you're far more likely to find me reading or puttering in the garden than anything else.

pc: For those who don't know, what is the Perl Foundation and what does it do?

AR: The Perl Foundation is a supporting structure for Perl development and the legal entity behind Perl. It is the copyright holder for Perl 6 and Parrot, and carries the legal responsibility for Larry Wall's copyright in Perl 5.

TPF is the central organization for perl.org, Perl Mongers, and Perl Monks. Through perl.org we support the mailing lists, bug tracking, source control, and web sites critical to Perl development. Perl.org has had several significant upgrades over the past year thanks to generous donations of hardware.

Through Perl Mongers and Perl Monks, we support grassroots involvement in Perl. Perl has always been community-centered, and the face-to-face and online interaction of these two groups play a big part in keeping the community strong and active. TPF also gives the yearly White Camel awards, recognizing significant contributions from individuals in the Perl community.

TPF directly funds Perl development through our quarterly grants. These grants range in size from $500 or less, for a particular project, up to full funding for a year of Perl development work.

pc: And what do you do, as President of TPF, from day to day?

AR: I spend 10 to 20 hours a week making sure the plates keep spinning. Sometimes this is as simple as sending out reminder messages. More often, it means pitching in on whatever odd task needs to be done. I've been known to: write press releases, work on the web site, set up and man TPF booths at conferences, edit grant proposals, organize votes, coordinate with donors, talk to lawyers, talk to artists, and work with companies on licensing arrangements, among other things.

pc: That sounds like a lot! Aren't there people who can help out?

AR: More and more, the responsibilities that have traditionally belonged to the president are moving to other shoulders. As TPF continues to grow, we need a coordinated team to handle the load. In addition to the familiar faces, we've added several great new volunteers this year: Larry Hixon on donor relations, Gavin Estey on public relations, and Baden Hughes as secretary of the grants committee. The complete list is up on the "Who's Who" page at perlfoundation.org.

pc: You are on other committees of the Perl Foundation, such as the steering committee; what's the day-to-day role of that?

AR: The steering committee is responsible for the daily operations of the foundation. Except for some legal and financial decisions, the steering committee has the same powers as the board of directors.

pc: There's also a YAPC committee. What's TPF's role in the many YAPCs that seem to be springing up around the world?

AR: TPF directly runs the North American YAPC. The rest of the worldwide YAPCs are largely independent. We offer assistance and seed funding where we can, but so far, the conferences have all donated back more funds than we've given.

pc: We used to talk about "Yet Another Society," but now much of the community impetus comes from "The Perl Foundation." What's the relationship between these two groups?

AR: "The Perl Foundation" and "Yet Another Society" are really just two names for the same organization. These days we mainly use the name "The Perl Foundation," because it does a better job of explaining what we do.

pc: Tell us a bit more about the grants process -- who can apply, how grantees are chosen, what sort of projects TPF is looking to support, and so on.

AR: Anyone can submit a proposal. The details are up on the "How to Write a Proposal" page at perlfoundation.org. Proposals are reviewed by the grants committee every quarter. We tend to focus on core Perl development, but other Perl projects are also welcome.

pc: Where do TPF's resources come from?

AR: TPF has a small number of big corporate sponsors, a larger number of medium-sized corporations, and an even larger number of individual sponsors, so the resources we get from each group are about equal. It's easy for individuals to think "I'm just one person, I can't make any difference," but if every Perl user around the world donated $25, we could fund Larry full-time for several years. Really, even the corporate donations ultimately start with individuals, because someone took the time to think "Hey, our company has gained so much from Perl, let's give something back."

pc: Turning to Perl 6, what's TPF's role in supporting the Perl 6 project?

AR: We try to keep a good balance between Perl 5 and Perl 6 projects. The most significant way we support Perl 6 is by funding Larry Wall to work on Perl 6 design. We were able to fund him for a full year in 2002, half of the year in 2003, and hope to fund him for a full year again in 2004. We also funded Damian Conway in 2001 and 2002, and Dan Sugalski for half of the year in 2002.

The Ponie project (Perl 5 on Parrot) is part of the larger Perl 6 project. It runs on developer time donated to the Perl Foundation by Fotango.

pc: What does it mean to be the Perl 6 project manager?

AR: Project management is all about making sure people have what they need to keep moving forward. It's not about giving orders, it's about listening for where people need help. It's about encouraging people to do the best job they can with the available resources, without stressing about what they can't do.

We have a great team: intelligent, motivated, and profoundly funny. You might expect ego wars in a team this brilliant, but they simply don't exist. It's a joy to work with this group.

pc: What goes on inside the Perl 6 design process -- between the discussions on perl6-language and an Apocalypse or Exegesis being produced?

AR: The design process is a constant stream of ideas bouncing around, some accepted, some rejected, some reshaped. It reminds me of grad school. You don't just forget the ideas once you leave the classroom -- you eat, drink, and breathe ideas every moment. Aside from messages on and off the lists, the design team meets in a conference call every week, and face-to-face whenever we can. The next design meetings will be in late July.

pc: How is the Perl 6 process running, from your point of view?

AR: It's quite healthy. The key to completing a marathon is simple: just keep moving. It's the same in software, especially in a project this size. Move too fast and you burn out your key players; move too slow and you lose momentum. We've set a nice maintainable pace and just keep rolling.

Luke Palmer is working on a first draft of Synopsis 12 now. Damian will start work on Exegesis 12 during his summer tour to the U.S. Larry is taking a bit of a breather and pondering which Apocalypse to work on next. His next step may be to work on Synopses as previews for the remaining Apocalypses to speed implementation work. Parrot has just recently taken a huge leap forward in objects and Unicode support. Dan currently has his eye on Events.

pc: Finally, when do you think that a complete Perl 6 beta will be available?

AR: That's a tough question to answer, because there are so many factors that could delay or speed up the process. With the state of Parrot and the design work completed so far, though, I'd say there's a good chance we'll see one within the next two years.

This Week on Perl 6, Week Ending 2004-05-09

Ooh look. Stuff's been happening in perl6-internals again. Will wonders never cease?

Building NCI by default

Bernhard Schmalhofer posted a patch to turn on building libnci.so by default so that the tests in t/pmc/nci.t would get run on more builds. Leo Tötsch applied it and sat back to see what broke. He, Nicholas Clark and Andrew Dougherty went on to have a discussion about when dynamic loading may not be available (Did you know that Crays don't do dynamic loading?) or desirable.

Various OSes broke, and appropriate patches were created and applied.

http://groups.google.com/groups?selm=rt-3.0.9-29257-86491.15.10015674031@perl.org

MMD and bytecode implementations

There was some debate on how register saving should be handled when doing multi method dispatch to operators that end up with an implementation in bytecode. I was involved in this discussion and I'm still busy arguing my corner. So, rather than abuse this particular soapbox with a partial (as in 'not impartial') summary, I'm just going to point you at the root message of the thread.

http://groups.google.com/groups?selm=a06100502bcb7f7775cfc@[10.0.1.3]

Embedding and the stack

Dan ruled on PMCs that live outside parrot, and on calling into parrot from the outside. Brent Royal-Gordon implemented a 'proof of concept' that got applied pretty quickly. Nicholas Clark wasn't entirely convinced that everything in the garden was lovely, but we're currently in 'suck it and see' mode.

http://groups.google.com/groups?selm=a06100501bcbc0a646eeb@[10.0.1.3]

The ongoing saga of cygwin

Cygwin had all sorts of problems this week, with ICU, with NCI and (no surprise really) garbage collection. Joshua Gatcomb acted as Leo's eyes and hands during an extended bug hunt which ended with all tests passing and parrot running pleasingly quickly on the cygwin platform.

http://groups.google.com/groups?selm=a06100503bcbc137c9093@[10.0.1.3]

http://groups.google.com/groups?selm=3hdzmvl8344r$.1hkuxsnjge3es.dlg@40tude.net

http://groups.google.com/groups?selm=20040504135510.85073.qmail@web60801.mail.yahoo.com

http://groups.google.com/groups?selm=20040506162006.90350.qmail@web60806.mail.yahoo.com

http://groups.google.com/groups?selm=20040507160212.6232.qmail@web60802.mail.yahoo.com

http://groups.google.com/groups?selm=20040507161245.99916.qmail@web60810.mail.yahoo.com

NCI Nested struct access broken

Leo and chromatic tracked down a bug with accessing nested structs when using the Native Call Interface.

http://groups.google.com/groups?selm=rt-3.0.9-29333-86684.19.0522834728947@perl.org

Async IO notes

Dan dropped a few notes on how Parrot's asynchronous IO was going to work.

http://groups.google.com/groups?selm=a06100502bcbd6e4574bc@[172.24.18.98]

The linker TODO

In another of his TODO tasks for the interested, Dan asked for someone to tidy up the parrot link stages so that the various parrot libraries were rather less promiscuous about exporting symbols. The problem's not solved yet, but it's being worked on.

http://groups.google.com/groups?selm=a06100500bcbd3cf3e5d8@[10.0.1.3]

Towards a final events design

Dan's working on the design of Parrot's event and IO system and he had a question about dealing with external events. Discussion ensued. Some of it related to Dan's question.

http://groups.google.com/groups?selm=a06100501bcbd45b4f30c@[10.0.1.3]

Return Continuations

Dan was about to implement the previously discussed return continuation register, but Leo argued that it should be done gradually and, I think, carried the day.

http://groups.google.com/groups?selm=a06100502bcbec5bb0498@[172.24.18.98]

NCI sub call syntax in PIR

Leo's rejigged IMCC so that making an NCI call is a little easier.

http://groups.google.com/groups?selm=4099E558.7090302@toetsch.at

Documentation on writing PMCs

Nicholas Clark asked after documentation on how to write a PMC. It turns out there isn't any in the distribution, but Mike Scott has some useful stuff on his Wiki. It seems that Real Men use the source. Hopefully this will change eventually (soon?).

http://groups.google.com/groups?selm=20040507160319.GF7052@plum.flirble.org

http://www.vendian.org/parrot/wiki/bin/view.cgi/Main/ParrotDiagramsPMC

Ponie's performing pants

Nicholas Clark reported that, now that Ponie is using PMCs to handle Perl scalars, its performance is 'pants'. It turned out to be a problem with Garbage Collection. The initial fix gave a massive performance increase, but there are still issues to address.

http://groups.google.com/groups?selm=20040507161600.GM7078@plum.flirble.org

Meanwhile, in perl6-language

Returning from rules

Larry ruled on Luke Palmer's rule return proposal. Essentially the proposal as it stood was rejected in favour of a possible new succeed keyword/macro.

http://groups.google.com/groups?selm=20040419070629.GA32185%luke@luqui.org

Required named parameters never go away do they?

Dov Wasserman chimed in on the subject of required named parameters. Larry was unconvinced by his proposal.

http://groups.google.com/groups?selm=20040504085308.43740.qmail@onion.perl.org

Specifying class interfaces with AUTOMETH

Austin Hastings had some questions about AUTOMETH and AUTOMETHDEF. It appears from Larry's answers that Austin had the wrong end of that particular stick.

http://groups.google.com/groups?selm=ICELKKFHGNOHCNCCCBKFEEEGCMAA.Austin_Hastings@Yahoo.com

Named parameters vs. slurpy hash syntax

Dov Wasserman worries that named parameters and slurpy hash arguments don't get on very well. Others disagreed.

http://groups.google.com/groups?selm=20040506023628.60840.qmail@onion.perl.org

is rw trait's effect on signature

Aaron Sherman pointed to an issue with the is rw trait and wondered if he was making a fool of himself. For some reason the responses turned into a discussion of the difference between 'isa' and 'does' relationships. In response to Aaron's question, Dan implied that Aaron was expecting rather too much to happen at compile time.

http://groups.google.com/groups?selm=1083865176.4985.88.camel@pps

Apologies, Announcements, Acknowledgements

If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send me feedback at mailto:pdcawley@bofh.org.uk

http://donate.perl-foundation.org/ -- The Perl Foundation

http://dev.perl.org/perl6/ -- Perl 6 Development site

Affrus: An OS X Perl IDE

When I last reviewed a Perl IDE, ActiveState's Komodo, I was nearly convinced; the only problem was that I use Mac OS X. Now, Late Night Software, more commonly known for their AppleScript tools, have taken their Mac programming experience and applied it to create Affrus, a Perl IDE for the Mac. And I'm a little closer to being convinced.

Affrus differs from Komodo in some substantial ways. Where Komodo couples its editor tightly with a Perl interpreter to allow background syntax checking and on-the-fly warnings highlighting, Affrus takes a more traditional, detached approach: syntax checks are performed on demand, with errors and warnings placed in a separate panel. Fans of emacs's debugging modes will be happier to see this:

It took me quite a while to discover the control-click contextual menu -- since the downside of "intuitive" applications is that we don't get a nice manual to read any more -- but when I did I was amazingly impressed. Right-clicking on a package name does just what you want; it allows you to edit that package's file, or to view its documentation with perldoc. Similarly, right-clicking on a built-in offers to bring up perldoc for that function.

Right-clicking on the name of a subroutine lets you navigate to the definition of that routine -- even doing a remarkably good job at working out what class a method will come from. And right-clicking on a variable name takes you to where that variable was declared. Right-clicking on the navigation bar at the bottom of the window brings up a "table of contents" for the program, allowing you to navigate to any of the modules it uses and any of the subroutines it defines. If you right-click on empty space, however, you get a listing of variables and subroutine names that can be inserted at that location. Full marks for this, and the more time I spend with the Affrus editor, the more neat things like this I find.

On the whole, though, the Affrus editor is relatively basic. While its syntax highlighting is more sophisticated than most, distinguishing between package, lexical, and special variables, for instance, it does not handle code folding, nor does it have "smart" auto-indenting. It's quite comparable to the original emacs perl-mode. However, this isn't necessarily a problem, due to Affrus' integration with other editors such as BBEdit and TextWrangler; additionally, Late Night's AppleScript experience has enabled them to design Affrus to be extensible and scriptable. Script plugins provided with Affrus allow it to reformat code with perltidy, as well as to insert control structures and other snippets into the current file.

As well as its scriptability, the real boon in Affrus is in its debugging console; on top of the usual debugging actions of stepping over a script, jumping in and out of subroutines, setting breakpoints, and so on, it presents at every step a detailed listing of all the variables in play, allowing you to look inside complex data structures with OS X's familiar disclosure triangles:

As one would expect, it automatically loads up modules and other external Perl code while debugging, allowing you to step over their code, too. You can also change the value of variables during debugging, as well as enter arbitrary Perl expressions in the "Expressions" pane.

Affrus offers a few other interesting little features, such as the debugger's ability to detect and highlight circular references, and the bundled command line tool. This utility enables you to debug a Perl program in Affrus while having complete control over the environment and standard IO redirections -- a major bridge between GUI-based debugging and the "real world" of complex Perl program deployments.

There are a few things Affrus doesn't do which I'd like, but to be honest they're a part of the way I use Perl -- an IDE with integrated debugger and Perl-aware editor is a great environment for creating standalone Perl scripts where you're running through a process, breaking at significant moments, inspecting the control flow and the state of the variables. However, when you work primarily at the level of Perl modules and Apache handlers, there is no real top-level "process" to step through, and a traditional debugging environment becomes much less useful.

That said, in such a debugging environment, I'd love to see Affrus have a Perl debugger pane at which one could execute Perl code during a debugger run; inspecting the variables is great, but there ought to be a way to change them, too! There are other changes I'd like to see in the future, ranging from something as trivial as a color scheme palette -- the first thing I did on running Affrus was to spend 10 minutes configuring it with a set of colors that look nice on a black background instead of a white one! -- to full-blown integration with either CVS or even Apple's Xcode IDE.

On the whole, however, I'm very impressed by Affrus, and I'm convinced that, even if it's not to your taste yet, it is certain to grow into a mature and powerful Perl IDE for OS X.

This Week on Perl 6, Week Ending 2004-05-02

So, May Day didn't quite knock me for six this year (but being up at 4am on Newcastle Town Moor on Saturday morning to welcome in the summer with a bunch of rapper dancers is guaranteed to leave a chap feeling rather tired) so here's the summary.

We'll kick off with perl6-internals. Again.

GCC for Parrot

Vishal Vatsa announced that he was working on getting GCC to compile to Parrot as his masters research project. I'm sure we're all wishing him the best of luck.

http://groups.google.com/groups?selm=200404291641.55582.vishal.vatsa@may.ie

Parrot strings

Various discussions of the Right Way to do string handling continued. Dan clarified his Strings design docs and Jeff Clites posted a 'Strings Manifesto'.

It seems, to this observer at least, that a big problem with dealing with strings is that there's rather more than one Right Way to do things, depending on how you're looking at the problem and what you're trying to do. Dan's problem is that he has to find some way of accommodating all the various contradictory things that people want to do to strings without making any of them unduly painful. (Not assisted by the fact that, in one of last week's documents, he managed to get graphemes, characters and glyphs slightly mixed up.)

Me? I'm punting on summarizing this properly. My theory is that if you know enough about this stuff to be of assistance then anything I say will almost certainly be an annoying simplification too far. And if you both know enough about strings and care enough about Parrot, you're already involved in the thread. Hopefully there'll come a point in the future when everything's settled and this section of the summary will say "Strings got sorted at last, look at this URL for details".

http://groups.google.com/groups?selm=1082943293.6126.31.camel@wakko

http://groups.google.com/groups?selm=8DE3A940-9869-11D8-88D8-000393A6B9DA@mac.com

http://groups.google.com/groups?selm=488F0B8B-9906-11D8-88D8-000393A6B9DA@mac.com -- Strings Manifesto

http://groups.google.com/groups?selm=FE6307104985D611A92A0002A5F3AB00B9D90F@exch1.Sterling.COM

http://groups.google.com/groups?selm=675F9A7A-9ABC-11D8-88D8-000393A6B9DA@mac.com

Win32 and cygwin issues

There's been a good deal of work this week on getting things building properly in the Win32 and cygwin environments. Ron Blaschke, 'Limbic Region', George R and Leo Tötsch all worked on things.

http://groups.google.com/groups?selm=1w763i4oy4k22.1q6kzorxl9k43$.dlg@40tude.net

http://groups.google.com/groups?selm=20040426124840.81742.qmail@onion.perl.org

NCI and 'Out' parameters

Mr ParrotSDL (aka chromatic) showed an SDL function signature that he wanted to use via NCI (Parrot's Native Call Interface), but which didn't work because it had two pointer arguments that were used by the function as out parameters. He proposed changing the NCI PDD (PDD16) to support this sort of thing cleanly. Leo, Dan, Tim Bunce and chromatic discussed the changes needed.

A little later in the week, chromatic posted a patch implementing something to solve the problem.

http://groups.google.com/groups?selm=rt-3.0.8-29200-86258.1.8350349802563@perl.org

http://groups.google.com/groups?selm=rt-3.0.9-29261-86499.1.95148647858417@perl.org

Joys of assignment

There's been some discussion recently about what happens with assignment and/or binding, and the difference between value and reference types. Dan decided that the time was right to post a clarification of the issues as they related to Parrot.

http://groups.google.com/groups?selm=a06100503bcb2b791bb31@[10.0.1.2]

Fun with md5sum

Nick Glencross's idea of fun is, apparently, implementing md5sum in IMCC. He posted his first cut at an implementation. I think Leo's response holds for all of us: "Wow". Maybe I'm looking in all the wrong places, but so far it doesn't appear to be in the repository. It can only be a matter of time.

http://groups.google.com/groups?selm=408E7518.3040204@glencros.demon.co.uk

Return continuation register

Dan's working through outstanding issues and tidying up loose ends. One result of this is that he's been thinking about how the return continuation is handled. The new design (and calling conventions?) makes use of a return register in the interpreter structure, which gets saved as part of the environment that a continuation captures.

http://groups.google.com/groups?selm=a06100504bcb5741843e2@[10.0.1.3]

Library loading

Clearing up another loose end, Dan did some design of how we're going to cope with loading libraries.

http://groups.google.com/groups?selm=a06100506bcb5783a3bf6@[10.0.1.3]

Keyed vtables and MMD

If you've been following the mailing list for any length of time (especially if you've been following it directly rather than summarized), you'll be aware that there's been a long running... discussion between Leo and Dan about the keyed variants of all the binary vtable entries: Leo worries that there's a heck of a lot of 'em; Dan worries that it means creating temporary PMCs in what should be simple circumstances.

It turns out that having *all* the keyed variants does rather get in the way of multi-method dispatch. So Dan's mandated that we get rid of the keyed variants for everything but get and set and move all the operator functions out of the vtable and into the multi-method dispatch system. He asked for comments...

It turns out that moving to MMD everywhere doesn't seem to have any deleterious effect on performance, which is nice. In fact, in some cases it's faster.

http://groups.google.com/groups?selm=a06100507bcb57c1f2582@[10.0.1.3]

File stat info

Tying up another loose end, Dan specced out how stat is going to work. It was modified (slimmed down rather a lot) after discussion.

http://groups.google.com/groups?selm=a06100508bcb580d740bf@[10.0.1.3]

One more thing

Dan noted that, by going MMD all the way, it means we can skip the bytecode->C->bytecode transition for MMD functions that are written in Parrot bytecode and just dispatch to them as if they were any other sub.

http://groups.google.com/groups?selm=a06100514bcb59e5f2c78@[10.0.1.3]

MMD table setup semantics

Dan opened a discussion of what should happen when we add things to a Multimethod dispatch table. Discussion followed.

http://groups.google.com/groups?selm=a0610051fbcb5ccc30c12@[10.0.1.3]

Double-checking compiler function parameters

Dan thought aloud about what a compiler module should look like and asked for comments. Leo and Stefan Lidman came through with some.

http://groups.google.com/groups?selm=a06100500bcb5f513ba06@[172.24.18.98]

Pointer stores and DOD

Dan reckons the time has come to make sure that all stores of pointers to DOD-able structures into DOD-able places are done with a mediating function (or macro). The idea is that doing this will allow us to experiment with other Garbage Collection techniques without having to change vast amounts of code.

http://groups.google.com/groups?selm=a06100507bcb6b53a9126@[172.24.18.98]

TODO: Forth as a compiler module

Dan laid down a TODO challenge: Your mission, should you choose to accept it, is to turn languages/forth/forth.pasm into a loadable compiler module that you can compile workable forth code with. Go to it!

http://groups.google.com/groups?selm=a0610050abcb6c35b983c@[172.24.18.98]

Outstanding parrot issues

Roll up! Roll up! Come and see the Ponie trainer and the Patchmonster engage in a free and frank exchange of views!

Well, maybe not quite that extreme. Nicholas Clark, Arthur Bergman and Leo Tötsch have had a long standing disagreement about how Parrot's embedding works.

Because of the weekend falling when it did, you'll get the resolution next week.

http://groups.google.com/groups?selm=200404301154.i3UBswu13339@thu8.leo.home

Meanwhile, in perl6-language

A12 Versioning

Richard Proctor isn't happy with the A12 versioning proposal, and he said as much, outlining what he didn't like about it. Aldo Calpini had some further comments. Larry came through with answers.

http://groups.google.com/groups?selm=Marcel-1.53-0426142012-9eeRr9i@waveney.org

Compatibility with Perl 5

The discussion of how to identify Perl 6 and Perl 5 scripts continued. Larry suggested moving to using : as the command line switch in Perl 6, so a minimal Perl 6 marker would be something like:

    #!/usr/bin/perl :
    #!/usr/bin/perl ::
    #!/usr/bin/perl :6

Jonathan Scott Duff doesn't like using shifted characters for command-line switches, so he suggested using = instead. Other options were suggested...

http://groups.google.com/groups?selm=20040426174457.GA1417@wall.org

On inheriting wrappers

Aldo Calpini had some questions about using wrappers with methods, and whether wrappers got inherited. I must confess I didn't quite follow the discussion, but Larry seemed to.

http://groups.google.com/groups?selm=1083316495.3463.158.camel@localhost

Hey look everyone, p6stories!

Now that we have Apocalypses 1-6 and 12, that could be seen as having most of the language done. So chromatic pointed everyone at the P6 stories Wiki and suggested that they join the effort of breaking the apocalypses up into stories and (hopefully) test cases, to help make the job of writing Perl 6 itself much easier.

http://p6stories.kwiki.org/

How to parameterize roles

Austin Hastings wondered how/if he could implement a parameterized role. Luke Palmer had some answers, and some unsolicited (but rather useful) design advice. Aaron Sherman also had some design advice.

http://groups.google.com/groups?selm=ICELKKFHGNOHCNCCCBKFCECFCMAA.Austin_Hastings@Yahoo.com

Announcements, Acknowledgements, Apologies

If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send me feedback at mailto:pdcawley@bofh.org.uk

http://donate.perl-foundation.org/ -- The Perl Foundation

http://dev.perl.org/perl6/ -- Perl 6 Development site

Building Testing Libraries

Testing is an important step in developing any important body of work. In today's pragmatic culture, we're taught to test first, test often, and design with tests. The expectation is that chanting "test test test" forgives all sins. To a large extent, this is true. Testing helps us produce quality software at all scales.

The extreme code produced by this extreme lifestyle hides in the test suite itself. Often the ugliest code we write resides in files with a .t extension. Riddled with redundant, ghastly expressions, the test suite is the collateral damage on our road to beautiful production code.

Let's review some common testing pitfalls and how to avoid them. Many of these testing procedures may be new to you. Serious headway has been made in recent history with the testing libraries on the CPAN.

A Test File is Just a Program

Each test file is a program, just as important as any other program you'd write that uses software being tested. It must be treated with the same care. If you plan to use strict and warnings in a program related to the code you're testing, be sure to do the same in your tests.

Each test file should start with these three lines.

  #!/path/to/perl
  use strict;
  use warnings;

If you plan to run your software in a taint-checked environment, which is considered a good idea, then supply the -T command-line option to the #! line.

  #!/path/to/perl -T

Running under strict and warnings will catch many common mistakes in your test files, such as misspelled or undeclared variables. Taint mode will also require your software to work correctly in a restricted environment.

Be Compatible with Test::Harness

Test::Harness is a very useful Perl module for running test suites. If you are building a Perl module yourself, and using ExtUtils::MakeMaker or Module::Build for the build process, you're using Test::Harness. If you aren't using any of these mechanisms, do try to be compatible with it. This will help other users and developers of your software who are used to dealing with Test::Harness.

Compatibility comes in the form of the test file's output. Test::Harness will run your program and record its output to STDOUT. Anything sent to STDERR is ignored, silently passed on to the user. There are particulars about testing under the harness that should be observed. The basics are simple.

When a test passes, it outputs a line containing ok $N, where $N is the test number. When a test fails, the line contains not ok $N. Test numbers are optional but recommended. Tests may be named. Anything after the number, $N, is considered the test name, up to a hash (#). Anything following the hash is a comment.

Furthermore, you are encouraged to supply a header. The header tells Test::Harness how many tests you expect to run and should be the first thing you output. If you're unsure of the number of tests, the header may be the very last thing output. Its format is also simple: 1..$M, where $M is the total number of tests to run. The header helps the harness figure out how well your tests did.

Any other output should be commented on lines beginning with a hash (#). Here is an example of prototypical output understood by Test::Harness.

  1..4
  ok 1 - use Software::Module
  ok 2 - object isa Software::Module
  not ok 3 - $object->true() should return true
  #     Failed test (test.t)
  #          got: undef
  #     expected: 1
  ok 4 # skip Net::DNS required for this test
  # Looks like you failed 1 tests of 4.
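
To make the format concrete, here is a minimal sketch of how a program might tally output like this. It is not how Test::Harness itself is implemented; summarize_tap() is a hypothetical helper, and it handles only the line types described above (header, ok, not ok, skip comments, and # lines).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Test::Harness): tally a list of
# harness-style output lines.
sub summarize_tap {
    my @lines = @_;
    my %summary = (planned => undef, passed => 0, failed => 0, skipped => 0);

    for my $line (@lines) {
        if ($line =~ /^1\.\.(\d+)/) {
            # Header: 1..$M announces how many tests to expect.
            $summary{planned} = $1;
        }
        elsif ($line =~ /^(not )?ok\b([^#]*)(?:#\s*(.*))?$/) {
            # Test line: optional "not ", then "ok", then the number
            # and name, then an optional comment after the hash.
            my ($not, $comment) = ($1, $3);
            if    ($not)                                     { $summary{failed}++ }
            elsif (defined $comment && $comment =~ /^skip/i) { $summary{skipped}++ }
            else                                             { $summary{passed}++ }
        }
        # Lines beginning with "#" (and anything else) are ignored.
    }
    return \%summary;
}

my $summary = summarize_tap(
    '1..4',
    'ok 1 - use Software::Module',
    'ok 2 - object isa Software::Module',
    'not ok 3 - $object->true() should return true',
    '#     Failed test (test.t)',
    'ok 4 # skip Net::DNS required for this test',
    '# Looks like you failed 1 tests of 4.',
);

printf "planned=%d passed=%d failed=%d skipped=%d\n",
    @{$summary}{qw(planned passed failed skipped)};
```

Fed the sample output above, the sketch counts four planned tests: two passes, one failure, and one skip.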

Use a Testing Module

A simple way to achieve Test::Harness compatibility is to use a testing module from the CPAN. Many test suites over the years have reinvented the ok() function, for example.

  {
    my $N = 1;
    sub ok($;$) {
        my ($test, $name) = @_;
        print "not " unless $test;
        print "ok $N - $name\n";
        $N++;
    }
  }

There is no need to do this, however. The standard Perl distribution comes with testing modules. Two great options are Test::Simple and Test::More. Test::Simple is a great way to get your feet wet; it implements only the ok() function. Test::More has more features and is recommended when you write your test suites.
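
For a sense of how little Test::Simple asks of you, here is a minimal sketch. The add() function is a hypothetical stand-in for whatever code you are actually testing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::Simple tests => 2;

# Hypothetical function under test; it stands in for your own code.
sub add { return $_[0] + $_[1] }

# ok() takes a true or false value and an optional test name.
ok( add(2, 2) == 4,  'add() sums two positive numbers' );
ok( add(-1, 1) == 0, 'add() handles a negative operand' );
```

The module prints the 1..2 header and numbers the tests for you, so the output is already in the form Test::Harness expects.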

Using Test::More is very simple; many have written on the subject. This is how you would achieve the output described in the previous section.

  #!/usr/bin/perl -T
  use strict;
  use warnings;
  use Test::More tests => 4;
  
  use_ok 'Software::Module';
  my $object = Software::Module->new;
  isa_ok $object, 'Software::Module', 'object';
  is $object->true, 1, '$object->true() should return true';
  
  SKIP: {
      skip "Net::DNS required for this test", 1
        unless eval 'require Net::DNS';
      
      ok $object->network(), "run over network";
  }

Don't Iterate, Compare

I've often seen tests that loop over a list and check each item to be sure the list is correct. While this approach makes you feel good, artificially adding to the number of tests you've written, it can be sloppy and long-winded. Here is an example.

  my @fruits = qw[apples oranges grapes];
  my @result = get_fruits();
  foreach my $n ( 0 .. $#fruits ) {
      is $result[$n], $fruits[$n], "$fruits[$n] in slot $n";
  }
  is scalar(@result), scalar(@fruits), "fruits the same size";

It looks like four tests were written; the reality is that one test was written poorly. Test::More has several utility functions to get the job done. In this test, @fruits represents a set of non-repeatable fruits I expect to get back from get_fruits(). As such, I can use eq_set() to test this function in one quick try.

  my @fruits = qw[apples oranges grapes];
  my @result = get_fruits();
  ok eq_set(\@result, \@fruits), "got [@fruits]";
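
If you'd like to run that comparison on its own, here is a minimal sketch. The get_fruits() below is a hypothetical stand-in that deliberately returns the fruits in a different order, and the second assertion is an extra showing is_deeply(), also from Test::More, comparing whole lists in one go.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for the function under test; note that the
# order differs from the expected list below.
sub get_fruits { return qw[grapes apples oranges] }

my @fruits = qw[apples oranges grapes];
my @result = get_fruits();

# Order does not matter here: compare the two lists as sets.
ok( eq_set(\@result, \@fruits), "got [@fruits] in any order" );

# is_deeply() compares element by element, so sort both sides first
# (or drop the sorts to insist on an exact sequence).
is_deeply( [sort @result], [sort @fruits], 'same fruits once sorted' );
```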

That was easy and short. But what happens when you have a deep data structure that you're dying to test? That's where Test::Deep comes in. Downloadable from the CPAN, this module provides the cmp_deeply() function. Here is a simple example.

  use Test::Deep;
  my $people = [
    {
      name     => "Casey West",
      employer => "pair Networks",
    },
    {
      name     => "Larry Wall",
      employer => "The Perl Foundation",
    },
  ];
  
  my $result = $dude->contacts->retrieve_all;
  
  cmp_deeply $result, $people, 'contacts match';

This example scratched the surface of what Test::Deep is capable of. When you've got to test a complex data structure, especially in a complex way, use this module. Here is a more difficult example made testable by this module. In this example, $dude->contacts->retrieve_all returns an unordered list of contacts with various bits of information associated with each of them.

  use Test::Deep;
  my $person = {
    name     => re(qr/^[\w\s]+$/),
    employer => ignore(),
    age      => code(sub { shift > 18 }),
  };
  my $people = array_each($person);
  my $result = $dude->contacts->retrieve_all;
  
  cmp_deeply $result, $people, 'contacts match';

This code, using only functions exported by Test::Deep, does a lot of work. Each person has a definition that should match $person. Every person in the $result list is a hash reference containing three elements. name must match the regular expression /^[\w\s]+$/, employer must exist and its value is ignored, and age should be over 18 or it will fail. array_each() returns an object that instructs cmp_deeply that every value in a list must match the definition provided.

This small amount of code accomplishes quite a lot. Test::Deep has saved us from wasting time and working hard to solve a difficult problem. It has made the hard things possible.

Don't Let POD go Unchecked

Documentation is just as important as code, or tests. There are several ways to care for POD in your test suite. First, it's important to keep it well-formed. For this, we turn to Test::Pod. This Perl module takes all the work out of testing POD with a useful function all_pod_files_ok(). Simply create a new test program with the following contents.

  use Test::More;
  plan skip_all => "Test::Pod 1.00 required for testing POD"
    unless eval "use Test::Pod 1.00; 1";
  all_pod_files_ok();

Yes, it really is that simple. When you run this program, it will test all the POD it finds in your blib directory.

Another simple test we can run on the documentation is coverage analysis. What good is documentation if it doesn't document completely? Test::Pod::Coverage is the right module for the job, yet another gem that hides all the hard work from us with a simple function, all_pod_coverage_ok(). Again, we'll create a new test program.

  use Test::More;
  plan skip_all => "Test::Pod::Coverage 1.08 required for testing POD coverage"
    unless eval "use Test::Pod::Coverage 1.08; 1";
  all_pod_coverage_ok();

Coverage is only half of the battle. Remember, Test::Pod::Coverage can't tell you if your documentation is actually correct and thorough.

In both of these examples, we use the plan function exported from Test::More to allow us to "bail out" of our tests if the appropriate Perl module isn't installed. This makes our POD tests optional. If you don't want them to be optional, remove that line and be sure to list them as prerequisites for building and installing your software.

Know What You're Testing

One of the biggest testing mistakes is to assume that you know what you're testing. Tests are designed to exercise your software. Let your test exercise the good and bad portions of your software. Make it succeed and, most importantly, make it fail. Superior test coverage digs deep into every line of code you've written. How do you know if your tests are amazing? Coverage analysis.

Code coverage isn't something you can guess; you need good tools. Perl has a good tool: Devel::Cover. This module creates a database that maps actual execution to your source code. It analyzes statements, branches, conditions, subroutines, and even POD and execution time. It then provides a total for all of these areas, as well as a total for each Perl module. It's very simple to use, adding just a little to your make test process.

  > cover -delete
  > HARNESS_PERL_SWITCHES=-MDevel::Cover make test

The first command deletes any existing coverage database. On the second line we set an environment variable for Test::Harness, HARNESS_PERL_SWITCHES, to a Perl command-line switch that loads Devel::Cover. This is all that's required of you. Each of your test programs will now run with Devel::Cover loaded and analyzing execution in the background.

To see your coverage database on the command line, issue one command.

  > cover
  ---------------------------- ------ ------ ------ ------ ------ ------ ------
  File                           stmt branch   cond    sub    pod   time  total
  ---------------------------- ------ ------ ------ ------ ------ ------ ------
  blib/lib/List/Group.pm         94.7   66.7   33.3  100.0  100.0  100.0   81.6
  Total                          94.7   66.7   33.3  100.0  100.0  100.0   81.6
  ---------------------------- ------ ------ ------ ------ ------ ------ ------

  Writing HTML output to ~/cvs/perl/modules/List-Group/cover_db/coverage.html ...
  done.

As you can see, I could've done better. But what did I fail to test? Notice that cover wrote some HTML output. That is the diamond in the rough; the HTML output details everything. Each module has its own series of web pages detailing each of the coverage groups. I did particularly poorly on the conditional coverage -- let's see how.

Now it's become clear. My tests never allow either of the two statements in this condition to succeed. All of my tests make the first statement fail; the second is never executed. I need to update my tests with at least two more for 100 percent conditional coverage. The first test will supply a non-number for the $number variable. The second will supply a value for the $group_by variable that doesn't exist in the list for which grep is looking.
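
The same idea can be sketched with a made-up function. To be clear, can_group() and its condition are hypothetical; this is not the List::Group code, which isn't reproduced here. The point is only that a two-part condition needs a test for each way it can come out.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function with a two-part condition; an assumption for
# illustration, not the real List::Group code.
sub can_group {
    my ($number, $group_by, @known) = @_;
    if ( $number !~ /^\d+$/ or !grep { $_ eq $group_by } @known ) {
        return 0;
    }
    return 1;
}

my @known = qw[rows cols];

# Both halves of the condition are false: the happy path.
ok(  can_group(3,     'rows', @known), 'valid number and group' );

# First half is true: a non-number short-circuits the "or".
ok( !can_group('abc', 'rows', @known), 'non-number is rejected' );

# First half is false, second half is true: grep finds nothing.
ok( !can_group(3,     'diag', @known), 'unknown group is rejected' );
```

With only the first test, a coverage report would show both halves of the condition stuck on one outcome; the other two tests exercise the remaining cases.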

Testing for coverage is a noble goal. I find this method very useful when writing tests for existing software. There are many situations you may think you're testing well. Don't guess; know. Coverage analysis is equally useful for new development. If you've adopted the "test first" method and your coverage isn't 100 percent, something is wrong. Either your tests need help, or you've written more code than originally required.

Keep Test Files Organized

Perl software distributions follow several widely adopted guidelines concerning tests. The rules are simple: test files should reside in a t/ directory, and each test file ends in a .t extension. Test::Harness understands these rules and make test will run every file that abides by them.

The filename can be anything you like. It's a good idea to use descriptive filenames instead of just digits or numerical words. Good examples are pod-coverage.t, software-class-api.t, and compile.t. Sometimes it's desirable to determine the order in which your test files will be run. In these cases, prefix the filename with a number. If you want compilation tests to run first and POD tests last, name them accordingly as 00-compile.t and 99-pod-coverage.t.
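
Here is a small sketch of why the numeric prefixes work, assuming the harness is handed the files in the sorted order that a glob such as t/*.t produces. The filenames are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical filenames, as they might appear in a t/ directory.
my @tests = qw(
    t/99-pod-coverage.t
    t/software-class-api.t
    t/00-compile.t
    t/98-pod.t
);

# A plain string sort puts the digit-prefixed names in numeric order,
# provided every prefix has the same width (00, 98, 99).
my @run_order = sort @tests;

print "$_\n" for @run_order;
```

Note that an unnumbered name such as software-class-api.t sorts after every numbered file, because digits sort before lowercase letters. If the POD tests really must run last, number the other files, too.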

Looking Ahead

Testing can be a tedious, difficult job. By this point, you have a number of helpful tools to make the task easier. There are many more testing modules on the CPAN that could have been covered here; I encourage you to explore them all.
