August 2003 Archives

This week on Perl 6, week ending 2003-08-31

Welcome to this week's Perl 6 summary. This week, for one week only I'm going to break with a long established summary tradition. No, that doesn't mean I won't be mentioning Leon Brocard this week. Nope, this week we're going to start by discussing what's been said on the perl6-language list this week.

Now that's out of the way, we'll continue with a summary of the internals list.

Continuation Passing is becoming the One True Calling Style

Jos Visser had some code that broke in an interesting fashion when find_lex threw an exception, so he asked the list about it. Leo Tötsch explained that exceptions and old style subs don't play at all well together. It seems to me that the total and utter deprecation of subs using the old style calling conventions is not far distant.

http://groups.google.com/groups

Embedding Parrot in Perl

Luke Palmer has started to learn XS in the service of his project to embed Parrot in Perl. Unsurprisingly, he had a few questions. He got a few answers.

http://groups.google.com/groups

Implementing ISA

Leo Tötsch has implemented isa. The unfashionably lowercased chromatic argued that what Leo had implemented should actually be called does. Chris Dutton thought does should be an alias for has. Piers Cawley thinks he might be missing something.

http://groups.google.com/groups

More on constant PMCs and classes

Leo Tötsch's RFC on constant PMCs and classes from last week continued to attract comments about possible interfaces and implementations.

http://groups.google.com/groups

A miscellany of newbie questions

James Michael DuPont had a bunch of questions and suggestions about bytecode emission, the JIT and about possibly extracting the Parrot object system into a separate library. Leo supplied answers.

http://groups.google.com/groups

What the heck is active data?

Dan clarified what he'd meant when he talked about Active Data. His one sentence definition being '``Active Data'' is data that takes some active role in its use -- reading, writing, modifying, deleting or using in some activity'. The consequences of such data are far reaching; even if your code has no active data in it, Dan points out that you still have to take the possibility into account, or solve the halting problem.
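For readers who think in Perl 5, a tied variable is perhaps the nearest everyday analogue of active data. The sketch below is only an illustration of the idea, not Parrot code, and the Counted class is made up for the example: a plain-looking assignment and a plain-looking read both run code behind the scenes.

```perl
use strict;
use warnings;

# Counted is a made-up class for this sketch: a scalar that upper-cases
# whatever is stored in it and counts how often it is read.
package Counted;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, reads => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{reads}++;
    return $self->{value};
}

sub STORE {
    my ($self, $new) = @_;
    $self->{value} = uc $new;
}

package main;

my $obj = tie my $x, 'Counted', 'initial';

$x = 'hello';      # looks like plain assignment, but runs STORE
my $copy = $x;     # looks like a plain read, but runs FETCH

print "$copy (read $obj->{reads} time(s))\n";
```

Nothing at the point of use says that $x is special, which is why code that never creates active data still has to allow for it.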

Benjamin Goldberg seemed to think that you didn't need to solve the halting problem, you could just add scads of metadata to everything and do dataflow analysis at compile time. I look forward with interest to his implementation.

Matt Fowles wondered why active data necessitated keyed variants of all the ops, asking instead why we couldn't have a prepkeyed op to return an appropriate specialized PMC to use in the next op. Dan agreed that such an approach was possible, but not necessarily very efficient. Leo Tötsch disagreed with him though.

TOGoS wondered if this meant that we wouldn't know whether set Px, Py did binding or morphing until runtime. (It doesn't. set always simply copies a pointer). In an IRC conversation with Dan we realised that some of this confusion arises from the fact that set_string and friends behave as if they were called assign_string; to get the expected set_string semantics you'd have to do:

    new Px, .PerlUndef
    set_string Px, "some string"

Hopefully this is going to get fixed.

http://groups.google.com/groups

Mission haiku

Nicholas Clark
To make some kind of mark
Committed haiku.
Don't you.

Yes, I know that's not a haiku. It's a Clerihew. I suggest that anyone else who feels tempted to perpetrate verse on list restrict themselves to a sestina or a villanelle, or maybe a sonnet.

I also note that POD is a lousy format for setting poetry in.

http://groups.google.com/groups

Jürgen gets De-Warnocked

Jürgen Bömmels had been caught on the horns of Warnock's Dilemma over a patch he submitted a while back. It turns out that he'd been Warnocked in part because both Leo and Dan thought he already had commit rights. So that got fixed. Welcome to the ranks of Parrot committers Jürgen, you've deserved it for a while.

http://groups.google.com/groups

Parrot Z-machine

New face Amir Karger wants to write the Parrot Z-machine implementation and had a few questions about stuff. So far he's been Warnocked.

http://groups.google.com/groups

Notifications

Dan described how Parrot's notification system would work, and what that means for weak references. Michael Schwern thought the outlined notification system would also be awfully useful for debugger watch expressions. Tim Bunce worried about some edge cases.

http://groups.google.com/groups

MSVC++ complaints

Vladimir Lipskiy (who's been doing some stellar work recently on various build issues amongst other things) found some problems trying to build Parrot with MSVC++ and asked for help in working out how to fix them. Jürgen Bömmels suggested a fix, which Vladimir liked in principle, but noted that there were still some issues it didn't quite fix.

http://groups.google.com/groups

vtable->dump

Leo Tötsch thought that, if only for debugging, it would be really handy for PMCs to offer a dump method which would return a string representation of the PMC. Dan thought that a better approach would be to get freeze/thaw working for PMCs and have the debugger know how to dump a frozen PMC. This seemed to open up a whole big can of worms as Leo, Dan and others discussed what was needed from the serialization toolset and what its interface should look like.
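The two approaches have rough Perl 5 counterparts, which may help picture the difference. This is only an analogy using the core Storable and Data::Dumper modules, not a description of how Parrot will do it: freeze produces an opaque serialized form, thaw rebuilds the structure, and a separate tool turns the result into readable text.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Data::Dumper;

my $data = { name => 'example', values => [ 1, 2, 3 ] };

my $frozen = freeze($data);    # opaque byte string, not meant for humans
my $copy   = thaw($frozen);    # independent deep copy of the original

# The "dump" step is a separate tool working on the thawed structure.
local $Data::Dumper::Sortkeys = 1;
print Dumper($copy);
```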

Nicholas Clark threw a googly down the pitch with his description of a possible attack on serialization schemes (possibly originating with Jonathan Stowe) that seems deeply tricky to work around.

http://groups.google.com/groups

http://groups.google.com/groups

exit opcode

Leo checked in a small change to Parrot, making exit throw an exception rather than simply quitting the program. Of course, unless the exception is caught, parrot will exit anyway. He also proposed changing the startup parameters by moving the ARGV array from P0 to P5 for consistency with the Parrot Calling conventions. For some reason this sparked off an enormous thread discussing how to return from the main function.

Dan liked the idea, so Leo checked a patch in and fixed up as many of the examples and languages as he could find, but he expects that he hasn't caught 'em all.

http://groups.google.com/groups

Acknowledgements, Announcements, Apologies

I'm really, really sorry about the Clerihew. But not sorry enough to remove it.

Thanks to everyone involved in making sure I only got one and a half of last week's predictions right. (The half prediction was to do with my writing real Perl code, I didn't. But I did release the Paris code, you can find it on CPAN at http://search.cpan.org/search if you're interested.)

Thanks to Gill for seven and 2 days (as I write this) of wedded bliss.

Check out http://pc1.bofhadsl.ftech.co.uk:8080/ for more of my writing (and thanks to those who have already popped by).

As ever, if you've appreciated this summary, please consider one or more of the following options:

Using Perl to Enable the Disabled

We use Perl for all kinds of things. Web development, data munging, system administration, even bioinformatics; most of us have used Perl for one of these situations. A few people use Perl for building end-user applications with graphical user interfaces (GUIs). And as far as I know, only two people in this world use Perl to make life easier for the disabled: Jon Bjornstad and I. Some people think the way we use Perl is something special, but my story will show you that I just did what any other father, capable of writing software, would do for his child.

The Past

In 1995 my eldest daughter, Krista, was born. She came way too early, after a pregnancy of only 27.5 weeks. That premature birth resulted in numerous complications during the first three months of her life. Luckily she survived, but getting pneumonia three times when you can't even breathe on your own causes serious asphyxiation, which in turn resulted in severe brain damage. A few months after she left the hospital it became clear that the brain damage had caused a spastic quadriplegia.

As Krista grew older, it became more and more clear what she could, and couldn't do. Being a spastic means you can't move the muscles in your body the way you want them to. Some people can't walk, but can do everything else. In Krista's case, she can't walk, she can't sit, she can't use her hands to grab anything, even keeping her head up is difficult. Speaking is using the muscles in your mouth and throat, so you can imagine that speaking is almost out of the question for her.

By the end of the year 2000, Krista went to a special school in Rotterdam. But going to school without being able to speak or without being able to write down what you want to say is hard, not only for the teacher, but also for the student. We had to find a way to let Krista communicate.

Together with Krista's speech pathologist and orthopedist we started looking for devices she could use to communicate with the outside world. These devices should enable her to choose between symbols, so a word or a sentence could be pronounced. A number of devices were tested, but all of them either required some action with her hands or feet that she wasn't able to perform, or gave her too few choices of words.

Then we looked into available communications software, so she could use an adapted input device (in her case a headrest with built-in switches) to control an application. Indeed there was software available that could have been used, but the best match was a program that automatically scanned through symbols on her screen and when the desired symbol was highlighted, she had to move her head to select it. Timing was the issue here. If moving your head to the left or right is really hard to do anyway, it's hardly possible to take that action at the desired moment.

pVoice

We had to do something. There was no suitable device or software application available. I thought it through and suggested I could try to write a simple application myself. It would be based on the idea of the best match we had found (the automatic scanning software), but this software would have no automatic scanning. Instead, moving to the right with your head would mean "Go to the next item," and moving to the left would mean "Select the highlighted item." That would mean that she would need a lot of time to get to the desired word, but it's better to be slow than not able to select the right words at all.
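The two-switch scheme described above can be sketched in a few lines of Perl. This is not pVoice's actual code: the sub names and the item list are invented for the illustration, and wrapping from the last item back to the first is an assumption.

```perl
use strict;
use warnings;

# Hypothetical names throughout; this is not pVoice's real code.
my @items       = ('Family', 'School', 'Care');
my $highlighted = 0;

# Right switch: move the highlight to the next item.
# Wrapping around at the end of the list is an assumption.
sub next_item {
    $highlighted = ($highlighted + 1) % @items;
    return $items[$highlighted];
}

# Left switch: select whatever is highlighted right now.
sub select_item {
    return $items[$highlighted];
}

next_item() for 1 .. 2;        # two movements to the right
my $chosen = select_item();    # one movement to the left
print "Selected: $chosen\n";
```

Because nothing moves until a switch is pressed, there is no timing to get right, only a number of presses to count.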

The symbols would have to be put in categories, so there would be some logic in the vocabulary she'd have on her PC. She started out with categories like "Family," containing photos of some members of the family, "School," containing several activities at school, and "Care," which contained things like "going to the bathroom," "taking a shower," and other phrases like that.

By the end of January 2001 I started programming. In Perl. Maybe Perl isn't the most logical choice for writing GUI applications for disabled people, but Perl is my language of choice. And it turned out to be very suitable for this job! Using Tk I quickly set up a nice looking interface. Win32::Sound (and on Linux the Play command) enabled me to "pronounce" the prerecorded words. Within two weeks' time I had a first version of pVoice, as I called this application (and since everyone asks me what the 'p' stands for: 'p' is for Perl). Krista started trying the application and was delighted. Finally she had a way to say what was on her mind!

Of course in the very beginning she didn't have much of a vocabulary. The primary idea was to let her learn how to use it. But every week or two we added more symbols or photos and extended her vocabulary.

By the end of April 2001 I posted the code of this first pVoice version on PerlMonks and set up a web page for people to download it if they could use it. The response was overwhelming. Everyone loved the idea and suggestions to improve the code or to add features came rolling in. Krista's therapists were also enthusiastic and asked for new features too.

Unfortunately the original pVoice was nothing more than a quick hack to get things going. It was not designed to add all the features people were asking for. So I decided I had to rewrite the whole thing.

This time it had to be a well-designed application. I wanted to use wxPerl for the GUI instead of the (in my eyes) ugly Motif look of Tk, I wanted to use a speech synthesizer instead of prerecorded .wav files, and most importantly, I wanted to make it easier to use. The original application was not easy to install and modifying the vocabulary was based on the idea you knew your way around in the operating system of your choice: you had to put files in the right directories yourself and modify text files by hand. For programmers this is an easy task, but for end users this turns out to be quite difficult.

pType Screenshot

It took me until the summer of 2002 before I started working on the next pVoice release. For almost a year I hadn't worked on it at all because of some things that happened in my personal life. Since Krista was learning to read and write and had no way of expressing what she could write herself, I decided not to start with rewriting pVoice immediately, but with building pType.

pType would allow her to select single letters on her screen to form words in a text entry field at the bottom of her screen and -- if desired -- to pronounce that for her. pType was my tryout for what pVoice 2.0 would come to be: it used wxPerl, Microsoft Agent for speech synthesis, and was more user-friendly. In October 2002, pType was ready and I could finally start working on pVoice 2.0. While copying and pasting lots of the code I wrote for pType, I set up pVoice to be as modular as possible. I also tried to make the design extensible, so I would be able to add features in the future -- even features I hadn't already thought of.

In March this year it finally was time to release pVoice 2.0. It was easy to install: it was compiled into a standalone executable using PerlApp and by using InnoSetup I created a nice looking installer for it. The application looked more attractive because I used wxPerl, which gives your application the look-and-feel of the operating system it runs on. It was user friendly because the user didn't have to modify any files to use the application: all modifications and additions to the vocabulary could be done within the application using easy-to-understand dialog windows. I was quite satisfied with the result, although I already knew I had some features to add in future releases.

The Present

pVoice animation

At this moment, rewriting the online help file is the last step before I can release pVoice 2.1. That version will have support for all Microsoft SAPI 4 compatible speech engines, better internationalization support, the possibility to have an unlimited depth of categories within categories (until pVoice 2.0 you had only one level of categories with words and sentences), the possibility to define the number of rows and columns with images yourself, and numerous small improvements. Almost all of these improvements and feature additions are suggested by people who tried pVoice 2.0 themselves. And that's great news, because it means that people who need this kind of software are discovering Open Source alternatives for the extremely expensive commercial applications.

Many people have asked me how many users pVoice has. That's a question I can't answer. How do you measure the use of Open Source software? Since Jan. 1, 2003, approximately 400 people have downloaded pVoice. On the other hand, the mailing lists have some 50 subscribers. How many people are actually using pVoice then? I couldn't say.

The Future

I'm hoping to achieve an increase in the number of users in the next 12 months. The Perl Foundation (TPF) has offered me one of its grants, to be used for promotion of pVoice. With the money I'll be travelling to OSCON next year and hope to speak there about pVoice. While I'm in Portland I'll try to get other speaking engagements in the area to try to convince people that they don't always need to spend so much money on commercial software for disabled people, but that there are alternatives like SueCenter and pVoice. Shortly after I heard about the TPF grant, I also heard that I'll be receiving a large donation from someone (who wishes to remain anonymous), that I can also use for promotion of pVoice or for other purposes like costs I might have to add features to pVoice.

Still, a lot can be improved on pVoice itself. I want to make it more useful for people with other disabilities than my daughter's, I would like to have more translations of the program (currently I have Dutch and English, and helpful people offered to translate it into German, Spanish, French, and Swedish already), I want to support more Text To Speech technologies than Microsoft's Speech API (like Festival), and I would like to find the time to make pVoice platform independent again, because currently it only runs on Windows. I hope to write other pVoice-like programs like pHouse, which will be based upon efforts of the MisterHouse project, to be able to control appliances in and around the house, but the main thing I need for that is time. And with a full-time job, time is limited.

Maybe, after reading all of this, you'll think, "How can I help?". Well, there are several things you could do. First of all, if you know anyone who works with disabled people, tell them about pVoice. Apart from SueCenter, pVoice is the only Open Source project I know of in this area. Lots of people who need this kind of software can't get their insurance to pay for the software and would have to pay a lot of money. With pVoice they have a free alternative.

Of course, you could also help with the development. Since pVoice is not tied to any specific natural language, you could help by translating pVoice into your native tongue. Since the time I can spend on pVoice is limited, it would be nice to have more developers on pVoice in general. More information on pVoice is available from the web site.

Cooking with Perl

Editor's note: The new edition of Perl Cookbook is about to hit store shelves, so to trumpet its release, we offer some recipes--new to the second edition--for your sampling pleasure. This week's excerpts include recipes from Chapter 6 ("Pattern Matching") and Chapter 8 ("File Contents"). And be sure to check back here in the coming weeks for more new recipes on topics such as using SQL without a database server, extracting table data, templating with HTML::Mason, and more.

Sample Recipe: Matching Nested Patterns

Problem

You want to match a nested set of enclosing delimiters, such as the arguments to a function call.

Solution

Use match-time pattern interpolation, recursively:

my $np;
$np = qr{
           \(
           (?:
              (?> [^()]+ )    # Non-capture group w/o backtracking
            |
              (??{ $np })     # Group with matching parens
           )*
           \)
        }x;

Or use the Text::Balanced module's extract_bracketed function.

Discussion

The (??{ CODE }) construct runs the code and interpolates the string that the code returns right back into the pattern. A simple, non-recursive example that matches palindromes demonstrates this:

if ($word =~ /^(\w+)\w?(??{reverse $1})$/ ) {
    print "$word is a palindrome.\n";
}

Consider a word like "reviver", which this pattern correctly reports as a palindrome. The $1 variable contains "rev" partway through the match. The optional word character following catches the "i". Then the code reverse $1 runs and produces "ver", and that result is interpolated into the pattern.
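To see the backtracking at work, it can help to run the pattern over an odd-length palindrome, an even-length one, and a word that is neither. The word list below is made up for the illustration, and the explicit scalar is an addition that only spells out that a string reversal is wanted.

```perl
use strict;
use warnings;

my @results;
for my $word (qw(reviver noon perl)) {
    # Same pattern as above; "scalar" only makes the string reversal explicit.
    my $is_palindrome = $word =~ /^(\w+)\w?(??{ scalar reverse $1 })$/ ? 1 : 0;
    push @results, "$word=$is_palindrome";
    print "$word: ", ($is_palindrome ? "palindrome" : "not a palindrome"), "\n";
}
```

For "noon" the optional \w? ends up matching nothing, so $1 is "no" and the interpolated text is "on".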

For matching something balanced, you need to recurse, which is a bit trickier. A compiled pattern that uses (??{ CODE }) can refer to itself. The pattern given in the Solution matches a set of nested parentheses, however deep they may go. Given the value of $np in that pattern, you could use it like this to match a function call:

$text = "myfunfun(1,(2*(3+4)),5)";
$funpat = qr/\w+$np/;   # $np as above
$text =~ /^$funpat$/;   # Matches!
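A self-contained sketch that pulls every top-level function call out of a longer string might look like this. The input string is invented for the example; the pattern is the one from the Solution.

```perl
use strict;
use warnings;

my $np;
$np = qr{
    \(
    (?:
        (?> [^()]+ )    # run of non-paren characters, no backtracking
      |
        (??{ $np })     # a nested, balanced group
    )*
    \)
}x;

my $source = 'total = add(1, mul(2, (3+4))) + neg(5);';    # made-up input
my @calls;
while ($source =~ /(\w+$np)/g) {
    push @calls, $1;
}
print "$_\n" for @calls;
```

The call to mul is not reported separately, because it is consumed as part of the arguments to add.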

You'll find many CPAN modules that help with matching (parsing) nested strings. The Regexp::Common module supplies canned patterns that match many of the trickier strings. For example:

use Regexp::Common;
$text = "myfunfun(1,(2*(3+4)),5)";
if ($text =~ /(\w+\s*$RE{balanced}{-parens=>'()'})/o) {
  print "Got function call: $1\n";
}

Other patterns provided by that module match numbers in various notations and quote-delimited strings:

$RE{num}{int}
$RE{num}{real}
$RE{num}{real}{'-base=2'}{'-sep=,'}{'-group=3'}
$RE{quoted}
$RE{delimited}{-delim=>'/'}

The standard (as of v5.8) Text::Balanced module provides a general solution to this problem.

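use Text::Balanced qw/extract_bracketed/;
$text = "myfunfun(1,(2*(3+4)),5)";
# Skip the function name as a prefix; in list context the function
# returns (extracted text, remainder, skipped prefix).
($found, $after, $before) = extract_bracketed($text, "(", '\w+');
if (defined $found) {
    print "answer is $found\n";
} else {
    print "FAILED\n";
}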

See Also

The section on "Match-Time Pattern Interpolation" in Chapter 5, "Pattern Matching," of Programming Perl, 3rd Edition; the documentation for the Regexp::Common CPAN module and the standard Text::Balanced module.

Sample Recipe: Pretending a String Is a File

Problem

You have data in a string, but would like to treat it as a file. For example, you have a subroutine that expects a filehandle as an argument, but you would like that subroutine to work directly on the data in your string instead. Additionally, you don't want to write the data to a temporary file.

Solution

Use the scalar I/O in Perl v5.8:

open($fh, "+<", \$string);   # read and write contents of $string

Discussion

Perl's I/O layers include support for input and output from a scalar. When you read a record with <$fh>, you are reading the next line from $string. When you write a record with print, you change $string. You can pass $fh to a function that expects a filehandle, and that subroutine need never know that it's really working with data in a string.

Perl respects the various access modes in open for strings, so you can specify that the strings be opened as read-only, with truncation, in append mode, and so on:

open($fh, "<",  \$string);   # read only
open($fh, ">",  \$string);   # write only, discard original contents
open($fh, "+>", \$string);   # read and write, discard original contents
open($fh, "+<", \$string);   # read and write, preserve original contents

These handles behave in all respects like regular filehandles, so all I/O functions work, such as seek, truncate, sysread, and friends.
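As a rough illustration of both directions, the following sketch passes an in-memory handle to a routine that only knows about filehandles, then writes through a second handle into an empty scalar. The count_lines sub and the sample data are hypothetical.

```perl
use strict;
use warnings;

# A hypothetical routine that only knows how to read from a filehandle.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    return $count;
}

my $string = "alpha\nbeta\ngamma\n";
open(my $in, "<", \$string) or die "Can't open in-memory handle: $!";
my $lines = count_lines($in);
close($in);

# Writing through a handle changes the underlying scalar.
my $buffer = "";
open(my $out, ">", \$buffer) or die "Can't open in-memory handle: $!";
print $out "written via print\n";
close($out);

print "lines: $lines\n";
print "buffer: $buffer";
```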

See Also

The open function in perlfunc(1) and in Chapter 29 ("Functions") of Programming Perl, 3rd Edition; "Using Random-Access I/O;" and "Setting the Default I/O Layers"


O'Reilly & Associates will soon release (August 2003) Perl Cookbook, 2nd Edition.


This week on Perl 6, week ending 2003-08-17

Picture, if you will, a sunny garden, unaffected by power cuts, floods, plagues of frog or any of the other troubles that assail us in this modern world. Picture, if you will, your summarizer, sat in this garden with a laptop on his knee, cursing the inability of LCD display manufacturers to make displays that are legible in sunlight. Picture your summarizer returning to the comfortable chair in the shade of his book room casting around for a witty and original way to open another Perl 6 summary. Picture him giving up and starting to type. Here's what he writes:

We start, as usual, with the internals list.

Tail call optimization

Leo Tötsch has started to work on getting IMCC to detect tail calls and optimize them to either a simple jump or an invoke. If you're not sure what tail call optimization is I can recommend Dan's ``What the heck is a tail call?'' article on the subject.
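Perl 5 has no automatic tail call optimization, but its goto &sub form does by hand roughly what the optimization does: it replaces the current call with the next one instead of nesting another frame. The sketch below is only an illustration in Perl 5, not IMCC output, and sum_to is a made-up example.

```perl
use strict;
use warnings;

# Accumulator-style sum of 1..n; the recursive call is the last thing done.
sub sum_to {
    my ($n, $acc) = @_;
    return $acc if $n == 0;
    @_ = ($n - 1, $acc + $n);
    goto &sum_to;    # replace the current call rather than nesting a new one
}

my $total = sum_to(100_000, 0);
print "$total\n";
```

Written as an ordinary recursive call, the same sub would nest 100,000 frames deep; with the jump, the depth stays constant.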

http://groups.google.com/groups

http://www.sidhe.org/~dan/blog/archives/000211.html -- What the heck?

Leo's QUERIES from last week

Last week, Leo bundled up a bunch of outstanding questions for Dan, and Dan answered them. Benjamin Goldberg queried Dan's answer about find_method, which, if I'm reading things correctly, is currently implemented in the interpreter. Benjamin argued (convincingly I thought) that, although the method hashes needed to be stored in the interpreter structure, find_method should be implemented on default.pmc, allowing for different classes/languages to override its behaviour.

http://groups.google.com/groups

Why ~ for xor?

Michal Wallace wanted to know why ~ maps to both unary bitwise-not and binary bitwise-xor in IMCC; he expected xor to be ^ and ^^. Leo explained that it was, at one point at least, the way Perl 6 did it but that he'd stopped keeping up with the developments in perl6-language. He noted that, if the Perl 6 operator had been settled, then IMCC should use that. A quick skim of my copy of Perl 6 Essentials tells me that Perl 6 now uses +&, +| and +^ for bitwise and/or/xor, with ~&, ~|, ~^ for stringwise and/or/xor.
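For comparison, Perl 5 uses a single ^ for xor and decides between numeric and stringwise behaviour by looking at the operands, which is the ambiguity the separate + and ~ families remove. A small Perl 5 sketch, with values picked purely for illustration:

```perl
use strict;
use warnings;

# Perl 5 picks numeric or string behaviour from the operands.
my $numeric = 12 ^ 10;        # 0b1100 xor 0b1010
my $string  = "AB" ^ "  ";    # xor each character with a space (0x20)

printf "numeric xor: %d\n", $numeric;
print  "string xor:  $string\n";
```

Xoring an ASCII letter with 0x20 flips its case, so the string result here is the lower-case form of the input.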

http://groups.google.com/groups

http://pc1.bofhadsl.ftech.co.uk:8080/archives/000011.html -- My review of Perl 6 Essentials

Raising Hell

Michal Wallace's Py-Pirate project continues to exercise the edges of Parrot as he implements more and more of Python's semantics. This time he needed to know about raising (and catching) exceptions. In particular he wanted to be able to catch an exception when find_lex failed to find an appropriately named variable. (Currently, a find_lex failure doesn't raise an exception, it just kills Parrot). Jos Visser told him that there were plans to have find_lex throw a real, catchable exception, or maybe just return undef. See last week's summary for a pointer to that discussion.

http://groups.google.com/groups

Pirate status and help with instances

Michal Wallace announced that he had just finished an all night coding spree and that Py-pirate could now handle:

  • functions (closures, recursion, etc)
  • Global variables
  • tuples (but they turn into lists)
  • dictionaries (create, setitem, getitem, print)
  • list comprehensions
  • raise (strings only)
  • try...except (partial)
  • try...finally
  • assert

(At this point, I think we should all pause for a moment of wild applause).

However, he was having a few problems with instantiating objects of a class. (For those who don't know, Python instances are created by calling the Class as if it were a function). Leo Tötsch noted that almost nothing would work, since Parrot's classes and objects weren't actually finished yet. He agreed with others in the thread who reckoned that Python classes could be made to work by subclassing the standard class.pmc to allow it to respond to an invoke by creating a new instance. Easy! Michal muttered something about faking classes with closures but I don't think he went through with it.
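For anyone wondering what faking classes with closures might look like, here is a rough Perl 5 sketch of the general trick. It is not Michal's code and says nothing about how Pirate works; Counter and its messages are invented. The "class" is a sub that is called like a function, and each "instance" is a closure over its own private state.

```perl
use strict;
use warnings;

# Hypothetical "class": calling it like a function returns a new instance,
# which is itself a closure over its own private state.
sub Counter {
    my ($start) = @_;
    my $count = $start;
    return sub {
        my ($message) = @_;
        return ++$count if $message eq 'increment';
        return $count   if $message eq 'value';
        die "Unknown message: $message";
    };
}

my $a_counter = Counter(10);
my $another   = Counter(0);
$a_counter->('increment') for 1 .. 3;
$another->('increment');

print $a_counter->('value'), " ", $another->('value'), "\n";
```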

http://groups.google.com/groups

Packfile fun

So long assemble.pl, it's been good to know you.

http://groups.google.com/groups

Approaching m4

Leon Brocard, Sean O'Rourke and James Michael DuPont looked on in awe as Bernhard Schmalhofer announced that he'd been working on implementing m4 in Parrot. According to Sean, the ``implications are staggering... Sure, plenty of compilers can bootstrap themselves, but how many can generate their own configure scripts via autoconf? With p4rrot, we may live to see this dream.''

One does worry about Sean's dream life.

http://groups.google.com/groups

Call and return conventions

TOGoS has been thinking about the workings of the Parrot calling conventions. He wondered if there was a case for making calling and returning look exactly the same, allowing for cunning stunts with fake continuations in P1. Luke Palmer really liked the idea. Leo seemed to think it was a good idea too. There has been no word from Dan yet.

http://groups.google.com/groups

Packfile and EXEC

Leo Tötsch has started to extend the packfile functions to handle multiple code segments and has been running into problems with EXEC -- the tool that uses the JIT to generate a native executable from Parrot assembly. Daniel Grunblatt checked in a 'temporary' fix, which at least solved Leo's immediate problem.

http://groups.google.com/groups

Parrot 0.1.0 -- what's left?

Steve Fink thought, given the 'insane amount of work' on Parrot recently, that it was approaching time to cut another release. He asked if there was anything else coming up that people would like to see included in the release, and whether we had enough to call the next release 0.1.0. Luke Palmer really wanted to see objects finished, but wasn't sure how much work that would entail. Leo thought that the Parrot Calling Conventions (and more particularly the return convention) support needed fixing, and noted that PackFile is currently in a state of flux. Both of them thought 0.1.0 would be the right version number though.

http://groups.google.com/groups

set vs. assign continues to add vs add!

TOGoS wasn't keen on the variable behaviour of add depending on whether its target was a PMC, or an integer/number register. Brent Dax thought that TOGoS needed to train his expectations and went on to explain why.

For reasons that I can't quite follow, this discussion morphed into a polite argument between Dan and Leo about the wisdom of having all those keyed opcodes.

Benjamin Goldberg pointed out that, since TOGoS's desired 'set' semantics could easily be mocked up with an assign operator (but not vice versa), then maybe IMCC could handle mocking it up automatically.

http://groups.google.com/groups

Serializing functions

Michal Wallace wondered if it is possible to serialize a Parrot function, or (slightly more tricky) if you could serialize a generator and its state. Answer: Yes, it's almost done (well, the function part anyway; Leo didn't mention the generator thing).

I'm not sure if the support for this was fixed up in the packfile changes that Leo checked in a couple of hours after his answer.

http://groups.google.com/groups

Parrot and STDOUT/STDERR

Arthur Bergman popped over from ponie-dev to ask about a problem he'd been having with his parrot embedded miniperl. Apparently, if a caller closes STDERR, the program produces no output on STDOUT either. Leo found where the problem was happening -- Parrot was trying to open STDERR, failing, and dying with an error message. To STDERR. Jürgen Bömmels, the ParrotIO maintainer, outlined a few possible fixes.

http://groups.google.com/groups

Calling Parrot from C

Luke Palmer asked about calling a parrot sub from C and getting the return value. Leo gave a terse answer that covered the bases and pointed at classes/Eval.pmc for an implementation of something very like the general case of calling parrot subs from C.

http://groups.google.com/groups

There's no undef!

Michal Wallace discovered that there seems to be no op to remove a variable from a lexical pad. Leo patched the scratchpad PMC so that you can now do:

    peek_pad Pn
    delete Pn["foo"]

to delete variable names.

http://groups.google.com/groups

Pirate 0.01 ALPHA!

Michal Wallace announced the version 0.01 alpha of Pirate which he described as ``(almost) everything I can do for Python without jumping into C''. Which turns out to be an awful lot of stuff.

I've just looked back through my summary archive and, as near as I can tell, Michal's gone from thinking about doing this to having most of the Python syntax working in just under 3 weeks, which is really rather scary when you think about it. Well done Michal.

http://groups.google.com/groups

http://pirate.tangentcode.com

Timely destruction: An efficient, complete solution

Right at the end of the week, Luke Palmer posted another attempt to come up with a neat solution to the timely destruction problem. I'm guessing we'll see it discussed in next week's summary.

http://groups.google.com/groups

POW (Parrot on Win32) available

A while ago now, Jonathan Worthington offered to start making regular binary builds of Parrot for those who use Win32 but who can't compile Parrot on it. This week he announced that he's done it. Thanks Jonathan.

http://groups.google.com/groups

http://www.jwcs.net/developers/perl/pow/ -- The POW site

Implementing Nickle

Simon Cozens pointed the list at the nickle programming language and wondered if it might be a suitable language to implement on Parrot.

http://groups.google.com/groups

http://nickle.org/implement/html -- The Nickle site

Meanwhile, in perl6-language

Traffic was light. Very light.

Apocalypses and Exegeses

Alberto Manuel Brandão Simões, noting that the Apocalypses and Exegeses were subject to later modification, wondered if anyone had any idea when we'd have a freeze on the syntax and features for Perl 6. The obvious joke -- ``Sometime after Perl 5's syntax and features freeze'' -- was cracked. The consensus was that the syntax is currently 'slushy', and will probably firm up in the next 6-12 months. There was also some discussion of Perl 6 Essentials, and Nicholas Clark pointed us all at a new book that Alan Burlison had found.

http://groups.google.com/groups

http://bleaklow.com/blog/archive/000018.html

Acknowledgements, Announcements and Apologies

If anyone's following the 'moving to the North East' saga, last week's offer fell through, but that was actually a good thing as we're now buying my daughter's Tyneside flat instead, which does rather take some of the 'got to get everything sold by mid September' pressure off and replaces it with 'How on earth are we going to fit the contents of a 4 bedroomed house into a two bedroomed flat?' pressure.

In case you hadn't already spotted that I was impressed, much kudos goes to Michal Wallace for his sterling work on Pirate. Three weeks from concept to having a huge chunk of the language implemented is just amazing.

Check out http://pc1.bofhadsl.ftech.co.uk:8080/ for more of my writing.

As ever, if you've appreciated this summary, please consider one or more of the following options:

Perl Design Patterns, Part 3

This is the third (and final) article in a series which form one Perl programmer's response to the book Design Patterns (also known as the Gang of Four book or simply as GoF, because four authors wrote it). As I showed in the second article, Perl provides the types needed to implement many patterns. The Strategy and Template Method patterns can be implemented with code references. Builder usually builds a structure based on references to some combination of hashes and lists. Interpreters can be implemented with simple tools like split or with the king: Parse::RecDescent, which brings the best of yacc into your Perl script (albeit with somewhat less efficiency than yacc).

This article continues my treatment by considering patterns which rely on objects. As such, this article's patterns bear the most resemblance to the GoF book. Before presenting some patterns, I'll give you my two cents about object applicability.

When Are Objects Good?

As Larry Wall reminds us about all programming constructs, you should use objects when they make sense and not when they don't. So when do they make sense? This is partly a matter of taste. This subsection gives you my tastes.

It's easier to say when objects are bad, which they are in these cases:

  1. There is only data; the methods are either trivial or nonexistent. Data containers (also called nodes) are like this. For example, I should not need an object to return three numbers and a string to my caller.
  2. There are only methods. The Java Math class is like this. It won't even let you make a Math object. Clearly its methods should just be built-in functions of the language.

Seeing the poor uses of objects gives insight into their effective use. Use objects when complexity is high and data is tightly coupled to the methods which act on it. High complexity makes these chief advantages of objects more important: separate namespaces, inheritance, and polymorphism.

Now that I've said my piece, I'll go on to the patterns which use objects.

Abstract Factory

If you want to build platform independent programs, you need a way to access the underlying systems without having to recode for each one's API. This is where a factory comes into play. The source code asks for an instance of a class, and the class delivers a subclass instance suitable for use on the current platform. That class is called an abstract factory (or simply a factory). As we will see below, the platform might be a database. So the factory would return an object suitable for use with a particular database, but all the objects would have the same API.

To show the basic idea, here is an example which delivers one of two types. There are four code files in this example. The first two are the greeters.


    package Greet::Repeat;

    sub new {
        my $class    = shift;
        my $self     = {
            greeting => shift,
            repeat   => shift,
        };
        return bless $self, $class;
    }

    sub greet {
        my $self = shift;
        print $self->{greeting} x $self->{repeat};
    }

    1;

This greeter's constructor expects a greeting and a repeat count. It stores these in a hash, returning a blessed reference to it. When asked to greet, it prints the greeting repeatedly (hence the name). (I didn't say this example was practical, but it is small.)


    package Greet::Stamp;
    use strict; use warnings;

    sub new {
        my $class    = shift;
        my $greeting = shift;
        return bless \$greeting, $class;
    }

    sub greet {
        my $greeting = shift;
        my $stamp    = localtime();
        print "$stamp $$greeting";
    }

    1;

This greeter only expects a greeting string, so it blesses a reference to the one it receives. When asked to greet, it prints the current time followed by the greeting.

Here's the factory:


    package GreetFactory;
    use strict; use warnings;

    sub instantiate {
        my $factory        = shift;   # the factory's own class name (unused)
        my $requested_type = shift;
        my $location       = "Greet/$requested_type.pm";
        my $class          = "Greet::$requested_type";

        require $location;

        return $class->new(@_);
    }

    1;

A Perl factory looks a lot like factories in other languages. This one has only one method. It returns the requested type to the caller. It uses the caller's requested type as the name of the class to instantiate and as the name of the Perl module in which the class lives.

Finally, you can use this factory with a script like this:


    #!/usr/bin/perl
    use strict; use warnings;

    use GreetFactory;

    my $greeter_n = GreetFactory->instantiate("Repeat", "Hello\n", 3);
    $greeter_n->greet();

    my $greeter_stamp = GreetFactory->instantiate("Stamp", "Good-bye\n");
    $greeter_stamp->greet();

To make each greeter, call the instantiate method of GreetFactory, passing it the name of the class you want and any arguments that class's constructor is expecting.

This example shows you the basic idea. It is simple on purpose. But it does show how the factory can be ignorant of the underlying classes. Any new greeter added to the system must have a name of the form Greet::Name and be placed into a Greet subdirectory of an @INC path member as Name.pm. Then callers can use it without changing the factory. Now that you have seen a simple example, here is a more useful one.

The Perl DBI (DataBase Interface) provides an excellent example of a factory. Each call to DBI->connect expects the type of database and whatever information that database needs to establish a connection. This is a classic factory. It will load any DBD (DataBase Driver) you have installed on your system, upon request. Additional DBDs can be added at any time. Once they are installed, any client can use them through the same DBI API. Here's an example use of DBI:


    use DBI;
    my $dbh      =
        DBI->connect("dbi:mysql:mydb:localhost", "user", "password");
    ...
    my $sth      = $dbh->prepare('select * from table');
    ...

Once the database handle is obtained (which is usually called $dbh), it can be used almost without regard to the underlying engine. If you later move to Oracle, you would merely change the connect call. If a new database comes on the scene, some smart person in contact with Tim Bunce will implement a class for it. You can install and switch to it as soon as they finish their work. You might even be the implementer, but I doubt I will be.
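
Returning to the greeter factory for a moment, one refinement you might want is a check on the requested type before it is turned into a file name. The following is only a sketch under my own assumptions: SafeGreetFactory and Greet::Shout are hypothetical names, and the greeter is defined inline so the example runs as a single file.

```perl
#!/usr/bin/perl
use strict; use warnings;

package Greet::Shout;       # hypothetical greeter, defined inline for the demo

sub new   { my ($class, $greeting) = @_; return bless \$greeting, $class; }
sub greet { my $self = shift; return uc $$self; }

package SafeGreetFactory;   # hypothetical variant of GreetFactory

sub instantiate {
    my $factory        = shift;   # the factory's own class name (unused)
    my $requested_type = shift;

    # Accept only plain identifiers, so a caller cannot smuggle in a path.
    die "Bad greeter type '$requested_type'\n"
        unless $requested_type =~ /^[A-Za-z_]\w*$/;

    my $class = "Greet::$requested_type";

    # Load the module only if the class is not already available.
    unless ($class->can('new')) {
        require "Greet/$requested_type.pm";
    }

    return $class->new(@_);
}

package main;

my $greeter = SafeGreetFactory->instantiate("Shout", "hello");
print $greeter->greet(), "\n";
```

The factory still knows nothing about individual greeters; it only refuses names that could not possibly be class names.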

Composite

This section shows the fully OO Composite pattern. If you are interested in a simpler non-OO implementation, see the Builder pattern in Part 2 of this article series.

Many applications require hierarchies of related items linked into a tree by relationship. Most people have seen a hierarchy of this type: a directory structure. At the top is the root directory. In the simplest case it includes two types of items: files and subdirectories. Each subdirectory is like the root directory. Note that this definition of the structure is recursive, which is typical of composites.

One of the most popular examples of a composite structure today is an XML file. These files have a root element which contains various types of subelements, including tags and comments. Tags have attributes and some can contain subelements. This makes the classic composite tree. There are two important steps for a composite structure. The first is building it. The second is using it. We'll see simple examples of both here.

For the genuine pattern, there must be methods that act on both regular and composite elements (the elements with children are called composite elements). Invoking such a method on the root of a composite tree, or subtree, causes that root to do work on its own data AND to forward the request to its children. Those children do the same, collecting their own data and that of their children, until the bottom of the tree is reached. The return value is a collection of all this data.

For a practical example consider using the DOM model to process XML. (You may obtain the XML::DOM module from CPAN.) To find all the paragraphs in a document we could do something like this:


    use XML::DOM;
    my $parser = XML::DOM::Parser->new();
    my $doc    = $parser->parsefile("file.xml");
    foreach my $paragraph ($doc->getElementsByTagName("paragraph")) {
        print "<p>";
        foreach my $child ($paragraph->getChildNodes) {
            print $child->getNodeValue if ($child->getNodeType == TEXT_NODE);
        }
    }
    $doc->dispose();

The call to getElementsByTagName begins at the root (since I called it through $doc). The root returns any of its children which are paragraphs, but it also forwards the request to all of its tag elements asking them to return their paragraphs. They do the same.

An unrelated note: Notice that the above example ends with a call to dispose. XML::DOM composite structures have references from parents to children and from children to parents. We usually call these circular links. Perl 5 garbage collection cannot harvest such structures. We must call dispose to break the extra links so the structure's memory can be recovered. If you build structures with circular links, you must break those links yourself, otherwise your program will leak memory.
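
For structures you build yourself, one alternative to an explicit dispose call is to make the child-to-parent links weak, so they do not keep the parent alive. This is a different technique from the one XML::DOM uses; the sketch below assumes a Perl that ships Scalar::Util with weaken, and TreeNode is a hypothetical class invented for the demonstration.

```perl
#!/usr/bin/perl
use strict; use warnings;

my $destroyed = 0;    # counts DESTROY calls, for demonstration only

package TreeNode;     # hypothetical class, not part of XML::DOM
use Scalar::Util qw(weaken);

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, parent => undef, children => [] }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;   # strong link: parent owns child
    $child->{parent} = $self;
    weaken($child->{parent});              # weak link back up the tree
    return $child;
}

sub DESTROY { $destroyed++ }

package main;

{
    my $root = TreeNode->new("root");
    $root->add_child(TreeNode->new("leaf"));
}   # $root goes out of scope here, and both nodes can be reclaimed

print "nodes destroyed: $destroyed\n";
```

Because the only strong references run from parent to child, releasing the root releases the whole tree without any manual link breaking.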

We've seen how useful a well crafted composite structure can be, but how could we build one for ourselves? The objects in the structure must all respond to the methods meant to walk the composite. They may return undef immediately, but they must exist. Further, the version of those methods in the composite objects (the ones which can have children), must take care to pass the message along to their children.

To make this concrete, consider a non-binary tree (as we have been doing all along). Suppose we want to know how many nodes are in the tree. We can ask the root to count_nodes. It should count itself and add that to the sum of count_nodes calls to each child. Nodes which are not composite (i.e. have no children) return one. Composite nodes return one plus the sums from their children. The code follows.

There are four pieces of code: (1) A base class for tree nodes: Node.pm, (2) A class for nodes that could have children: Composite.pm, (3) A class for nodes that can't have children: Regular.pm, and (4) a driver to demonstrate that the system works: comp. I'll show these one at a time, in the order listed above.


    package Node;
    use strict; use warnings;

    sub count_nodes {
        my $self       = shift;
        my $class_name = ref $self;
        die "$class_name does not implement count_nodes\n";
    }

    1;

The only method here is count_nodes. This serves as an implementation requirement (also called an abstract method). Attempting to use a Node subclass which doesn't provide count_nodes results in a fatal run-time error. Every subclass should have an appropriate test to make sure this error never happens to users.
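
One way to write such a test, sketched here under my own assumptions, is to ask whether a subclass resolves count_nodes to something other than the placeholder in Node. The subclasses Good and Forgetful and the helper implements_count_nodes are hypothetical names made up for this example.

```perl
#!/usr/bin/perl
use strict; use warnings;

package Node;

sub count_nodes {
    my $self       = shift;
    my $class_name = ref $self;
    die "$class_name does not implement count_nodes\n";
}

package Good;        # hypothetical subclass that overrides the method
our @ISA = ('Node');
sub count_nodes { return 1; }

package Forgetful;   # hypothetical subclass that forgot to override it
our @ISA = ('Node');

package main;

# A subclass implements the method if can() finds a different sub
# than the placeholder inherited from Node.
sub implements_count_nodes {
    my $class = shift;
    return $class->can('count_nodes') != Node->can('count_nodes');
}

foreach my $class (qw(Good Forgetful)) {
    my $verdict = implements_count_nodes($class) ? "ok" : "missing count_nodes";
    print "$class: $verdict\n";
}
```

A check like this can run in a test suite, so the fatal error is caught before any user sees it.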


    package Regular;

    use Node;
    @ISA = qw(Node);

    use strict; use warnings;

    sub new {
        my $class = shift;
        my $name  = shift;
        return bless \$name, $class;
    }

    sub count_nodes {
        return 1;
    }

    1;

Regular nodes are blessed references to their names. They always count as a single node. (An unrelated note: it is sometimes convenient to turn on strict after the preamble of a package. Here that lets me use @ISA without qualifying it.)


    package Composite;

    use Node;
    @ISA = qw(Node);

    use strict; use warnings;

    sub new {
        my $class = shift;
        my $name  = shift;
        my $self  = { name => $name, children => [] };
        return bless $self, $class;
    }

    sub add_child {
        my $self      = shift;
        my $new_child = shift;

        push @{$self->{children}}, $new_child;
        return $new_child;
    }

    sub count_nodes {
        my $self  = shift;
        my $count = 1;

        foreach my $child (@{$self->{children}}) {
            $count += $child->count_nodes();
        }
        return $count;
    }

    1;

This class is similar to Regular, but it needs a way to keep track of children. Since it also keeps its name, I used a hash for the object type. New children are just pushed onto a list. Counting includes one for the parent node, plus the total for each child. Since leaves of the tree also implement count_nodes, we can process all Node types together. This is the polymorphism advantage of objects and the heart of the Composite Pattern.


    #!/usr/bin/perl
    use strict; use warnings;

    use Composite;
    use Regular;

    my $root     = Composite->new("Root");

    my $eldest   = $root->add_child(Composite->new("Jim"));
    my $middle   = $root->add_child(Composite->new("Jane"));
                   $root->add_child(Regular->new("Bob"));
    my $youngest = $root->add_child(Composite->new("Joe"));

                   $eldest->add_child(Regular->new("JII"));
    my $kayla    = $eldest->add_child(Composite->new("Kayla"));
                   $kayla->add_child(Regular->new("Max"));

    my $count = $root->count_nodes();

    print "count: $count\n";

This contrived example manually builds a simple tree, then asks for a node count. The correct answer is 8.
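
Counting is only one possible traversal. The same forwarding idea can collect data from every node, as described earlier for the genuine pattern. Here is a compact single-file sketch; Leaf and Branch are hypothetical stand-ins for Regular and Composite, and all_names is a method name I made up.

```perl
#!/usr/bin/perl
use strict; use warnings;

package Leaf;      # hypothetical stand-in for Regular

sub new       { my ($class, $name) = @_; return bless \$name, $class; }
sub all_names { my $self = shift; return ($$self); }

package Branch;    # hypothetical stand-in for Composite

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, children => [] }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    push @{ $self->{children} }, $child;
    return $child;
}

sub all_names {
    my $self = shift;
    # Report this node's own data, then forward the request to each child.
    return ($self->{name}, map { $_->all_names() } @{ $self->{children} });
}

package main;

my $root = Branch->new("Root");
my $jim  = $root->add_child(Branch->new("Jim"));
$root->add_child(Leaf->new("Bob"));
$jim->add_child(Leaf->new("Max"));

my @names = $root->all_names();
print "@names\n";    # depth-first order: Root Jim Max Bob
```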

Proxy

In GoF the proxy pattern example shows a way to delay loading expensive components until the user actually wants them. In the course of the example they show a genuine proxy. Proxies refer all requests to some other object. Think of it like an intermediary for the mob. You make your request to your local thug, as if he could do the thing himself. He passes that on to someone else you never meet who actually does the job. (Note to John Ashcroft: I am only imagining this process, having NO personal experience with it. Honest.)

Suppose an application could use several large files, but usually only needs one or two. Instead of reading all these files, I will delay loading the file until the caller wants to see it. The usual warning applies: this is contrived to explain the concept.

Here is the class that actually stores and prints the files:


    package File;
    use strict; use warnings;

    sub new {
        my $class = shift;
        my $file  = shift;
        open my $fh, '<', $file or die "Couldn't read $file: $!\n";
        my @data  = <$fh>;
        close $fh;
        return bless \@data, $class;
    }

    sub print_file {
        my $data = shift;
        print @$data;
    }

    sub DESTROY { }

    1;

When the File constructor is called, it reads the file into an array for later use, returning a blessed reference to the data to the caller. When asked to print, it sends the data to the currently selected output handle (usually standard out).

The DESTROY subroutine is called by Perl when the last reference to an object is about to go away. This allows us to perform clean-up which is guaranteed to happen. In this case, there is no necessary clean-up, but the approach I'm about to show for the proxy class ends up calling this method explicitly. That explicit call offends Perl so much that it complains to the screen. To avoid the warning, I included the stub.

There is nothing special about the File class shown above. The proxy follows.


    package FileProxy;
    use strict; use warnings;

    use File;

    sub new {
        my $class = shift;
        my $self  = {
            params         => \@_,
            wrapped_object => undef,
        };
        return bless $self, $class;
    }

    sub AUTOLOAD {
        my $self    = shift;
        my $command = our $AUTOLOAD;
        $command    =~ s/.*://;

        unless (defined $self->{wrapped_object}) {
            $self->{wrapped_object} = File->new(@{$self->{params}});
        }
        $self->{wrapped_object}->$command(@_);
    }

    1;

The constructor for the proxy takes the things necessary to build an actual File object (namely the file name) and stores them as its params attribute. The other attribute will eventually hold the wrapped File object. The attributes are stored in a hash; a reference to that hash is blessed and returned to the caller.

Whenever Perl has nowhere else to go with a method call, it calls AUTOLOAD (if there is one). So the AUTOLOAD in FileProxy handles every request except new, which appears explicitly. That includes DESTROY: when a proxy is destroyed, Perl looks for a DESTROY method in FileProxy, finds none, and falls through to AUTOLOAD, which forwards the call to the wrapped File object (hence the stub in File). AUTOLOAD is all caps to remind us that Perl calls it for us. While making this call, Perl sets the package global variable $AUTOLOAD to the name of the method the caller invoked. The regular expression strips off the package names from $AUTOLOAD, leaving only the method name.

If the object is not yet defined, AUTOLOAD calls File->new passing it the arguments stored during construction. After that, the object is defined, so AUTOLOAD calls the requested method on the wrapped object. The beauty of this mechanism is that the FileProxy class only knows that the constructor is called new. It does not need to change as changes to File.pm are made. Any errors, such as no such method, will be fatal as usual.
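
One consequence of routing everything through AUTOLOAD is that destroying a proxy nobody ever used would still build the wrapped object, just to call DESTROY on it. A common guard is to return early when the requested method is DESTROY. The sketch below is a variant under my own assumptions, not the FileProxy class itself: Expensive and LazyProxy are hypothetical names, and $loads exists only to count constructions.

```perl
#!/usr/bin/perl
use strict; use warnings;

my $loads = 0;         # counts constructions, for demonstration only

package Expensive;     # hypothetical stand-in for File

sub new    { my ($class, @args) = @_; $loads++; return bless [@args], $class; }
sub report { my $self = shift; return "loaded: @$self"; }

package LazyProxy;     # hypothetical variant of FileProxy

sub new {
    my $class = shift;
    return bless { params => [@_], wrapped_object => undef }, $class;
}

sub AUTOLOAD {
    my $self    = shift;
    my $command = our $AUTOLOAD;
    $command    =~ s/.*://;

    # Never build the wrapped object just to destroy the proxy.
    return if $command eq 'DESTROY';

    unless (defined $self->{wrapped_object}) {
        $self->{wrapped_object} = Expensive->new(@{ $self->{params} });
    }
    return $self->{wrapped_object}->$command(@_);
}

package main;

{
    my $unused = LazyProxy->new("never touched");
}   # destroyed here without constructing an Expensive object

my $used = LazyProxy->new("art1");
print $used->report(), "\n";
print "constructions: $loads\n";
```

With the guard in place the wrapped class no longer needs a DESTROY stub on the proxy's account, though it may still want one for its own clean-up.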

To use this proxied scheme we might employ a caller like this:


    #!/usr/bin/perl
    use strict; use warnings;

    use FileProxy;

    my $file1 = FileProxy->new("art1");
    my $file2 = FileProxy->new("art2");

    $file1->print_file();
    $file1->print_file();
    $file2->print_file();

With a couple of changes we could use this for any class. Here's the new generic version:


    package DelayLoad;
    use strict; use warnings;

    our %proxied_classes;

    sub import {
        shift;  # discard class name

        %proxied_classes = @_;

        foreach my $class (keys %proxied_classes) {
            require "$class.pm";
        }
    }

    sub new {
        my $class = shift;
        my $self  = {
            type           => shift,
            constructor    => shift,
            params         => \@_,
            wrapped_object => undef,
        };
        return bless $self, $class;
    }

    sub AUTOLOAD {
        my $self    = shift;
        my $command = our $AUTOLOAD;
        $command    =~ s/.*://;

        if ($proxied_classes{$command}) {
            return $self->new($command, $proxied_classes{$command}, @_);
        }
        else {
            unless (defined $self->{wrapped_object}) {
                my $proxied_class       = $self->{type};
                my $constructor         = $self->{constructor};
                $self->{wrapped_object} = $proxied_class
                                        ->$constructor(@{$self->{params}});
            }
            $self->{wrapped_object}->$command(@_);
        }
    }

    1;

The first change is cosmetic: the name now reflects the nature of the proxy. Other changes include a new method: import. Even though its name is lower case, Perl calls it whenever the caller says use DelayLoad (see below). It does two things. First, it stores the name of each proxied class in the %proxied_classes package global. Second, it requires each module. require is like use, but it happens at run time instead of compile time. (use also imports symbols, but then your object oriented module shouldn't be exporting anything anyway.)

The constructor now stores a bit more information. In addition to saving room for the wrapped object and storing the params, it also records the name of the class and of that class's constructor. These will be used in AUTOLOAD.

The only other changes are in the AUTOLOAD method. There are two changes. The easiest one is to look up the class and constructor names in the DelayLoad object instead of just calling File->new.

The other change is used during construction. My explanation of it will make more sense, if you see the new caller first.

The new version requires a couple of changes to the caller. One change is on the use line which becomes:


    use DelayLoad "File" => "new";

This uses DelayLoad, tells it we want to be able to delay loads for File objects, and that File's constructor is called new.

The other change is in how we construct the delayed object:


    my $file1 = DelayLoad->File("art1");
    my $file2 = DelayLoad->File("art2");

This explains the unexplained piece in AUTOLOAD above. When the user calls the File method, AUTOLOAD notices that this ``method'' is really the name of a delay loaded class. When the if in AUTOLOAD is true (i.e. the method is really a key in %proxied_classes), the caller is given a new DelayLoad object primed for later use. When the if fails, DelayLoad works like FileProxy: it constructs the object if needed, then calls the requested method.

The fundamental point of this example is that Perl allows us to implement proxies without knowing very much about the underlying class. In this case, import receives the necessary information from the caller, and AUTOLOAD takes care of the rest. Making the caller work is not always a good idea. Here it makes sense. If she knows she wants to delay loading objects until they are really needed, she must at least know the API for those objects. In the API is the name of the constructor, which she mentions in the use statement so Perl can pass it to DelayLoad::import for her.

Keep in mind that AUTOLOAD is not designed for this sort of work. Its real purpose in life is to load subroutines on demand for the current package. It can't do that here, since changing the subroutines affects all instances of a class. Here we are AUTOLOADing data, not routines. By suitably adjusting import and AUTOLOAD, you can make the proxy do many other things.

Summary

In this article, I have finally shown object oriented patterns. We saw how to implement a Factory so our callers can choose their favorite driver, how to build composite structures and routines that traverse them (without explicit first_child and next pointers that would be needed in languages without quality built-in lists), and how to stand as a proxy between a caller and a class with import and AUTOLOAD.

Author's Note

This is the final article in this series, but look for a book, Design Patterns in Perl from Apress at your favorite bookseller in the near future.

Perl Design Patterns, Part 2

This is the second in a series of articles which form one Perl programmer's response to the book, Design Patterns (also known as the Gang of Four book or simply as GoF, because four authors wrote it).

As I showed in the first article, Perl provides the best patterns in its core and many others are modules which ship with Perl or are available from CPAN. There I considered Iterator (foreach), Decorator (pipes and list filters), Flyweight (Memoize.pm), and Singleton (bless an object in a BEGIN block).

People into patterns often talk about how knowing patterns makes describing designs easier. The parenthetical comments in the last sentence show how Perl takes this to new heights by including the patterns internally.

This article continues my treatment by considering patterns which rely on data containers and/or code references (which are also called callbacks). Before showing the patterns let me explain these terms.

Data Containers

I use data containers to mean any reference that holds a data structure. Arrays and hashes are common data containers, but hashes of lists of hashes storing things are more interesting. Careful use of these structure containers can often eliminate the need for objects.

Here's a concrete example. Suppose I want a phone list. I might use a container like this:


    my $phone_list = {
        'Phil' => [
            { type => 'home',  number => '555-0001' },
            { type => 'pager', number => '555-1000' },
        ],
        'Frank' => [
            { type => 'cell',  number => '555-9012' },
            { type => 'pager', number => '555-5678' },
            { type => 'home',  number => '555-1234' },
        ],
    };

This container is housed in a hash. Its keys are names; its values are lists of phone numbers, each stored as a small hash holding a type and a number. The numbers are listed in the order the person would like them used. (Call Frank on his cell phone first, then try his pager. If all else fails, use his home phone.)

To use this structure I might do the following:


    #!/usr/bin/perl
    use strict; use warnings;

    my $phone_list = {
        'Phil' => [
            { type => 'home',  number => '555-0001' },
            { type => 'pager', number => '555-1000' },
        ],
        'Frank' => [
            { type => 'cell',  number => '555-9012' },
            { type => 'pager', number => '555-5678' },
            { type => 'home',  number => '555-1234' },
        ],
    };

    my $person = shift or die "usage: $0 person\n";

    foreach my $number (@{$phone_list->{$person}}) {
        print "$number->{type} $number->{number}\n";
    }

Here, the user supplies the name of the person he or she wants to reach as a command line argument which I store as $person. I then loop through all the phone numbers for that person, printing the type and number.

Of course, in practice your data would live outside your script. The example just shows what one data container can hold.
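
To show that the container stays easy to work with as it changes, here is a small sketch that adds a number and then reports each person's preferred contact. The added cell number is made up for the example, and the summary format is my own choice.

```perl
#!/usr/bin/perl
use strict; use warnings;

my $phone_list = {
    'Phil'  => [
        { type => 'home',  number => '555-0001' },
        { type => 'pager', number => '555-1000' },
    ],
    'Frank' => [
        { type => 'cell',  number => '555-9012' },
        { type => 'pager', number => '555-5678' },
        { type => 'home',  number => '555-1234' },
    ],
};

# Adding a number is a push onto the person's list (this number is made up).
push @{ $phone_list->{Phil} }, { type => 'cell', number => '555-2000' };

# The first entry in each list is the preferred way to reach that person.
my @summary;
foreach my $person (sort keys %$phone_list) {
    my $numbers   = $phone_list->{$person};
    my $preferred = $numbers->[0];
    push @summary,
        "$person: try $preferred->{type} first ($preferred->{number}), "
      . scalar(@$numbers) . " numbers on file";
}
print "$_\n" for @summary;
```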

If you need to use a structure made of data nodes, you can often avoid the need for a node object by using a data container instead. Object Oriented programming proponents would probably want me to make an object for each person. In that object they might even want me to store an object for each phone type in some cumbersome list container. My advice: don't give in to pedants. Even in Java, I can build a structure like the one above (though not as easily). Doing so is often wise. Objects work better in more complex situations.

What's a Code Reference?

A code reference is like any reference in Perl, but what it points to is a subroutine you can call. For instance, I could write:


    my $doubler = sub { return 2 * $_[0]; };

Then later in my program I would call that routine as:


    my $doubled = &$doubler(5);  # $doubled is now 10

This example is contrived. But it lets you see the basic syntax of code references. If you assign a sub to a variable, you receive a code reference by the grace of Perl. To call the sub stored in the reference put an & in front of the variable which stores it. This is just what we do for other references, as in this standard hash walker:


    foreach my $key (keys %$hash_reference) { ... }

The & is the sigil (or funny character) for subroutines, just like @ and % are the sigils for arrays and hashes.
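
Perl also accepts an arrow form for calling through a code reference, which some people find easier to read. The sketch below shows the equivalent call styles side by side and then stores code references in a hash; the operation names in that hash are made up for the example.

```perl
#!/usr/bin/perl
use strict; use warnings;

my $doubler = sub { return 2 * $_[0]; };

# Three equivalent ways to call through a code reference.
my $with_sigil  = &$doubler(5);
my $with_braces = &{$doubler}(5);
my $with_arrow  = $doubler->(5);

# Code references are ordinary scalars, so they can live in a hash.
my %operation = (
    double => $doubler,
    square => sub { return $_[0] * $_[0]; },
);

my %result;
foreach my $name (sort keys %operation) {
    $result{$name} = $operation{$name}->(7);
    print "$name(7) = $result{$name}\n";
}
```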

Many patterns in GoF and outside it can be implemented well in Perl with code references. Languages which don't provide code references are missing an important type.

Having explained these tools, I'm ready to show you some patterns which use them.

Strategy

When you want to select from a series of choices for how something should be done, you need a strategy scheme. For example, you might want to sort based on a comparison function. Each time you sort, you should be able to specify the order strategy.

Since Perl has code references, we can easily implement the strategy pattern without bloating our code base with a proliferation of classes whose sole purpose is to provide one function.

Here's an example with the built-in sort:


    sort { lc($a) cmp lc($b) } @items

This sorts without regard to case. Notice how sort is receiving the function directly in the call. Though we could do this for our own functions, it is more common to take a reference to the function as a required positional parameter.
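
As a sketch of that more common style, here is a routine of our own that takes the comparison strategy as its first positional parameter. The name sorted_by is hypothetical, and the callbacks receive the two items as ordinary arguments rather than through $a and $b.

```perl
#!/usr/bin/perl
use strict; use warnings;

# sorted_by is a hypothetical helper: the comparison strategy comes first,
# followed by the items to sort.
sub sorted_by {
    my $compare = shift;
    return sort { $compare->($a, $b) } @_;
}

my @items = qw(pear Apple banana);

my @by_name   = sorted_by(sub { lc($_[0]) cmp lc($_[1]) }, @items);
my @by_length = sorted_by(sub { length($_[0]) <=> length($_[1]) }, @items);

print "@by_name\n";      # Apple banana pear
print "@by_length\n";    # pear Apple banana
```

The sorting routine never changes; only the strategy passed to it does.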

Suppose, for example, that we want to list all files in the current directory, or any of its subdirectories, with some property. There are two pieces to this task: (1) Scan down the directory tree for all the entries, and (2) Test each file to see if it meets the criterion. Ideally we would like to separate these tasks so we can reuse them independently (for instance scanning a directory tree is more common than any particular criterion). We will make the criterion a strategy executed by the directory scanner.


    #!/usr/bin/perl
    use strict; use warnings;

    my @files = find_files(\&is_hidden, ".");
    local $" = "\n";
    print "@files\n";

    sub is_hidden {
        my $file = shift;
        $file    =~ s!.*/!!;
        return 1 if ($file =~ /^\./);
        return 0;
    }

    sub find_files {
        my $callback = shift;
        my $path     = shift;
        my @retval;

        push   @retval, $path if &$callback($path);
        return @retval unless (-d $path);

        # identify children
        opendir DIR, $path or return @retval;
        my @files = readdir DIR;
        closedir DIR;

        # visit each child
        foreach my $file (@files) {
            next if ($file =~ /^\.\.?$/);  # skip . and ..
            push @retval, find_files($callback, "$path/$file");
        }

        return @retval;
    }

To understand this example, start with the initial call to find_files. It passes two arguments. The first is a code reference. Note the syntax. As I pointed out in the introduction, to let Perl know I mean a subroutine, I put the & sigil in front of is_hidden. To make a reference to that routine (instead of calling it immediately), I put a backslash in front, just as I would to take any other kind of reference.

When I use the callback in find_files, $callback has the reference to the code. To dereference it I put the & sigil in front of it.

The find_files subroutine takes a path where the search begins and a code reference called $callback. At each invocation, it stores the path in the return list, if callback returns true for that path. This allows you to reuse find_files for many applications, changing only the callback subroutine to change the outcome. This is the strategy pattern, but without the hassle of subclassing the find_files abstract base class and overriding the criterion method.

In find_files, I use recursion to descend the directory tree and its subtrees. First, I call the callback to see if the current path should go into the output. Then the real routine begins. What the callback does makes no difference to this routine. Any true or false value is OK with find_files.

The recursion stops if the file is not a directory. At that point the list is immediately returned. (It could be empty or have the current path in it, depending on the callback's return value.) Otherwise, all the files and subdirectories in the current path are read into @files. Each of those entries is scanned by the recursive call to find_files (unless the file is . or .., which would create endless recursion). Whatever the recursive call to find_files returns, it is pushed onto the end of the final output. When all children have been visited, @retval is returned to the caller.

The CPAN module File::Find robustly solves the problem approached quickly in my example above. It relies on exactly this kind of function callback.
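
For comparison, here is a rough sketch of the same job done with File::Find. The temporary directory and the file names in it are invented so the script can run anywhere; only find, $_, and $File::Find::name come from the module itself.

```perl
use strict; use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree so the sketch is runnable anywhere.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir failed: $!";
for my $name ("visible.txt", ".hidden", "sub/inner.txt") {
    open my $fh, '>', "$top/$name" or die "Couldn't create $name: $!";
    close $fh;
}

my @found;
find(
    sub {
        # $_ holds the bare entry name, $File::Find::name the full path.
        return if /^\./;    # skip hidden entries and the starting "."
        push @found, $File::Find::name;
    },
    $top,
);

print "$_\n" for sort @found;
```

The anonymous subroutine plays the role of my is_hidden callback. File::Find does the walking and calls it once per entry.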

The Strategy Pattern uses a callback to perform a single task that varies from use to use. The next pattern uses a series of callbacks to implement the steps of an algorithm.

Template Method

In some calculations the steps are known, but what the steps do is not. For example, computing charges for a rental might involve three steps:

  1. Calculate amount due from rates.
  2. Calculate taxes.
  3. Add these together.

Yet different rentals might have different schemes for calculating the amount due from rates, and different jurisdictions usually have different tax schemes. A template method can implement the outline, deferring to callers for the individual schemes.


    package Calc;
    use strict; use warnings;

    sub calculate {
        my $class     = shift;   # discarded
        my $data      = shift;
        my $rate_calc = shift;   # a code ref
        my $tax_calc  = shift;   # also a code ref

        my $rate      = &$rate_calc($data);
        my $taxes     = &$tax_calc($data, $rate);
        return $rate + $taxes;
    }

Here the caller supplies a data reference (probably to a hash or object) together with two code references which are used as callbacks. Each callback must expect the data reference as its first parameter. The tax_calc code reference also receives the amount due from the rate calculator. This allows it to use a percentage of the amount together with information in the data reference.

A caller might look like this:


    #!/usr/bin/perl
    use strict; use warnings;

    use Calc;

    my $rental = {
        days_used    => 5,
        day_rate     => 19.95,
        tax_rate     => .13,
    };

    my $amount_owed = Calc->calculate($rental, \&rate_calc, \&taxes);
    print "You owe $amount_owed\n";

    sub rate_calc {
        my $data = shift;
        return $data->{days_used} * $data->{day_rate};
    }

    sub taxes {
        my $data     = shift;  # supplies tax_rate
        my $subtotal = shift;

        return $data->{tax_rate} * $subtotal;
    }

I made this contrived caller so you can see the calling sequence. The data here is a simple hash. To save exporting from Calc, I made calculate a class method, so I call it through its class. In the call, I pass a reference to my data hash and references to the two calculation routines.

This can be made more complex if you like. One could even make a full-blown class hierarchy of calculators, allowing callers to select the one they want. This example is about as simple as I could make the template method pattern.
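
As one small illustration of swapping schemes, here is a sketch that passes anonymous subroutines instead of named ones. To keep it in one file, calculate is written as a plain function rather than a Calc class method, and the flat tax of 10 is an invented scheme.

```perl
use strict; use warnings;

# Same outline as Calc::calculate, written as a plain function so the
# sketch fits in one file.
sub calculate {
    my ($data, $rate_calc, $tax_calc) = @_;
    my $rate  = $rate_calc->($data);
    my $taxes = $tax_calc->($data, $rate);
    return $rate + $taxes;
}

my $rental = { days_used => 4, day_rate => 25 };

# Hypothetical scheme: the usual daily rate, but a flat tax of 10.
my $total = calculate(
    $rental,
    sub { my $d = shift; return $d->{days_used} * $d->{day_rate} },
    sub { return 10 },
);

print "You owe $total\n";
```

The template never changes. Only the two code references passed by the caller do.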

Another approach to templates is to have the caller place methods in the template package. This approach amounts to an implementation of mixins a la Ruby. Here's a sample that is more object oriented.


    package Calc;

    sub calculate {
        my $self = shift;
        my $rate = $self->calculate_rate();
        my $tax  = $self->calculate_tax($rate);
        return $rate + $tax;
    }

    1;

The whole module is really only the template method. To use it, you have to code calculate_rate and calculate_tax methods, or your script will die. Here's a particular implementation of the scheme:


    package CalcDaily;
    package Calc;
    use strict; use warnings;

    sub new {
        my $class = shift;
        my $self  = {
            days_used    => shift,
            day_rate     => shift,
            tax_rate     => shift,
        };
        return bless $self, $class;
    }

    sub calculate_rate {
        my $data = shift;
        return $data->{days_used} * $data->{day_rate};
    }

    sub calculate_tax {
        my $data     = shift;  # the object; supplies tax_rate
        my $subtotal = shift;

        return $data->{tax_rate} * $subtotal;
    }

    1;

Note that I added a constructor and two methods to the Calc package in a different source file. This is perfectly legal and occasionally useful. By doing this, the template is totally isolated. It doesn't even know what sort of data will be stored in the objects of its own type. That does mean that only one Calc subtype can be used at a time. If that's a problem for you, do the standard thing: have Calc call methods on objects in some separate hierarchy.
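
A rough sketch of that standard thing follows. The Scheme::Daily and Scheme::FlatFee package names and the fee field are hypothetical; the point is only that Calc keeps the outline while objects from a separate set of classes supply the steps, so several schemes can coexist.

```perl
use strict; use warnings;

package Calc;

# The template method: it only knows the outline, and asks the scheme
# object it is handed for each step.
sub calculate {
    my ($class, $scheme) = @_;
    my $rate = $scheme->calculate_rate();
    my $tax  = $scheme->calculate_tax($rate);
    return $rate + $tax;
}

package Scheme::Daily;      # hypothetical name

sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub calculate_rate { my $self = shift; return $self->{days_used} * $self->{day_rate} }
sub calculate_tax  { my ($self, $subtotal) = @_; return $self->{tax_rate} * $subtotal }

package Scheme::FlatFee;    # hypothetical name

sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub calculate_rate { my $self = shift; return $self->{fee} }
sub calculate_tax  { return 0 }

package main;

my $daily_total = Calc->calculate(
    Scheme::Daily->new(days_used => 4, day_rate => 25, tax_rate => 0.5));
my $flat_total  = Calc->calculate(Scheme::FlatFee->new(fee => 40));

print "Daily: $daily_total\nFlat: $flat_total\n";
```

This costs a little more typing than stuffing methods into Calc, but nothing outside Calc touches Calc's namespace.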

There are two package statements at the top of CalcDaily.pm; this is on purpose. The first one tells people (and crawlers) that this is the CalcDaily package which rightfully belongs in CalcDaily.pm, not the original Calc, which belongs in Calc.pm.

Finally, here's the caller, which is only slightly modified:


    #!/usr/bin/perl
    use strict; use warnings;
    use Calc;
    use CalcDaily;

    my $rental      = Calc->new(5, 19.95, .13);
    my $amount_owed = $rental->calculate();
    print "You owe $amount_owed\n";

This technique is similar to the one used in the debugger architecture for Perl. To make my own debugger, I need a name for it. I might choose PhilDebug.pm. Then I have to make a file with that name in a Devel directory which is in my @INC list. The first line in the file should be (but doesn't have to be):


    package Devel::PhilDebug;

This allows the CPAN indexer to properly catalog my module.

The base package for debuggers is fixed as DB. Perl expects to call the DB function in that package. So all together it might look something like this:


    package Devel::PhilDebug;
    package DB;

    sub DB {
        my @info = caller(0);
        print "@info\n";
    }

    1;

Any script will use this debugger if it is invoked as:


    perl -d:PhilDebug script

Each time the debugger notices that a new statement is about to start, it first calls DB::DB. This is a very powerful example of plug-and-play.

It is not usually wise to pollute foreign classes with your own code. Yet, Perl permits this, because it is sometimes highly useful. There seems to be a theme here:

Don't rule out dangerous things. Just avoid them, unless you have a good reason to use them.

The Strategy and Template patterns use code references to allow the caller to adjust the behavior of an algorithm. The template I showed used a data container to hold rental information. The next pattern makes more use of data containers.

Builder

Many structures external to your program should be represented with composites (like trees or the data container in the introduction) inside your program. There are two fundamentally different ways to represent these structures. For an object-oriented way to compose such structures see the Composite Pattern in GoF (which I will discuss in my next article).

Here we'll look at how to build a composite structure in a hash of hashes. You might rather build the object-oriented version. Which you choose should depend on the complexity of the data and the methods to act on it. If data and methods are simple, you should probably use the hash structure. It will be faster, have built-in support, and be more familiar to Perl programmers who might need to maintain your code. If the complexities are large, you should use full-blown objects. They make the structure easier to understand for object-oriented programmers and provide more code-based documentation than simple hashes.

So, hashes are superior structures for simple to moderately complex data. To see how to build a hash structure consider an example: visualizing an outline. For simplicity, I'll represent the outline purely through indentation (not with Roman or other numerals). Here's an example outline:


    Grocery Store
        Milk
        Juice
        Butcher
            Thin sliced ham
            Chuck roast
        Cheese
    Cleaners
    Home Center
        Door
        Lock
        Shims

This outline describes a theoretical shopping trip. I want to represent it internally in my program so I can play with it. (One of my favorite games is turning outlines into pictures, see below.)

Instead of a full-blown object, I'll use a little hash-based data container for each node in the tree. Each node will keep track of three things:

  1. Name
  2. Level
  3. Children (a list of other nodes)

To keep track of who is a child of whom, I'll use a stack of these nodes. The node on the top of the stack is usually the parent of the next line of input. To show my method, I'll intersperse comments with the script. At the bottom of this section the script appears in one piece.


    #!/usr/bin/perl
    use strict; use warnings;

These lines are always a good idea.


    my $root = {
        name     => "ROOT",
        level    => -1,
        children => [],
    };

This is the root node. It's a hash reference containing the three keys mentioned earlier. The root node is special. Since it isn't in the file, I give it an artificial name and a level that is lower than anyone else's. (In a moment, we will see that levels in the input will be zero or positive.) Initially the list of children is empty.


    my @stack;
    push @stack, $root;

The stack will keep track of the ancestry of each new node. For starters it needs the root node, which won't ever be popped, because it is an ancestor of all the nodes.


    while (<>) {
        /^(\s*)(.*)/;
        my $indentation = length $1;
        my $name        = $2;

To read the file, I chose a magic while. For each line there will be two parts: the indentation (the leading spaces) and the name (the rest of the line). The regular expression captures any leading space into $1 and everything else (except the newline) into $2. The length of the indentation is the important part: the bigger it is, the more ancestors the node has. Lines starting at the margin have an indentation of 0 (which is why the ROOT has a level of -1).


        while ($indentation <= $stack[-1]{level}) {
            pop @stack;
        }

This loop handles ancestry. It pops the stack, until the node on top of the stack is the parent of the new node. Think of an example. When Home Center comes along, Cleaners and ROOT are on the stack. Home Center's level is 0 (it's at the margin), so is Cleaners'. Thus, Cleaners is popped (since 0 <= 0). Then only ROOT remains, so popping stops (0 is not <= -1).


        my $node = {
            name     => $name,
            level    => $indentation,
            children => [],
        };

This builds a new node for the current line. Its name and level are set. We haven't seen any children yet, but I make room for them in an empty list.


        push @{$stack[-1]{children}}, $node;

This line adds the new node to its parent's list of children. Remember that the parent is sitting on top of the stack. The top of the stack is $stack[-1] or the last element in the array.


        push @stack, $node;
    }

This pushes the new node onto the stack, in case it has children. The closing brace ends the magic while loop. For simplicity, I chose to display the output with Data::Dumper:


    use Data::Dumper; print Dumper($root);

Running this shows the tree (sideways) on standard out.
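
If Data::Dumper's output is more than you want, a short recursive walk can turn the tree back into an outline. This is only a sketch: the hand-built tree below stands in for what the builder produces, and outline_lines is a name I made up.

```perl
use strict; use warnings;

# A tiny hand-built tree in the same shape the builder produces.
my $root = {
    name => "ROOT", level => -1, children => [
        { name => "Grocery Store", level => 0, children => [
            { name => "Milk", level => 4, children => [] },
        ] },
        { name => "Cleaners", level => 0, children => [] },
    ],
};

sub outline_lines {
    my ($node, $depth) = @_;
    my @lines;

    # ROOT is artificial, so only real nodes produce a line.
    my $is_real = $node->{level} >= 0;
    push @lines, ("    " x $depth) . $node->{name} if $is_real;

    # Children of a real node are indented one step further.
    my $child_depth = $is_real ? $depth + 1 : $depth;
    push @lines, outline_lines($_, $child_depth) for @{ $node->{children} };

    return @lines;
}

my @lines = outline_lines($root, 0);
print "$_\n" for @lines;
```

The walk uses only the children lists, so it works on any tree the builder makes, however deep.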

Here's the whole code without interruption:


    #!/usr/bin/perl
    use strict; use warnings;

    my $root = {
        name     => "ROOT",
        level    => -1,
        children => [],
    };

    my @stack;
    push @stack, $root;

    while (<>) {
        /^(\s*)(.*)/;
        my $indentation = length $1;
        my $name        = $2;
        while ($indentation <= $stack[-1]{level}) {
            pop @stack;
        }
        my $node = {
            name     => $name,
            level    => $indentation,
            children => [],
        };
        push @{$stack[-1]{children}}, $node;
        push @stack, $node;
    }

    use Data::Dumper; print Dumper($root);

I promised to explain how structures like the one above can be turned into pictures. The CPAN module UML::Sequence builds a structure similar to the one shown here. It then uses that to generate a UML Sequence diagram of the steps in SVG (Scalable Vector Graphics) format. That format can be converted with standard tools like Batik to PNG or JPEG. In practice the outlines which I turn into pictures represent call sequences for programs. Perl can even generate the outline by running the program. See UML::Sequence for more details.

When you have some interesting structured input, a builder might help make a good internal structure. One high value builder is XML::DOM. Another with a slightly different approach is XML::Twig. It is not coincidental that XML parsers are really builders, as XML files are non-binary trees.

Interpreter

If you haven't looked in GoF yet, start with the interpreter pattern. Laughter is good for the soul. The person who taught me patterns in Java did not even know why this pattern would not work in practice. He had heard it was somewhat slow, but he wasn't sure. Well I'm sure.

Luckily for us, Perl has alternatives. These range from quick and dirty to full blown. Here's the litany covered with examples below:

  • split
  • eval'ing Perl code
  • Config::Auto
  • Parse::RecDescent

Since we already have a language we like (that's Perl for those who haven't been paying attention), interpreting is limited to small languages that do something for us. Usually these turn out to be configuration files, so I will focus on those. (See the builder section above if a tree can represent your data file.)

Splitting

The easiest route involves split. Suppose I have a config file which uses variable=value settings. Comments and blanks should be ignored; all other lines should have a variable, value pair. That's easy:


    sub parse_config {
        my $file = shift;
        my %answer;

        open CONFIG, "$file" or die "Couldn't read config file $file: $!\n";
        while (<CONFIG>) {
            next if (/^#|^\s*$/);  # skip blanks and comments
            chomp;                 # keep the newline out of the value
            my ($variable, $value) = split /=/, $_, 2;  # split at the first = only
            $answer{$variable} = $value;
        }
        close CONFIG;

        return %answer;
    }

This subroutine expects a config file name. It opens and reads that file. Inside the magic while loop the regex rejects lines which start with '#' and those which contain only whitespace. All other lines are split on '='. The variables become keys in the %answer hash. When all the lines are read, the caller gets the hash back.
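
A slightly more forgiving variant is sketched below. It is not the routine above: parse_config_text is a made-up name, it takes the text as a string so the example needs no file, and it also trims whitespace around names and values.

```perl
use strict; use warnings;

sub parse_config_text {
    my $text = shift;
    my %answer;

    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;                # skip blanks and comments
        my ($variable, $value) = split /=/, $line, 2;  # split at the first = only
        next unless defined $value;                    # no = on this line; ignore it
        s/^\s+|\s+$//g for $variable, $value;          # trim surrounding whitespace
        $answer{$variable} = $value;
    }

    return %answer;
}

my %config = parse_config_text(<<'END');
# sample settings
name = projectdb

url=http://example.com/?a=b
END

print "$_ => $config{$_}\n" for sort keys %config;
```

The limit of 2 on split matters for values like the URL here, which contain an = of their own.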

You could go much further along these lines, but see below for those who've gone before you (see especially Config::Auto).

Evaluating Perl Code

My current favorite way to bring configuration information into a Perl program is to specify the config file in Perl. So, I might have a config file like this:


    our $db_name = "projectdb";
    our $db_pass = "my_special_password_no_one_will_think_of";
    our %personal = (
        name    => "Phil Crow",
        address => "philcrow2000@yahoo.com",
    );

To use this in a Perl program all I have to do is eval it:


    ...
    open CONFIG, "config.txt" or die "couldn't...\n";
    my $config = join "", <CONFIG>;
    close CONFIG;

    eval $config;
    die "Couldn't eval your config: $@\n" if $@;
    ...

To read the file, I open it, then use join to put the angle read operator in list context. This lets me bring the whole file into a scalar. Once it's in (and the file is closed for tidiness), I just eval the string I read. I need to check $@ to make sure the file was good Perl. After that, I'm ready to use the values just as if they appeared in the program originally.
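
A variation on the same idea uses Perl's built-in do FILE, which reads and evaluates a file in one step. In this sketch the config file ends with a hash reference instead of declaring our variables; that layout, the key names, and the temporary file are my assumptions for the example.

```perl
use strict; use warnings;
use File::Temp qw(tempfile);

# Write a throwaway config file whose last expression is a hash reference.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'END';
my %settings = (
    db_name => "projectdb",
    retries => 3,
);
\%settings;
END
close $fh or die "Couldn't close $path: $!\n";

# do FILE reads, compiles, and runs the file, returning its last value.
my $config = do $path;
die "Couldn't parse $path: $@\n" if $@;
die "Couldn't read $path: $!\n"  unless defined $config;

print "Database: $config->{db_name}, retries: $config->{retries}\n";
```

Keeping everything in one returned structure means the config file cannot quietly overwrite variables in the calling program, though it is still arbitrary Perl code and deserves the same trust as the program itself.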

Config::Auto -- For Those Who Can't be Bothered

If you're too lazy to write your own config handler, or if you have lots of configs outside your control, Config::Auto may be for you. Basically, it takes a file and guesses how to turn it into a config hash. (It can even guess the name of your config file). Using it is easy (if it works):


    #!/usr/bin/perl
    use strict; use warnings;

    use Config::Auto;

    my $config = Config::Auto::parse("your.config");
    ...

What ends up in $config depends on what your config file looks like (shock). For files which use variable=value pairs, you get what you expect, which is exactly what the first example above generates for the same input. It is possible to specify a config file that Config::Auto cannot understand (shock and amazement).

Real Hackers Use Parse::RecDescent

If the file you need to parse is complex, consider Parse::RecDescent. It implements a clever top/down parser scheme. To use it, you specify a grammar. (You remember grammars, don't you? If not, see below.) It builds a parser from your grammar. You feed text to the parser. It does whatever the grammar specifies in its actions.

To give you a feel for how this works, I'll parse small Roman numerals. The program below takes numbers from the keyboard and translates them from Roman numerals to decimal integers, so XXIX becomes 29.


    #!/usr/bin/perl
    use strict; use warnings;

    use Parse::RecDescent;

    my $grammar = q{
        Numeral : TenList FiveList OneList /\Z/
                    { $item[1] + $item[2] + $item[3]; }
                | /quit/i { exit(0); }
                | <error>

        TenList : Ten(3)                  { 30            }
                | Ten(2) OptionalNine     { 20 + $item[2] }
                | Ten OptionalNine        { 10 + $item[2] }
                | OptionalNine            { $item[1]      }

        OptionalNine : One Ten { 9 }
                     |         { 0 }

        FiveList : One Five { 4 }
                 | Five     { 5 }
                 |          { 0 }

        OneList : /(I{0,3})/i { length $1 }

        Ten : /X/i

        Five : /V/i

        One : /I/i
    };

    my $parse = new Parse::RecDescent($grammar);

    while (<>) {
        chomp;
        my $value = $parse->Numeral($_);
        print "value: $value\n";
    }

As you can see $grammar takes up most of the space in this program. The rest is pretty simple. Once I receive the parser from the Parse::RecDescent constructor, I just call its Numeral method repeatedly.

So what does the grammar mean? Let's start at the top. Grammars are built from rules. The rule for a Numeral (the Roman kind) says:


    A Numeral takes the form of one of these choices
        a TenList then a FiveList then a OneList then the end of the string
        OR
        the word quit in any case (not a Numeral, but a way to quit)
        OR
        anything else, which is an error

We'll see what TenList and its friends are shortly. The code after the first choice is called an action. If the rule matches a possibility, it performs that possibility's action. So if a valid Numeral is seen, the action is executed. This particular action adds up the values TenList, FiveList, and OneList have accumulated. The items are numbered starting with 1, so TenList's value is in $item[1], etc.

How does TenList get a value? Well, when Numeral starts matching, it looks first for a valid TenList. There are four choices:


    A TenList takes the form of one of these choices
        three Tens
        OR
        two Tens then an OptionalNine
        OR
        a Ten then an OptionalNine
        OR
        an OptionalNine

These choices are tried in order. A Ten is simply an upper- or lower-case X (see the Ten rule). The result of an action is the result of its last statement. So, if there are three tens, the TenList returns 30. If there are two tens, it returns 20 plus whatever OptionalNine returned.

The Roman numeral IX is our 9. I call this an OptionalNine. (The names are completely arbitrary.) So after zero, one, or two X's, there can be an IX which adds 9 to the total. If there is no IX, the OptionalNine will match the empty rule. That consumes no text from the input and returns zero according to its action.

Roman numerals are a lot more complex than my little grammar can handle. For starters, by my calendar, we're now in the year MMIII. There are no M's in my grammar. Further, some Romans thought that IIIIII was perfectly valid. In my grammar three is the limit for all repetitions, and only I and X can repeat. Further, reductions can only take one away. So, IIX is not eight, it's invalid. This grammar can recognize any normalized Roman numeral up to 38. Feel free to expand it.
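
To check those limits without installing anything, the same three-part shape (tens, then fives, then ones) can be approximated with a single regular expression. This is a hand-rolled sketch, not Parse::RecDescent, and small_roman is an invented name; it is meant to accept the same small numerals as the grammar, though I have not compared the two exhaustively.

```perl
use strict; use warnings;

sub small_roman {
    my $text = shift;

    # Tens part, then fives part, then ones part, as in the Numeral rule.
    return unless $text =~ /\A(XXX|XX(?:IX)?|X(?:IX)?|(?:IX)?)(IV|V|)(I{0,3})\z/i;
    my ($tens, $fives, $ones) = (uc($1), uc($2), $3);

    my $value = 10 * ($tens =~ tr/X//);   # count every X, including the one in IX
    $value   -= 1 if $tens =~ /IX\z/;     # a trailing IX is worth 9, not 10
    $value   += $fives eq 'IV' ? 4 : $fives eq 'V' ? 5 : 0;
    $value   += length $ones;
    return $value;
}

for my $numeral (qw(XXIX XXXVIII XXXIX IIX)) {
    my $value = small_roman($numeral);
    print "$numeral: ", (defined $value ? $value : "invalid"), "\n";
}
```

The regex gets unreadable quickly as the language grows, which is exactly where a grammar with named rules starts to pay for itself.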

Parse::RecDescent is not as fast as a yacc-generated parser, but it is easier to use. See the documentation in the distribution for more information, especially the tutorial which originally appeared in The Perl Journal.

If you look at what's inside the parser (say with Data::Dumper) you might think this actually implements the interpreter pattern. After all, it makes a tree of objects from the grammar. Look closer and you will see the key difference. All of the objects in the tree are members of classes like Parse::RecDescent::Action, which were written by Damian Conway when he wrote the module. In the GoF interpreter pattern we are expected to build a class for each non-terminal in the grammar (above those classes would be Numeral, TenList, etc.). Thus, the tree node types are different for each grammar.

This difference has two implications: (1) it makes the RecDescent parser generator simpler and (2) its result faster.

Summary

In this installment we have seen how to use code references to implement the Strategy and Template Method patterns. We even saw how to force our code into someone else's class. Builder turns text into an internal structure, which most Interpreters also do. Those structures can often be simple combinations of hashes, lists, and scalars. If what you need to read is simpler, use split or Config::Auto. If it is more complex, use Parse::RecDescent. If that won't do it fast enough, you might need one of the yaccs.

Next time I will look at patterns which actually rely on objects.

This week on Perl 6, week ending 2003-08-03

"Ooh look, it's another Perl 6 summary. Doesn't that man ever take a holiday?"
"I think he took one last month."
"Is it in Esperanto this week?"
"I don't think so."
"Does Leon Brocard get a mention?"
"It certainly looks that way."
"Does it start with the internals list again?"
"I think it does, in fact, here it comes now."

Approaching Python

Discussions (and coding) of the Parrot implementation of Python continued this week. Michal Wallace is working on taking a preexisting (but incomplete, it's a proof of concept only) python parse tree -> parrot code generator and rejigging it to generate code for IMCC. Assuming the initial rejig is doable, Michal surmises that getting a complete python compiler will be 'just' a matter of fleshing out the rest of the visitor methods, 'and then dealing with the C-stuff.'

Actually, the main strand of this discussion dealt with ways of extending IMCC to help optimize the translation of stack based code to code that uses registers more efficiently (register unspilling as Benjamin Goldberg called it), which should help with any bytecode translator based efforts.

http://groups.google.com/groups

http://groups.google.com/groups

Semantics of clone for PIO-objects

Jürgen Bömmels' work on the Parrot IO system continues to find edge cases in Parrot's memory management subsystem. As initially written, a clone call adds a reference to a ParrotIO object, but that object is neither garbage collected nor refcounted, and it gets destroyed when its first reference is destroyed. The problem can be solved by allocating a new ParrotIO in the clone call, but Jürgen had some questions about how to handle various things like what to do with buffering or position pointers.

Jos Visser offered a suggestion involving indirection and reference counting which met with (at least) Melvin Smith's approval.

http://groups.google.com/groups

Making 'make' less verbose

Leo Tötsch checked in a patch to make make's output rather less verbose. After the patch, the make process only echoes the name of the file being compiled, and doesn't spam your terminal with the entire compiler commandline (the compiler warnings do that). Some people liked this. Others didn't.

http://groups.google.com/groups

Don't trace system areas in sweep ops

One of the things we discussed at the Parrot BOF was how to solve the 'bogus objects' problem when doing timely destruction sweeps (The 'bogus objects' problem is when the stack walk code detects chimerical objects through holes in the C stack (hmm... if anyone has a good drawing of this?)). After much discussion we came to the conclusion that the trick was to only walk the system stack during DOD (Dead Object Detection) runs that got triggered via resource starvation.

This works because "There is nothing unanchored and alive beyond the runloop's stack". Brent Dax was impressed, but then, he wasn't at the BOF so he doesn't know how long it took us to get to the answer.

http://groups.google.com/groups

User defined events

Klaas-Jan Stol wondered if there would be some way of generating and posting user defined events. Uri Guttman thought that there probably would be.

http://groups.google.com/groups

PHP/Parrot

The language implementation insanity continues!

Stephen Thorne announced that he's working on implementing a PHP parser and is seriously considering targetting Parrot. He asked for pointers to good docs on how to go about doing so. He worried a little about bootstrapping as well.

Luke Palmer and Simon Glover were forthcoming with lots of useful answers and pointers.

http://groups.google.com/groups

Why Parrot uses Continuation Passing Style

In a delayed response to a question from Klaas-Jan Stol, Dan has posted a long message on the reasons for choosing Continuation Passing Style as Parrot's calling convention. It's definitely worth the read if you're at all interested in the reasoning behind Parrot (and the reason that my copy of Perl 6 Essentials has a signed correction from Dan).

http://groups.google.com/groups

IMCC supports the Parrot Calling Conventions

Leo announced that IMCC's brand of assembler, PIR (I can't remember what it stands for, Parrot Intermediate Representation perhaps), now supports the Parrot Calling Conventions. Of course, there are things it doesn't quite do yet (tail call detection, only preserving 'necessary' registers...) and it's somewhat lacking on the test front, but it's a start. Yay Leo!

http://groups.google.com/groups

Another task for the interested: Coverage

Dan threw out another 'task for the interested' this week. At present we don't have a complete set of coverage tests for the parrot opcodes, nor do we even know which opcodes do have coverage. Volunteers to fix this state of affairs (start with a coverage report being generated as part of the make test run) would be very welcome.

It turns out that Leo already has an "unportable, ugly, slightly tested" script generating a coverage report of sorts which he posted. Josh Wilmes also has a coverage testing tool generating reports on the web, but he'd turned it off following some problems under testing.

http://groups.google.com/groups

http://www.hitchhiker.org/parrot_coverage -- Josh's reports

Pirate (py...rrot)

Will the terrible jokes never stop?

Michal Wallace reported to the list on his attempts to retool amk's parrot-gen.py to generate code for IMCC. It sounds like he's making good progress, but his choice of codename -- Pirate, from py...rrot -- had at least one summarizer groaning.

Later Michal asked the list about the best way of generating subroutines and asked for some pointers about how best to arrange the generated code. He also let slip that Pirate could handle Lists, strings, and ints; assignments; control structures; maths; boolean logic; and comparisons...

Leo came up with a suggestion about code layout for Michal and spoke for everyone (I think) when he added:

"Wow."

Luke Palmer offered a few performance tuning tips (the Parrot translation of Python is currently running 3 times slower than python on euclid, but I'm sure we'll get that fixed soon enough).

http://groups.google.com/groups

http://groups.google.com/groups

JVM->PIR translator

Just as we were all giving Michal some good Wow, Joseph Ryan announced that he had a partially complete JVM->PIR translator done, though it still had a few issues.

http://groups.google.com/groups

http://jryan.perlmonk.org/images/jirate.tar.gz

Dynamic PMC Classes

Leo announced that he's started working on dynamic PMC classes. The idea is that PMCs could be loaded on demand, in a similar fashion (though hopefully with a nicer interface) to Perl 5's DynaLoader tricks. He already has something working, and asked for comments.

Dan responded by outlining his thoughts on the interfaces and requirements for dynamic PMC loading, which weren't quite the same as what Leo had implemented, but they don't call it software for nothing.

Christian Renz wondered if there were any plans to allow PMCs to be implemented in Parrot assembly. Dan confirmed that there were.

http://groups.google.com/groups

Question about interpreter == NULL

Jürgen Bömmels wondered which functions allowed the caller to pass in a NULL pointer in place of the interpreter. Some functions allow this, others fall in big segfaulty heaps. He and Leo thrashed out the details of what is and isn't allowed; hopefully this will make it into documentation at some point.

http://groups.google.com/groups

Adding yield semantics to IMCC

Kenneth A Graves has been experimenting with the .pcc_* directives for implementing function calls, and wants to add coroutine support by implementing .pcc_begin_yield and .pcc_end_yield which would be analogous to the current .pcc_*_return directives. He supplied a patch implementing what he was after. Leo liked the patch and applied it.

http://groups.google.com/groups

IMCC objects speed, .include, file-scoped vars et cetera

Now that Parrot nearly has objects, Jerome Quelin has started work on a new version of Befunge in IMCC. This meant he had a pile of questions about speed, file scoping of variables, problems with line numbering within included files, and fragility in the absence of newline termination.

Melvin Smith opined that the time had come to start putting together a nice web based set of docs for IMCC, and volunteered to start work on it himself as soon as he'd caught up with the current state of the IMCC art.

Leo Tötsch meanwhile answered most of Jerome's questions.

http://groups.google.com/groups

IMCC's call vs first class functions

If you were still not sure of the virtues of Continuation Passing Style in Parrot, then Michal Wallace's problems with making first class function objects in Pirate might help convince you of their virtue. As far as I can see, just switching to a CPS style should mean that anonymous functions in Pirate become almost automatic. (I could be wrong of course)

http://groups.google.com/groups

David Adler Scares Himself

For reasons best known only to himself Dave Adler has implemented an hq9+ interpreter in pasm. Quite what an hq9+ interpreter is was left as an exercise for the interested reader. Having just now done the Google search for the language, I think it's best if I leave it to you to do the search yourself, but quite frankly, I wouldn't bother.

http://groups.google.com/groups

Upcoming backwards incompatible changes to IMCC

Leo Tötsch announced some changes to IMCC which will mean it is no longer backwards compatible. What's changing is that from now, all code outside of compilation units will be ignored, which means that nested subroutines will no longer be supported. He will also be adding a new .globalconst directive for declaring file scoped constants.

http://groups.google.com/groups

Embedding Parrot

Jeff Horwitz is interested in embedding Parrot in other programs, and wanted to know if there was any prior art, or even a road map. So far there's been no response.

http://groups.google.com/groups

Meanwhile, in perl6-language

Things are starting to warm up a little in perl6-language following the publication of Exegesis 6 (take a look, you'll find it at http://www.perl.com/pub/a/2003/07/29/exegesis6.html, there's much good stuff in there; Perl 6 is starting to look like a real language I tell you). The volume's not got up to post-Apocalypse 6 levels yet, but it's early days.

Perl 6's for() signature

John Siracusa referred back to an earlier summary where I had wondered if either of two for implementations had got the signature quite right. Luke Palmer (perpetrator of one of the for implementations) thought that it wasn't quite possible to come up with an accurate signature for for (or at least, not one that could tell the compiler enough to detect errors at compile time) because you essentially needed a slurpy list followed by a block, but slurpy lists have to be the last parameter in the signature.
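
As a rough analogy only (this is Python, not Perl 6, and the names for_each and for_each_block_last are made up for illustration), the same tension shows up wherever a variadic parameter swallows every positional argument. The sketch below shows the two usual escapes: make the block a named argument, or accept everything and peel the block off at run time, which gives up exactly the early error checking Luke was worried about.

```python
def for_each(*items, block):
    """The slurpy *items takes every positional argument,
    so block can only be supplied by name."""
    return [block(item) for item in items]


def for_each_block_last(*args):
    """Workaround: slurp everything, then split the trailing
    block off by hand. Mistakes are only caught when called."""
    if not args or not callable(args[-1]):
        raise TypeError("last argument must be a callable block")
    *items, block = args
    return [block(item) for item in items]


doubled = for_each(1, 2, 3, block=lambda n: n * 2)
bumped = for_each_block_last(1, 2, 3, lambda n: n + 1)
```

Passing the block positionally to for_each fails, because the slurpy parameter has already claimed it; that is the Python version of "slurpy lists have to be the last parameter".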

John countered by quoting from Exegesis 6 "An important goal of Perl 6 is to make the language powerful enough to implement all its own built-ins", which doesn't exactly contradict what Luke said, as there's always the possibility of implementing something for-like using a macro, but that doesn't feel too comfortable.

Rod Adams proposed "non-greedy slurpy arrays" which would be analogous to non-greedy regex matches and proposed *?@ as the sigil combination for such a parameter. (Perl? Line noise? Never!)

Damian Conway tweaked Simon Cozens' "Soylen^WPerl 6 is Ruby!" detector when he mentioned that Larry was considering adding a special case for allowing a single &block parameter after a slurpy parameter, but that neither Larry nor Damian was entirely happy with the idea.

Larry offered words of wisdom. As usual.

http://groups.google.com/groups

http://www.perl.com/pub/a/2003/04/p6pdigest/20030427.html -- the earlier summary

http://groups.google.com/groups -- Larry dispenses wisdom

Exegesis 6: Assume nothing

Referring to Exegesis 6, Trey Harris wondered how one could curry a subroutine to always use the default value for the 'assumed' parameter. He wanted to be able to create a curried function in such a way that, if the original function's default value changed, the curried function would reflect that.

I don't think this thread has been resolved to anyone's satisfaction yet, and I can't quite tell where it's headed. My gut feeling is that this is a sufficiently rare requirement that Damian's solution of not using .assuming at all and just writing a simple wrapper function by hand may be the way forward.
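
To make the distinction concrete, here is a minimal sketch in Python rather than Perl 6, so it is an analogy and not a statement about how .assuming behaves. functools.partial stands in for currying with a fixed value, and greet and greet_with_default are hypothetical names. The hand-written wrapper never mentions the parameter, so it picks up whatever default the original has at call time.

```python
from functools import partial


def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"


# Currying with an explicit value: the default is copied once and frozen.
frozen = partial(greet, greeting="Hello")


# Hand-written wrapper: leaves the parameter out entirely,
# so the original function's current default applies.
def greet_with_default(name):
    return greet(name)


before = (frozen("world"), greet_with_default("world"))

# Simulate the original function's default changing later on.
greet.__defaults__ = ("Howdy",)

after = (frozen("world"), greet_with_default("world"))
```

After the default changes, the frozen version still says "Hello" while the wrapper follows the original, which is the behaviour Trey was asking for.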

http://groups.google.com/groups

Mandating name-only parameters

Mark J. Reed wondered if the new parameter declaration syntax meant it was possible to declare a mandatory name-only parameter. Damian thinks it will probably be doable, but only by using traits rather than the single character prefixes.

http://groups.google.com/groups

Small Junctions

Exegesis 6 describes a junction as "a single scalar value that can act like two or more values at once". Dave Whipp wondered how junctions with 0 or 1 members would behave. As Dave said, the case of a single member junction is relatively easy to understand, but he's unsure as to the semantics for a 0 member junction.

Luke Palmer gave a good answer, and pointed at Damian's message about "Perl 6 and Set Theory" for more detail.
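
For readers who want something to poke at, one conventional reading of the empty case comes from ordinary logic, and Python's built-in any and all follow it. This is an assumption offered for illustration, not a record of what Luke or Damian actually said, and any_equals and all_equal are hypothetical helper names.

```python
def any_equals(members, value):
    """True if at least one member equals value (an 'any' junction)."""
    return any(m == value for m in members)


def all_equal(members, value):
    """True if every member equals value (an 'all' junction)."""
    return all(m == value for m in members)


single = (any_equals([3], 3), all_equal([3], 3))

# With no members there is nothing to match, and nothing to fail.
empty = (any_equals([], 3), all_equal([], 3))
```

Under this reading a one member junction behaves just like its member, an empty "any" is never true, and an empty "all" is vacuously true.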

http://groups.google.com/groups

http://groups.google.com/groups

Macros and is parsed

Abhijit A Mahabal asked for some clarification about the workings of macros, in particular how/when macro arguments were parsed. The answer from Larry appears to be that macros get a default parsed trait, which can be overridden by is parsed when the macro is declared, so macro arguments are parsed by the macro's parsed trait.

http://groups.google.com/groups

Macro arguments themselves

Luke Palmer wondered what macros do about being passed variables, with a supplementary question about recursive macros. Larry answered that macros dealt with their arguments in the way that Luke hoped (it'd be a disaster if they didn't, frankly), but that to get a recursive macro you would probably have to write a helper function.

http://groups.google.com/groups

Another macro question

Abhijit A. Mahabal wondered what


    macro foo() { return { my $x = 7 } }
    foo;
    print $x;

would be equivalent to. According to Larry, the answer is probably:


    do { my $x = 7 }
    print $x;

Which would throw an error under use strict. It seems to me that the way to get expanded code that looks like:


    my $x = 7;
    print $x;

would be to declare foo as:


    macro foo() { return 'my $x = 7' }

http://groups.google.com/groups

grep EXPR, LIST

John Williams wondered if the Perl 5ish grep EXPR, LIST would still work in Perl 6. Larry thinks not. I think it should be possible to declare an appropriate macro version of grep, but the margins of this summary are too narrow to contain my solution.

http://groups.google.com/groups

Acknowledgements, Announcements and Apologies

Thanks to Damian for Exegesis 6, Perl 6 may be slow in coming, but I like it more with each revelation.

Ooh look, another plug for http://pc1.bofhadsl.ftech.co.uk:8080/.

As ever, if you've appreciated this summary, please consider one or more of the following options:
