May 2003 Archives

Hidden Treasures of the Perl Core

The Perl Core comes with a lot of little modules to help you get the job done. Many of these modules are not well-known, and even some of the well-known ones have nice features that are often overlooked. In this article, we'll dive into many of these hidden treasures of the Perl Core.

blib

This module allows you to use MakeMaker's to-be-installed version of a package. Most of the distributions on the CPAN conform to MakeMaker's build conventions, so if you are writing a Perl module that has a build system, there's a good chance MakeMaker is involved. Testing on the command line is common; I know I find myself doing it often. This is one of the places where blib comes in handy. When running my test suite (you all have test suites, right?) on the command line, I can execute individual tests easily.


  perl -Mblib t/deepmagic.t

If you are building someone else's module and find yourself debugging a test failure, blib can be used the same way.
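
blib can also be loaded from inside a script. With no arguments it searches upward from the current directory for a blib/ structure; given an argument, it starts the search from that directory instead. A minimal sketch (the path and module name below are made up):


  use blib '/home/casey/src/Magic-Deep';
  use Magic::Deep;   # the to-be-installed version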

diagnostics

PC Load Letter, what the frell does that mean?! -- Michael Bolton

When pushed hard enough, the Perl interpreter can spew out hundreds of error messages, and some of them can be quite cryptic. Running the following code snippet under the warnings pragma yields the error Unterminated <> operator at program.perl line 11.


  $i <<< $j;

Thankfully, diagnostics is an easy way to get a better explanation from Perl. Since we're all running our important programs under the strict and warnings pragmas, it's easy to add diagnostics to the mix.


  use strict;
  use warnings;
  use diagnostics;

The previous code snippet now yields the following output:


  Unterminated <> operator at program.perl line 11 (#1)
    (F) The lexer saw a left-angle bracket in a place where it was expecting
    a term, so it's looking for the corresponding right-angle bracket, and
    not finding it.  Chances are you left some needed parentheses out
    earlier in the line, and you really meant a "less than".

  Uncaught exception from user code:
        Unterminated <> operator at program.perl line 11.

Use of the diagnostics pragma should be kept to development only (where it's truly useful).
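
Because diagnostics is a development aid, it's convenient that you don't have to edit your source to use it at all; like blib, it can be loaded from the command line:


  perl -Mdiagnostics program.perl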

Benchmark

It can be difficult to benchmark code. When trying to optimise a program or routine, you want to try several approaches and see which comes out faster. That's what the Benchmark module is for. This way, you don't have to calculate start and stop times yourself, and in general you can do high-level profiling quickly. Here is an example that tries to determine which is faster, literal hash slices or retrieving hash values one at a time.


  use Benchmark;

  sub literal_slice {
    my %family = (
      Daughter => 'Evelina',
      Father => 'Casey',
      Mother => 'Chastity',
    );
    my ($mom, $dad) = @family{qw[Mother Father]};
  }

  sub one_at_a_time {
    my %family = (
      Daughter => 'Evelina',
      Father => 'Casey',
      Mother => 'Chastity',
    );
    my $mom = $family{Mother};
    my $dad = $family{Father};
  }

  timethese(
    5_000_000 => {
      slice       => \&literal_slice,
      one_at_time => \&one_at_a_time,
    },
  );

On the hardware I have at work, a dual G4 PowerMac, the answer seems obvious. Being cute and clever doesn't hurt us too badly. Here is the output.


  Benchmark: timing 5000000 iterations of one_at_time, slice...
  one_at_time: 53 wallclock secs (53.63 usr +  0.00 sys = 53.63 CPU) 
         @ 93231.40/s (n=5000000)
        slice: 56 wallclock secs (56.72 usr +  0.00 sys = 56.72 CPU) 
         @ 88152.33/s (n=5000000)
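
If you'd rather have Benchmark do the comparison arithmetic for you, the same module also provides cmpthese() (imported on request), which runs the same timings but prints a table comparing the approaches as percentage differences. A minimal sketch reusing the subs above:


  use Benchmark qw[cmpthese];

  cmpthese(
    5_000_000 => {
      slice       => \&literal_slice,
      one_at_time => \&one_at_a_time,
    },
  );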

CGI::Pretty

Many of you know you can use Perl to write your HTML; in fact, this trick is often used in CGI programs. If you have used the CGI module to create HTML, then you know the output is not intended for humans to parse. The ``browser only'' nature of the output makes debugging nearly impossible.


  use CGI qw[:standard];

  print header,
    start_html( 'HTML from Perl' ),
    h2( 'Writing HTML using Perl' ),
    hr,
    p( 'Writing HTML with Perl is simple with the CGI module.' ),
    end_html;

The previous program produces the following incomprehensible output.


  Content-Type: text/html; charset=ISO-8859-1
  
  <?xml version="1.0" encoding="iso-8859-1"?>
  <!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
           "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">;
  <html xmlns="http://www.w3.org/1999/xhtml"; lang="en-US">
  <head><title>HTML from Perl</title></head><body><h2>Writing 
  HTML using Perl</h2><hr /><p>Writing HTML with Perl is simple with the 
  CGI module.</p></body></html>

By changing the first line to use CGI::Pretty qw[:standard];, our output is now manageable.


  Content-Type: text/html; charset=ISO-8859-1
  
  <?xml version="1.0" encoding="iso-8859-1"?>
  <!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
           "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">;
  <html xmlns="http://www.w3.org/1999/xhtml"; lang="en-US">
  <head><title>HTML from Perl</title>
  </head><body>
  <h2>
          Writing HTML using Perl
  </h2>
  <hr><p>
          Writing HTML with Perl is simple with the CGI module.
  </p>
  </body></html>

While the output is still not as attractive as I'd like, there are plenty of customizations to be made, all outlined in the CGI::Pretty documentation.
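
For example, the indentation string and the list of tags whose contents are left untouched are both package variables you can override; check the CGI::Pretty documentation that ships with your version before relying on these exact names:


  use CGI::Pretty qw[:standard];

  $CGI::Pretty::INDENT = '  ';           # indent with two spaces
  @CGI::Pretty::AS_IS  = qw[pre code];   # never reformat these tags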

Class::ISA

The world of class inheritance is a complex and twisting maze, and this module provides functions to help us navigate it. The most commonly needed is super_path(). When dealing with complex OO hierarchies, super_path() can tell us which classes we're inheriting from (it isn't always obvious) and help us find method declarations.

I have a little project that requires Class::DBI, so I ran super_path() on one of the classes to determine how Perl would search the inheritance tree for a method.


  perl -MJobSearch -MClass::ISA -le'print for 
      Class::ISA::super_path( "JobSearch::Job" )'

The following list of classes is in the order Perl would search to find a method.


  JobSearch::Object
  Class::DBI::mysql
  Class::DBI
  Class::DBI::__::Base
  Class::Data::Inheritable
  Class::Accessor
  Ima::DBI
  Class::WhiteHole
  DBI
  Exporter
  DynaLoader

Now if I have a question about a method implementation, or where methods are coming from, I have a nice list to look through. Class::ISA intentionally leaves out the current class (in this case JobSearch::Job), and UNIVERSAL.

Here is a little trick that allows me to find out which classes may implement the mk_accessors method.


  perl -MJobSearch -MClass::ISA -le \
    'for (Class::ISA::super_path( "JobSearch::Job" )) { 
	    print if $_->can("mk_accessors") }'

Because of inheritance, all of the classes listed can invoke mk_accessors, but not all of them actually define mk_accessors. It still manages to narrow the list.

Class::ISA was introduced to the Perl Core in release 5.8.0. If you're using an older Perl, you can download it from the CPAN.
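
Class::ISA also provides self_and_super_path(), which returns the same list but with the class you asked about at the front, which is handy when you want the complete search order in one go:


  use Class::ISA;

  my @search_order = Class::ISA::self_and_super_path( 'JobSearch::Job' );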

Cwd

This module makes it simple to find the current working directory. There is no need to go to the shell, as so many of us do. Instead, use Cwd.


  use Cwd;
  my $path = cwd;
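
Cwd can do more than report the current directory. The optionally exported abs_path() resolves a relative path, symbolic links included, into an absolute one. A small sketch:


  use Cwd qw[cwd abs_path];

  my $path     = cwd;
  my $absolute = abs_path( '../articles' );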

Env

Perl provides access to environment variables via the global %ENV hash. For many applications, this is fine. Other times it can get in the way. Enter the Env module. By default, this module will create global scalars for all the variables in your environment.


  use Env;
  print "$USER uses $SHELL";

Some variables are of better use as a list. You can alter the behavior of Env by specifying an import list.


  use Env qw[@PATH $USER];
  print "$USER's  path is @PATH";

Yet another module to save time and energy when writing programs.

File::Path

This module has a useful function called mkpath. With mkpath you can create more than one level of directory at a time. In some cases, this could reduce a recursive function or a loop construct to a simple function call.


  use File::Path;
  mkpath "/usr/local/apache/htdocs/articles/2003";

Since mkpath will create any directory it needs to in order to finally create the 2003 directory, a tremendous amount of code is no longer needed.
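
File::Path also exports rmtree(), the destructive counterpart to mkpath(); it removes a directory and everything beneath it, so use it with care:


  use File::Path;

  rmtree "/usr/local/apache/htdocs/articles/2003";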

File::Spec::Functions

This module implements a sane and useful interface over the File::Spec module. File::Spec must be used by calling class methods, while File::Spec::Functions turns those methods into plain functions. It provides many useful functions, all fully documented in File::Spec::Unix. Here are a few examples.


  use File::Spec::Functions qw[splitpath canonpath splitdir abs2rel];

  # split a path into logical pieces
  my ($volume, $dir_path, $file) = splitpath( $path );
  
  # clean up directory path
  $dir_path = canonpath $dir_path;

  # split the directories into a list
  my @dirs = splitdir $dir_path;

  # turn the full path into a relative path
  my $rel_path = abs2rel $path;

As you can see, there are plenty of ways to save yourself coding time by using File::Spec::Functions. Don't forget, these functions are portable because they use different semantics behind the scenes depending on the operating system Perl is running on.
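
Going the other direction is just as portable: catdir() and catfile() assemble paths using whatever separator is correct for the current platform. A sketch:


  use File::Spec::Functions qw[catdir catfile];

  # "usr/local/apache" and "usr/local/apache/httpd.conf" on Unix
  my $dir  = catdir qw[usr local apache];
  my $file = catfile $dir, 'httpd.conf';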

File::Temp

If you need a temporary file, then use File::Temp. This module will find a temporary directory that is suitable for the operating system Perl is running on and open a temporary file in that location. This is yet another example of the Perl Core saving you time.


  use File::Temp qw[tempfile];
  my $fh = tempfile;
  
  print $fh "temp data";

This will open a temporary file for you and return the filehandle for you to write to. When your program exits, the temporary file will be deleted.
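
If you need the file's name as well, to hand to an external program for instance, call tempfile() in list context; tempdir() does the same job for scratch directories. A sketch using the documented options:


  use File::Temp qw[tempfile tempdir];

  my ($fh, $filename) = tempfile( UNLINK => 1 );   # file deleted on exit
  my $dir = tempdir( CLEANUP => 1 );               # tree removed on exit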

FindBin

FindBin has a small but useful purpose: to find the original directory of the Perl script being run. When a program is invoked, it can be hard to determine this directory. If a program is calling chdir, then it can be even more difficult. FindBin makes it easy.


  use FindBin;
  my $program_dir = $FindBin::Bin;
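
The classic application is loading modules that ship alongside a script, no matter where the whole bundle ends up installed (the lib/ layout and module name here are hypothetical):


  use FindBin;
  use lib "$FindBin::Bin/../lib";   # modules live next to the script

  use My::Helper;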

Shell

Shell takes the ugliness of dealing with the command line and wraps it up in pretty functions. The effect here is prettier programs. Here is a simple demonstration.


  use Shell qw[ls du];
  use File::Spec::Functions qw[rel2abs];

  chomp( my @files = ls );
  foreach ( @files ) {
        print du "-sk", rel2abs $_;
  }

Time::localtime

This module allows localtime to return an object. The object gives you by-name access to the individual elements returned by localtime in list context. This doesn't save us much coding time, but it can save us a trip to the documentation.


  use Time::localtime;
  my $time = localtime;
  print $time->year + 1900;

There is a similar module called Time::gmtime, which provides the same functionality for the gmtime function.
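
The usage is identical; only the time zone of the returned values changes:


  use Time::gmtime;

  my $gm = gmtime;
  print $gm->year + 1900;   # the year, in UTC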

UNIVERSAL

The UNIVERSAL module is handy. Two of its most common functions, isa and can, are almost always used as methods in OO programming. isa tells us whether an object belongs to (or inherits from) a given class, and can tells us whether an object supports a method. This is useful for testing. For example:


  use Time::localtime;
  my $time = localtime;

  if ( $time->isa( 'Time::localtime' ) ) {
    print "We have a Time::localtime object";
  }
  
  if ( $time->can( "year" ) ) {
    print "We can get the year from our object";
  }

Another lesser-known function in UNIVERSAL is VERSION. I often need to know the version of an installed module, and I find myself writing a one-liner like:


  perl -MTest::More -le'print $Test::More::VERSION'

That's just not as pretty as this.


  perl -MTest::More -le'print Test::More->VERSION'
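
VERSION() also accepts a minimum version and dies if the installed module is older, which makes it useful as a runtime guard (the version number below is only an example):


  use Test::More ();

  # dies unless Test::More is version 0.45 or later
  Test::More->VERSION( 0.45 );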

Conclusion

The Perl Core has many hidden wonders, and I've just laid out a few here. Trolling the Core for interesting functions and modules has saved me a lot of work over the years. If you would like to look further, then browse the perlmodlib manpage for a list of the core modules. Whether your interest is CGI, I18N, Locale, or Math, you can find something there that saves a few hours of work.

This week on Perl 6, week ending 2003-05-25

Welcome back to another Perl 6 summary in which Piers comes back from watching a load of pretty girls dancing 'round a huge Maypole in Wellow (it's 60 feet high, the tallest permanent Maypole in England; there's been one on the site since at least 1856. One does wonder if the Lord of the Manor who put the first one up was compensating for something). It's not my favourite English traditional event (you'd have to go a long way to beat the Whitby Penny Hedge for weirdness or the Brockworth Hill cheese-rolling for complete and utter reckless insanity (it was called off this year because the people who usually provide cover in case of accidents are in Algeria helping deal with the aftermath of the earthquake there; St John's Ambulance didn't feel up to the rigours of dealing with the kind of injuries that occur when too many people chase a Double Gloucester cheese down a 1 in 2 slope)) but it made for a lovely afternoon.

Which is why, instead of having this written by mid afternoon on a Monday, I'll be lucky if I have it finished before Tuesday. Ah... the trials of a summary writer.

So, we'll start with perl6-internals because that's what we always do.

IMCC variable names

Will Coleda discovered that IMCC didn't allow him to have a variable name that happened to be the same as an op, which isn't what the spec says and which could lead to interesting problems where IMCC source code suddenly becomes invalid when someone adds a new op to the language. Will sent a patch to fix the docs. Leo Tötsch didn't apply the patch, choosing to fix the problem instead. Yay Leo.

http://groups.google.com/groups

Vtables get macroized

Leo Tötsch finished checking a series of patches which switch the Parrot source code over from accessing vtable innards directly to accessing them via macros, thus making it easier to monkey with vtable internals without having to change a million and one different source files.

http://groups.google.com/groups

The timely destruction thread

After last week's discussion of timely destruction, the debate continued this week. Dan announced his design for dealing with the issue. PMCs now get a flag marking them as requiring timely destruction and there's a new op to trigger a DOD run if and only if there are any flagged PMCs in memory. The idea is that compilers for languages that care about timely destruction would insert the lazysweep op into their generated code in the appropriate places. Mr. Zellyn Hunter (hey, that's what he asked to be called, who am I to argue?) suggested a different name for Dan's proposed flag and offered the following joke which I repeat here in its entirety because it's funny (because it's *true*).

Q: How many Parrot programmers does it take to screw in a lightbulb?

A: Just one, as long as it's Leo Tötsch. He will probably also rewire your entire electrical system slightly, saving you about 3% in power consumption during the evenings and on weekends.

Leo denied the truth of this as lightbulbs are hardware and he doesn't do hardware.

Meanwhile, back at the thread...

Guess what? People didn't like the solution. I think this is a classic problem of GC. Everyone agrees that real Garbage Collection is a good idea, but they also want it to work by magic and take no time at all. Sadly, whilst Leo is good, even he can't change the laws of physics. My suggestion: if you're currently writing Perl code that relies on magic to close a filehandle the instant it goes out of scope then Don't Do That, close it yourself. That way, when you come to switch things over to Perl 6 you can do use GC scope_end => undef (or something) and get blistering performance and real GC.

http://groups.google.com/groups

http://groups.google.com/groups

PASM code analysis

Clinton A. Pierce has rewritten BASIC's expression evaluator and is reintegrating it with the runtime. This has left a lot of 'junk code' that never actually gets executed, and Clint wanted to know if there was a way of tracking down (and removing) the unreachable code. Leo implied that the answer is 'run it under IMCC', which apparently does dead code detection. This didn't quite work for Clint, as IMCC threw an error when it tried to run some BASIC-generated PASM (at least, it does when run in dead code detection mode...). Leo wasn't able to help directly with tracking down the issue, but did provide some pointers for converting the BASIC compiler so that it targets IMCC's PIR language instead of simple Parrot assembly. This prompted a list of questions from Clint about IMCC, and Luke Palmer provided some good answers.

http://groups.google.com/groups

http://groups.google.com/groups

Perl 6 Essentials

Sean O'Rourke noticed the new Perl 6 book in O'Reilly's catalogue and wondered where Leo found the time to become an author as well as a coding machine. Dan popped up to note that the plan is to have the book available in time for OSCON and YAPC::Europe. (Be there or be somewhere else). Randal Schwartz mentioned that he'd been a tech reviewer for the book and had found it 'quite an interesting read'.

http://groups.google.com/groups

http://www.oreilly.com/catalog/perl6es/ -- Americans can't spell catalogue

A New PMC Layout

Leo Tötsch posted a proposed new layout for PMCs now that PMC access has been hidden behind macros and asked for comments on his proposed new scheme for the new PMC layout. Dan commented on this, and between them he and Leo appear to have hashed out a way forward. Leo kicked out a patch using the new PMC scheme, with a second version not long after.

http://groups.google.com/groups

http://groups.google.com/groups

http://groups.google.com/groups


Meanwhile in perl6-language

Things have started to pick up a little after the virtual silence of the past few weeks.

Perl 6 Tutorial?

Dulcimer wondered if anyone was working on a simple introductory tutorial for Perl 6 yet. He thought that writing one might help get him up to speed, especially if people on the list critiqued it as he wrote it. Michael Lazzaro pointed to the documentation list perl6-documentation@perl.org which is apparently rather quiet at the moment. Dave Whipp wasn't sure that the time was right to write a tutorial yet as we don't yet know enough about the final language (Personally I think anyone attempting to write a Perl 6 tutorial before the object system gets designed/documented is going to end up having to do some fairly dramatic rewriting at some point).

Anyhoo, this led to a string of suggestions trying to sum up the Spirit of Perl 6, one of which managed to be a spoiler for Buffy the Vampire Slayer...

Simon Cozens also nodded towards the forthcoming Perl 6 Essentials which may be just what Dulcimer is after.

http://groups.google.com/groups

Coroutines

Discussion of Coroutines in Perl 6 was the thread that ate the list this week. John Macdonald had a few observations about using coroutines and suggestions for the best semantics for them in Perl 6. There seems to be a split between those who want the caller to know it's calling a coroutine and those who want the coroutine to be invisible to the caller. Halfway along the thread, Damian wrote up a proposal which he hoped would make everyone happy. It didn't succeed, but it was at least a step forward, as the proposal got batted back and forth a few times, leading to another proposal which got batted back and forth a few more times and which received far more acclaim.

Damian's proposals introduced a new coro declarator for coroutines, which Piers Cawley thought wasn't really necessary, and there was some discussion about whether 'coroutineness' was something that could be applied to all sorts of different Code based things, in which case using new declarators could lead to ugliness (coro, comethod, coblock, corule...) or if there was only really one thing. The jury is out on this.

There was also discussion of possibly unifying coroutine and threading syntax, which some people thought was cool and which others disliked intensely (nothing new there then).

There was also discussion of what a coroutine is, and why you would want to use them. Damian gave a good explanation of some of this. Actually, Damian gave some really good philosophy in this thread about why he likes Perl.

Elsewhere in the thread, some fool called Piers Cawley attempted to introduce a discussion of multimethods and failed dismally to change the subject line. Bad Piers.

http://groups.google.com/groups

http://groups.google.com/groups -- Damian's proposal

http://groups.google.com/groups -- Damian's second proposal

http://groups.google.com/groups -- Damian addresses the ``Why coroutines?'' question

http://groups.google.com/groups -- Damian on why people like Perl

Warnock's Dilemma

The eponymous Bryan C. Warnock (note to any Wired Jargonwatch editors who may be watching, that's Bryan, not 'Brian') popped up to set the record straight about his dilemma. Delightfully, by the end of the week, nobody had replied to him.

http://groups.google.com/groups

Acknowledgements, Announcements and Apologies

Thanks once again are due to all the good people on the Perl 6 lists. Apologies will probably be due to the organizers of YAPC North America as I still haven't started writing the talks I'm supposed to be giving.

If you've appreciated this summary, please consider one or more of the following options:

Testing mod_perl 2.0

Last time, we looked at writing a simple Apache output filter - Apache::Clean - using the mod_perl 2.0 API. How did I know that the filter I presented really worked? I wrote a test suite for it, one that exercised the code against a live Apache server using the Apache-Test testing framework.

Writing a series of tests that executes against a live Apache server has become much simpler since the advent of Apache-Test. Although Apache-Test, as part of the Apache HTTP Test Project, is generic enough to be used with virtually any version of Apache (with or without mod_perl enabled), it comes bundled with mod_perl 2.0, making it the tool of choice for writing tests for your mod_perl 2.0 modules.

Testing, Testing, 1, 2, 3

There are many advantages to writing tests. For instance, maintaining the test suite as I coded Apache::Clean allowed me to test each functional unit as I implemented it, which made development easier. The individual tests also allowed me to be fairly certain that the module would behave as expected once distributed. As an added bonus, tests offer additional end-user documentation in the form of test scripts, supporting libraries and configuration files, available to anyone who wants to snoop around the distribution a bit. All in all, having a test suite increases the value of your code exponentially, while at the same time making your life easier.

Of course, these benefits come from having any testing environment, and are not limited to just Apache-Test. The particular advantage that Apache-Test brings to the table is the ease at which it puts a whole, pristine, and isolated Apache server at your disposal, allowing you to test and exercise your code in a live environment with a minimum of effort. No more Apache::FakeRequest, no more httpd.conf configurations strewn across development environments or corrupted with proof-of-concept handlers that keep you busy following non-bugs for half a day. No more mess, no more tears.

If you have ever used tools like Test.pm or Test::More as the basis for testing your modules, then you already know most of what using Apache-Test is going to look like. In fact, Apache-Test uses Test.pm under the hood, so the layout and syntax are similar. If you have never written a test before (and shame on you), then An Introduction to Testing provides a nice overview of testing with Perl. For the most part, though, Apache-Test is really simple enough that you should be able to follow along here without any trouble or previous knowledge.

Leveraging the Apache-Test framework requires only a few steps - generating the test harness, configuring Apache to your specific needs, and writing the tests - each of which is relatively straightforward.

Generating the Test Harness

The first step to using Apache-Test is to tweak the Makefile.PL for your module. If you don't yet have a Makefile.PL, or are not familiar with how to generate one, then don't worry - all that is required is a simple call to h2xs, which provides us with a standard platform both for distributing our module and deploying the Apache-Test infrastructure.

  
$ h2xs -AXPn Apache::Clean
Defaulting to backward compatibility with perl 5.9.0
If you intend this module to be compatible with earlier perl versions, then please
specify a minimum perl version with the -b option.

Writing Apache/Clean/Clean.pm
Writing Apache/Clean/Makefile.PL
Writing Apache/Clean/README
Writing Apache/Clean/t/1.t
Writing Apache/Clean/Changes
Writing Apache/Clean/MANIFEST
  

h2xs generates the necessary structure for our module, namely the Clean.pm template and the Makefile.PL, as well as the t/ subdirectory where our tests and supporting files will eventually live. You can take some extra steps and shuffle the distribution around a bit (such as removing t/1.t and putting everything into Apache-Clean/ instead of Apache/Clean/) but it is not required. Once you have the module layout sorted out and have replaced the generated Clean.pm stub with the actual Clean.pm filter from before, it's time to start preparing the basic test harness.

To begin, we need to modify the Makefile.PL significantly. The end result should look something like:

  
#!perl

use 5.008;

use Apache2 ();
use ModPerl::MM ();
use Apache::TestMM qw(test clean);
use Apache::TestRunPerl ();

# configure tests based on incoming arguments
Apache::TestMM::filter_args();

# provide the test harness
Apache::TestRunPerl->generate_script();

# now, write out the Makefile
ModPerl::MM::WriteMakefile(
  NAME      => 'Apache::Clean',
  VERSION   => '2.0',
  PREREQ_PM => { HTML::Clean      => 0.8,
                 mod_perl         => 1.9909, },
);
  

Let's take a moment to analyze our nonstandard Makefile.PL. We begin by importing a few new mod_perl 2.0 libraries. The first is Apache2.pm. In order to peacefully co-exist with mod_perl 1.0 installations, mod_perl 2.0 gives you the option of installing mod_perl relative to Apache2/ in your @INC, so as to avoid collisions with 1.0 modules of the same name. For instance, the mod_perl 2.0 Apache::Filter we used to write our output filter interface would be installed as Apache2/Apache/Filter.pm. Of course, ordinary calls that require() or use() Apache::Filter in mod_perl 2.0 code would fail to find the correct version (if one was found at all), since it was installed in a nonstandard place. Apache2.pm extends @INC to include any (existing) Apache2/ directories so that use() and related statements work as intended. In our case, we need to use() Apache2 in order to ensure that, no matter how the end-user configured his mod_perl 2.0 installation, we can find the rest of the libraries we need.

Secure in the knowledge that our Makefile.PL will be able to find all our other mod_perl 2.0 packages (wherever they live), we can proceed. ModPerl::MM provides the WriteMakefile() function, which is similar to the ExtUtils::MakeMaker function of the same name and takes the same options. The reason that you will want to use the WriteMakefile() from ModPerl::MM is that, through means highly magical, all of your mod_perl-specific needs are satisfied. For instance, your module will be installed relative to Apache/ or Apache2/, depending on how mod_perl itself is installed. Other nice features are automatic inclusion of mod_perl's typemap and the header files required for XS-based modules, as well as magical cross-platform compatibility for Win32 compilation, which has been troublesome in the past.

Keep in mind that neither Apache2.pm nor ModPerl::MM are required in order to use Apache-Test - both are packages specific to mod_perl 2.0 and any handlers you may write for this version (as will be touched on later, Apache-Test can be used for mod_perl 1.0 based modules as well, or even Apache 1.3 or 2.0 modules independent of mod_perl, for that matter). The next package, Apache::TestMM, is where the real interface for Apache-Test begins.

Apache::TestMM contains the functions we will need to configure the test harness. The first thing we do is import the test() and clean() functions, which generate their respective Makefile targets so that we can run (and re-run) our tests. After that, we call the filter_args() function, which allows us to configure various parts of our tests on the command line using different options, as we will discuss later.

The final part of our configuration uses the generate_script() method from the Apache::TestRunPerl class, which writes out the script responsible for running our tests, t/TEST. It is t/TEST that will be invoked when a user issues make test, although the script can be called directly as well. While t/TEST can end up containing lots of information, if you crack it open you will see that the engine that really drives the test suite is rather simple.

  
use Apache::TestRunPerl ();
Apache::TestRunPerl->new->run(@ARGV);
  

Believe it or not, the single call to run() does all the intricate work of starting, configuring, and stopping Apache, as well as running the individual tests we (still) have yet to define.

Despite the long explanations, the net result of our activity thus far has been a few modifications to a typical Makefile.PL so that it reflects the needs of both our mod_perl 2.0 module and our forthcoming use of the Apache-Test infrastructure. Next, we need to configure Apache for the tests specific to the functionality in our handler.

Configuring Apache

Ordinarily, there are many things you need to stuff into httpd.conf in order to get the server responding to requests, only some of which are related to the content the server will provide. The Apache-Test framework provides a minimal Apache configuration, such as default DocumentRoot, ErrorLog, Listen, and other settings required for normal operation of the server. In fact, with no intervention on your part, Apache-Test provides a configuration that enables you to successfully request /index.html from the server. Chances are, though, that you will need something above a basic configuration in order to test your module appropriately.

To add additional settings to the defaults, we create the file t/conf/extra.conf.in, adding any required directories along the way. If Apache-Test sees extra.conf.in, it will pull the file into its default configuration using an Include directive (after some manipulations we will discuss shortly). This provides a nice way of adding only the configuration data you require for your tests, and saves you from the need to worry about the mundane aspects of running the server.

One of the first aspects of Apache::Clean we should test is whether it can clean up a simple, static HTML file. So, we begin our extra.conf.in with the following:

  
PerlSwitches -w

Alias /level @DocumentRoot@
<Location /level>
  PerlOutputFilterHandler Apache::Clean
  PerlSetVar CleanLevel 2
</Location>
  

This activates our output filter for requests to /level. Note the introduction of a new directive, PerlSwitches, which allows you to pass command line switches to the embedded perl interpreter. Here, we use it to enable warnings, similar to the way that PerlWarn worked in mod_perl 1.0. PerlSwitches can actually take any perl command line switch, which makes it a fairly useful and flexible tool. For example, we could use the -I switch to extend @INC in place of adding use lib statements to a startup.pl, or use -T to enable taint mode in place of the former PerlTaintMode directive, which is not part of mod_perl 2.0.
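
For instance, those two alternatives might look like this in extra.conf.in (the library path below is hypothetical):

  
PerlSwitches -I/home/geoff/lib
PerlSwitches -T
  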

Next, we come to the familiar Alias directive, albeit with a twist. As previously mentioned, Apache-Test configures several defaults, including DocumentRoot and ServerRoot. One of the nice features of Apache-Test is that it keeps track of its defaults for you and provides some helpful variable expansions. In my particular case, the @DocumentRoot@ variable in the Alias directive is replaced with the value of the default DocumentRoot that Apache-Test calculated for my build. The real configuration ends up looking like

  
Alias /level /src/perl.com/Apache-Clean-2.0/t/htdocs
  

when the tests are run. This is handy, especially when you take into consideration that your tests may run on different platforms.

The rest of the configuration closely resembles our example from last time - using the PerlOutputFilterHandler to specify Apache::Clean as our output filter, and PerlSetVar to specify the specific HTML::Clean level. The only thing missing before we have prepared our module enough to run our first test is some testable content in DocumentRoot.

As you can see from the @DocumentRoot@ expansion in the previous example, DocumentRoot resolves to ServerRoot/t/htdocs/, so that is one place where we can put any documents we are interested in retrieving for our tests. So, we create t/htdocs/index.html and place some useful content in it.

  
<i    ><strong>&quot;This is a test&quot;</strong></i   >
  

Our index.html contains a number of different elements that HTML::Clean can tidy, making it useful for testing various configurations of Apache::Clean.

Now we have all the Apache configuration that is required: some custom configuration directives in t/conf/extra.conf.in and some useful content in t/htdocs/index.html. All that is left to do is write the tests.

Writing the Tests

The Apache configuration we have created thus far provides a way to test Apache::Clean through /level/index.html. The result of this request should be that the default Apache content handler serves up index.html, applying our PerlOutputFilterHandler to the file before it is sent over the wire. Given the configured PerlSetVar CleanLevel 2, we would expect the end result of the request to be

  
<i><b>&quot;This is a test&quot;</b></i>
  

where tags are shortened and whitespace removed but the &quot; entity is left untouched. Well, maybe this is not what you would have expected, but cracking open the code for HTML::Clean reveals that level(2) includes the whitespace and shortertags options, but not the entities option. This brings us to the larger issue of test design and the possibility that flawed expectations can mask true bugs - when a test fails, is the bug in the test or in the code? - but that is a discussion for another time.

Given our configuration and expected results, we can craft a test that requests /level/index.html, isolates the content from the server response, then tests the content against our expectations. The file t/01level.t shown here does exactly that.

  
use strict;
use warnings FATAL => 'all';

use Apache::Test qw(plan ok have_lwp);
use Apache::TestRequest qw(GET);

plan tests => 1, have_lwp;

my $response = GET '/level/index.html';
chomp(my $content = $response->content);

ok ($content eq q!<i><b>&quot;This is a test&quot;</b></i>!);
  

t/01level.t illustrates a few of the things that will be common to most of the tests you will write. First, we do some bookkeeping and plan the number of tests that will be attempted using the plan() function from Apache::Test - in our case just one. The final, optional argument to plan() uses the have_lwp() function to check for the availability of the modules from the libwww-perl distribution. If have_lwp() returns true, then we know we can take advantage of the LWP shortcuts Apache::TestRequest provides. If have_lwp() returns false, then no tests are planned and the entire test is skipped at runtime.

After planning our test, we use the shortcut function GET() from Apache::TestRequest to issue a request to /level/index.html. GET() returns an HTTP::Response object, so if you are familiar with the LWP suite of modules you should feel right at home with what follows. Using the object in $response we isolate the server response using the content() method and compare it against our expected string. The comparison uses a call to ok(), which will report success if the two strings are equivalent.

Keep in mind that even though this example explicitly imported the plan(), ok(), have_lwp(), and GET() functions into our test script, that was just to illustrate the origins of the different parts of the test - each of these functions, along with just about all the others you may want, is exported by default. So, the typical test script will usually just call

  
use Apache::Test;
use Apache::TestRequest;
  

and go from there.

That is all there is to writing the test. In its simplest form, using Apache-Test involves pretty much the same steps as when writing tests using other Perl testing tools: plan() the number of tests in the script, do some stuff, and call ok() for each test you plan(). Apache-Test and its utility classes merely offer shortcuts that make writing tests against a running Apache server idiomatic.

Running the Tests

With all the preparation behind us - generating and customizing the Makefile.PL, configuring Apache with extra.conf.in, writing index.html and 01level.t - we have all the pieces in place and can (finally) run our test.

There are a few different ways we can run the tests in a distribution, but all require that we go through the standard build steps first.

  
$ perl Makefile.PL -apxs /usr/local/apache2/bin/apxs
Checking if your kit is complete ...
Looks good
Writing Makefile for Apache::Clean

$ make
cp Clean.pm blib/lib/Apache2/Apache/Clean.pm
Manifying blib/man3/Apache::Clean.3
  

Makefile.PL starts the process by generating the t/TEST script via the call to Apache::TestRunPerl->generate_script(). The additional argument we pass, -apxs, is trapped by Apache::TestMM::filter_args() and is used to specify the Apache installation we want to test our code against. Here, I use -apxs to specify the location of the apxs binary in my local Apache DSO installation - for static builds you will want to use -httpd to point to the httpd binary instead. By the time Makefile.PL exits, we have our test harness and know where our server lives.

Running make creates our build directory, blib/, and installs Clean.pm locally so we can use it in our tests. Note that ModPerl::MM installed Clean.pm relative to Apache2, magically following the path of my current mod_perl 2.0 installation.

At this point, we can run our tests. Issuing make test will run all the tests in t/, as you might expect. However, we can run our tests individually as well, which is particularly useful when debugging. To run a specific test we call t/TEST directly and give it the name of the test we are interested in.

  
$ t/TEST t/01level.t
*** setting ulimit to allow core files
ulimit -c unlimited; t/TEST 't/01level.t'
/usr/local/apache2/bin/httpd  -d /src/perl.com/Apache-Clean-2.0/t 
    -f /src/perl.com/Apache-Clean-2.0/t/conf/httpd.conf 
	-DAPACHE2 -DPERL_USEITHREADS
using Apache/2.1.0-dev (prefork MPM)

waiting for server to start: ..
waiting for server to start: ok (waited 1 secs)
server localhost:8529 started
01level....ok                                                                
All tests successful.
Files=1, Tests=1,  4 wallclock secs ( 3.15 cusr +  0.13 csys =  3.28 CPU)
*** server localhost:8529 shutdown
  

As you can see, the server was started, our test was run, the server was shut down, and a report was generated - all with what is really minimal work on our part. Major kudos to the Apache-Test developers for making the development of live tests as easy as it is.

Beyond the Basics

What we have talked about so far is just the basics, and the framework is full of a number of different options designed to make writing and debugging tests easier. One of these is the Apache::TestUtil package, which provides a number of utility functions you can use in your tests. Probably the most helpful of these is t_cmp(), a simple equality testing function that also provides additional information when you run tests in verbose mode. For instance, after adding use Apache::TestUtil; to our 01level.t test, we can alter the call to ok() to look like

  
ok t_cmp(q!<i><b>&quot;This is a test&quot;</b></i>!, $content);
  

and the result would include expected and received notices (in addition to standard verbose output)

  
$ t/TEST t/01level.t -v
[lines snipped]
01level....1..1
# Running under perl version 5.009 for linux
# Current time local: Mon May  5 11:04:09 2003
# Current time GMT:   Mon May  5 15:04:09 2003
# Using Test.pm version 1.24
# expected: <i><b>&quot;This is a test&quot;</b></i>
# received: <i><b>&quot;This is a test&quot;</b></i>
ok 1
ok
All tests successful.
  

which is particularly helpful when debugging problems reported by end users of your code. See the Apache::TestUtil manpage for a long list of helper functions, as well as the README in the Apache-Test distribution for additional command line options over and above -v.

Of course, 01level.t only tests one aspect of our Clean.pm output filter, and there is much more functionality in the filter that we might want to verify. So, let's take a quick look at some of the other tests that accompany the Apache::Clean distribution.

One of the features of Apache::Clean is that it automatically declines processing non-HTML documents. The logic for this was defined in just a few lines at the start of our filter.

  
# we only process HTML documents
unless ($r->content_type =~ m!text/html!i) {
  $log->info('skipping request to ', $r->uri, ' (not an HTML document)');

  return Apache::DECLINED;
}
  

A good test for this code would be verifying that content from a plain-text document does indeed pass through our filter unaltered, even if it has HTML tags that HTML::Clean would ordinarily manipulate. Our test suite includes a file t/htdocs/index.txt whose content is identical to the index.html file we created earlier. Remembering that we already have an Apache configuration for /level that inserts our filter into the request cycle, we can use a request for /level/index.txt to test the decline logic.

  
use Apache::Test;
use Apache::TestRequest;

plan tests => 1, have_lwp;

my $response = GET '/level/index.txt';
chomp(my $content = $response->content);

ok ($content eq q!<i><strong>&quot;This is a test&quot;</strong></i>!);
  

It may be obvious, but if you think about it, what we are really testing here is not that the content is unaltered - that is just what we use to measure the success of our test. The real test is against the criterion that determines whether the filter acts on the content. If we wanted to be really thorough, then we could add

  
AddDefaultCharset On
  

to our extra.conf.in to test the Content-Type logic against headers that look like text/html; charset=iso-8859-1 instead of just text/html. I actually have had more than one person comment that using a regular expression for testing the Content-Type is excessive - adding the AddDefaultCharset On directive shows that the regex logic can handle more runtime environments than a simple $r->content_type eq 'text/html' check. Oh, the bugs you will find, fix, and defend when you start writing tests.
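
A sketch of the corresponding test, following the same pattern as 01level.t (the assertion assumes the configuration above, where the charset-bearing header should not stop the filter from cleaning the document):

  
use Apache::Test;
use Apache::TestRequest;

plan tests => 1, have_lwp;

# the header now reads "text/html; charset=iso-8859-1",
# but the regex match means the filter still runs
my $response = GET '/level/index.html';
chomp(my $content = $response->content);

ok ($content eq q!<i><b>&quot;This is a test&quot;</b></i>!);
  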

More and More Tests

What other aspects of the filter can we put to the test? If you recall from our discussion of output filters last time, one of the responsibilities of filters that alter content is to remove the generated Content-Length header from the server response. The relevant code for this in our filter was as follows.

  
# output filters that alter content are responsible for removing
# the Content-Length header, but we only need to do this once.
$r->headers_out->unset('Content-Length');
  

Here is the test for this bit of logic, which checks that the Content-Length header is indeed present for plain documents, but removed by our filter for HTML documents. Again, we will be using the existing /level URI to request both index.txt and index.html.

  
use Apache::Test;
use Apache::TestRequest;

plan tests => 2, have_lwp;

my $response = GET '/level/index.txt';
ok ($response->content_length == 58);

$response = GET '/level/index.html';
ok (! $response->content_length);
  

Note the use of the content_length() method on our HTTP::Response object to retrieve the Content-Length of the server response. Remember that you have all the methods from that class to choose from in your tests.

The final test we will take a look at is the example we used last time to illustrate that our filter does indeed co-exist with both mod_include and mod_cgi. As it turns out, the example was taken right from the test suite (always a good place from which to draw examples). Here is the extra.conf.in snippet.

  
Alias /cgi-bin @ServerRoot@/cgi-bin
<Location /cgi-bin>
  SetHandler cgi-script

  SetOutputFilter INCLUDES
  PerlOutputFilterHandler Apache::Clean

  PerlSetVar CleanOption shortertags
  PerlAddVar CleanOption whitespace
  Options +ExecCGI +Includes
</Location>
  

The nature of our test requires that both mod_include and a suitable CGI platform (either mod_cgi or mod_cgid) be available to Apache - without both of these, our tests are doomed to failure, so we need a way to test whether these modules are available to the server before planning the individual tests. Also required are some CGI scripts, the location of which is specified by expanding @ServerRoot@. To include these scripts, we could just create a t/cgi-bin/ directory and place the relevant files in it. However, any CGI scripts we create would probably include a platform-specific shebang line like #!/usr/bin/perl. A better solution is to generate the scripts on-the-fly, specifying a shebang line that matches the version of Perl we are using to build and test the module.

Despite the extra work required, the test script used for this test is only a bit more complex than others we have seen so far.

  
use Apache::Test;
use Apache::TestRequest;
use Apache::TestUtil qw(t_write_perl_script);

use File::Spec::Functions qw(catfile);

plan tests => 4, (have_lwp && 
                  have_cgi &&
                  have_module('include'));

my @lines = <DATA>;
t_write_perl_script(catfile(qw(cgi-bin plain.cgi)), @lines[0,2]);
t_write_perl_script(catfile(qw(cgi-bin include.cgi)), @lines[1,2]);

my $response = GET '/cgi-bin/plain.cgi';
chomp(my $content = $response->content);

ok ($content eq q!<strong>/cgi-bin/plain.cgi</strong>!);
ok ($response->content_type =~ m!text/plain!);

$response = GET '/cgi-bin/include.cgi';
chomp($content = $response->content);

ok ($content eq q!<b>/cgi-bin/include.cgi</b>!);
ok ($response->content_type =~ m!text/html!);

__END__
print "Content-Type: text/plain\n\n";
print "Content-Type: text/html\n\n";
print '<strong><!--#echo var="DOCUMENT_URI" --></strong>';
  

The first thing to note is that we have joined the familiar call to have_lwp() with additional calls to have_cgi() and have_module(). The Apache::Test package comes with a number of handy shortcuts for querying the server for information. have_cgi() returns true if either mod_cgi or mod_cgid is installed. have_module() is more generic and can be used to test for either Apache C modules or Perl modules - for instance, have_module('Template') could be used to check whether the Template Toolkit is installed, as shown below.
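
As a quick sketch, a plan() line guarded this way skips the whole script when the Template Toolkit is missing:

  
plan tests => 1, have_module('Template');
  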

For generation of the CGI scripts, we use the t_write_perl_script() function from the Apache::TestUtil package. t_write_perl_script() takes two arguments, the first of which is the name of the file to generate, relative to the t/ directory in the distribution. If the file includes a path, any necessary directories are automatically created. In the interests of portability, we use catfile() from the File::Spec::Functions package to join the file with the directory. In general, you will want to keep File::Spec and its associated classes in mind when writing your tests - you never know when somebody is going to try and run them on Win32 or VMS. The second argument to t_write_perl_script() is a list of lines to append to the file after the (calculated) shebang line.

Although t_write_perl_script() cleans up any generated files and directories when the test completes, if we were to intercept include.cgi before removal it would look similar to something we would have written ourselves.

  
#!/src/bleedperl/bin/perl
# WARNING: this file is generated, do not edit
# 01: /src/bleedperl/lib/site_perl/5.9.0/i686-linux-thread-multi/
      Apache/TestUtil.pm:129
# 02: 06mod_cgi.t:18
print "Content-Type: text/html\n\n";
print '<strong><!--#echo var="DOCUMENT_URI" --></strong>';
  

As you probably have guessed by now, just as we ran tests against scripts in the (generated) t/cgi-bin/ directory, we can add other directories to t/ for other kinds of tests. For instance, we can create t/perl-bin/ to hold standard ModPerl::Registry scripts (remember, you don't need to generate a shebang line for those). We can even create t/My/ to hold a custom My::ContentGenerator handler, which can be used just like any other Perl module during Apache's runtime. All in all, you can simulate practically any production environment imaginable.

But Wait, There's More!

The tests presented here should be enough to get you started writing tests for your own modules, but they are only part of the story. If you are interested in seeing some of the other tests written to support this article, the Apache::Clean distribution is full of all kinds of different tests and test approaches, including some that integrate custom handlers as well as one that tests the POD syntax for the module. In fact, you will find 26 different tests in 12 test files there, free for the taking.

Stuck using mod_perl 1.0? One of the best things about Apache-Test is that it is flexible and intelligent enough to be used for mod_perl 1.0 handlers as well. In fact, the recent release of Apache-Test as a CPAN module outside of the mod_perl 2.0 distribution makes it even easier for all mod_perl developers to take advantage of the framework. For the most part, the instructions in this article should be enough to get you going writing tests for 1.0-based modules - the only changes specific to 1.0 modules rest in the Makefile.PL. I took the time to whip up a version of Apache::Clean for mod_perl 1.0 that parallels the functionality in these articles, which you can find next to the 2.0 version. The 1.0 distribution runs against the exact same *.t files (where applicable) and includes a sample 1.0 Makefile.PL.

Personally, I don't know how I ever got along without Apache-Test, and I'm sure that once you start using it you will feel the same. Secretly, I'm hoping that Apache-Test becomes so popular that end-users start wrapping their bug reports up in little, self-contained, Apache-Test-based tarballs so anyone can reproduce the problem.

More Information

This article was derived from Recipe 7.7 in the mod_perl Developer's Cookbook, adjusted to accommodate both mod_perl 2.0 and changes in the overall Apache-Test interface that have happened since publication. Despite these differences, the recipe is useful for its additional descriptions and coverage of features not discussed here. You can read Recipe 7.7, as well as the rest of Chapter 7, from the book's website. In addition to the Apache-Test manpage and README, there is also the Apache-Test tutorial on the mod_perl Project website; all of these are valuable sources of information.

Thanks

The Apache-Test project is the result of the tireless efforts of many, many developers - far too many to name individually here. However, there has been a recent surge of activity as Apache-Test made its way to CPAN, especially in making it more platform-aware and solving a few backward-compatibility problems with the old Apache::test that ships with mod_perl 1.0. Special thanks are due to Stas Bekman, David Wheeler, and Randy Kobes for helping to polish Apache-Test on Win32 and Mac OS X without requiring major changes to the API.

This week on Perl 6, week ending 2003-05-18

Welcome back to another Perl 6 summary, this week without any 'comic' introductions.

So, without further ado we press straight on to the happenings in perl6-internals.

Makefile issues

Bruce Gray sent in a patch to tweak Parrot's Makefile to ensure that IMCC could be built before doing either make test or make static. The various executables in the Parrot distribution now link against libparrot.a instead of just linking a bunch of .o files. Steve Fink liked the idea, and made a few other suggestions (building IMCC by default, having make test run the IMCC tests...). The patch was later applied by Bruce after Steve Fink gave him committer superpowers.

http://groups.google.com/groups

Even more on stack walking

The discussion of garbage collection and timely destruction came up again. Essentially, we want to have our cake and eat it too.

Here's some background. Consider the following piece of code:


    sub iterate_over_file {
        my($filename, $code) = @_;
        open my $fh, '<', $filename or die;
        &$code($_) for <$fh>;
    }

Unless I'm going mad, this should be legal code in both Perl 5 and Perl 6. Now, consider what happens when execution leaves the iterate_over_file function. In Perl 5 the file handle in $fh has its reference count decremented and, because there are no remaining references, the file is closed and the filehandle is destroyed. In a Perl 6 with no 'special case' garbage collection added, nothing happens. $fh only gets closed when a Dead Object Detection (DOD) run is triggered, which need not happen for a while, potentially leading to resource leaks or locking issues or other unforeseen consequences.

One way around this is for classes whose objects need timely destruction to tell Perl that there are some of its objects floating about so that Perl can trigger a DOD run at the end of every scope until there are no more such objects floating around. But that could be expensive (as Luke Palmer demonstrated with a neat piece of pathological code). Luke suggested a scheme where variables would have a needsDOD property (or some such thing) and a DOD would be triggered at the end of any scope that contained such a variable. Benjamin Goldberg offered a suggestion for a hybrid reference counting/full GC scheme.

Garrett Goebel pointed out that the details of how Perl 6 implements this are really irrelevant at the moment; Dan has said there will be a way to trigger a DOD run at any time, which means that Parrot should support whatever scheme the Perl 6 implementers come up with.

http://groups.google.com/groups

BASIC gets fancy

Clinton A. Pierce announced that the compiled version of BASIC under languages/BASIC/compiler now supports the QuickBASIC style COLOR, LOCATE and CLS statements. These additions prompted Leon Brocard to post a colour version of mandel.bas which generates a colour representation of the Mandelbrot set.

http://groups.google.com/groups

http://groups.google.com/groups

More sub/method call stuff

Dan announced that, now that we know what the calling conventions will be, we should look at how to make calls to parrot functions. He outlined a continuation passing style approach which looks like it should have all the right magic.

Hmm... I just tried to write a short description of continuation passing style and ended up duplicating most of Dan's post, but not so well written; you're better off reading that.

The general response to this was positive (once people had got over boggling at the idea of data encapsulation in an 'assembler').

There was also some discussion of how continuations, tied variables and hypotheticals would interact (answer: We don't know yet, but it looks interesting...)

http://groups.google.com/groups

Indexing registers or something

Klaas-Jan Stol wondered about adding some way of doing register based indirect addressing (as Dan called it), which would allow for iterating over a set of registers in a loop (amongst other things). Dan suggested a couple of ways of doing it, but wanted a usage case for compiler-generated code before he went implementing anything. Klaas-Jan suggested a Lua construct that's analogous to the Perlish ($a, $b, $c) = (f(), g(), h()) as an example of where such code would come in handy. Luke Palmer suggested a way of implementing that in code that wouldn't need the indirect register addressing, and noted that if indirect register addressing were implemented then IMCC would probably get very confused indeed.

http://groups.google.com/groups

Socket IO

Andrew The has been working on getting Socket IO working in Parrot, implementing a thin layer over the BSD socket functions and he had a few questions about the official way to do some things. Leo Tötsch answered some questions, and suggested that Andrew liaise with Jürgen Bömmels, who is working on Parrot's IO layer.

http://groups.google.com/groups

Using vtable macros

Leo Tötsch offered a couple of scripts to convert a Parrot distribution to use the new VTABLE_* vtable macros (it's done as a script because so many people have their own collection of patches in place that a simple CVS commit wouldn't catch all the vtable accesses).

http://groups.google.com/groups

Disable unused vtable entries

Leo Tötsch proposed disabling all vtable methods that are either unused or not covered by opcodes. He reckons that this should make changes to the class hierarchy and vtable layout much simpler. Warnock's Dilemma currently applies.

http://groups.google.com/groups

String->number problems

Clint Pierce noticed that


    set S1, "Not really a number 2"
    set N0, S1
    print N0
    end

gets its string-to-number conversion wrong and outputs 2.00000 rather than the expected 0.00000. Luke Palmer patched it, but Benjamin Goldberg pointed out a few problems to do with string encodings.
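
The expectation comes straight from Perl's own numification rules: only a leading numeric prefix counts, and a string without one converts to 0. In Perl 5 terms:

    # Perl 5 numification, which the PASM above should mirror
    no warnings 'numeric';
    print 0 + "Not really a number 2", "\n";    # 0
    print 0 + "2 looks like a number", "\n";    # 2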

http://groups.google.com/groups

Fixes to Parrot::Test

Bruce Gray tracked down and fixed some problems with make test seeing fake test failures, which turned out to arise from two interacting bugs in Parrot::Test.

http://groups.google.com/groups

Switched run core

Leo Tötsch supplied a patch to add a switched prederefed run core to Parrot's menu of run core options. It's slightly faster than the plain prederefed run core, and Leo thinks it should be the default run core when the computed goto and JIT cores are unavailable. Bruce Gray caught a problem with an embedded newline which broke a couple of the tinderboxes. As far as I know the patch has not yet been applied to the CVS distribution.

http://groups.google.com/groups

Meanwhile in perl6-language

The language list was quiet this week, with a grand total of 20 messages...

Contexts

Steve Fink had some questions about function signatures and how they interact with pairs. The upshot was a reminder: if you want to pass a literal pair into a function, you need to put it in parentheses, func(1, (a => 7)), or it will be interpreted as a named parameter, which throws an error if there isn't an appropriately named parameter or a slurpy hash param to catch it.
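
A sketch of the distinction, in the Perl 6 syntax of the day (func and its signature are my invention for illustration):

    sub func ($n, $pair) { ... }

    func(1, (a => 7));    # parenthesized: a => 7 is a literal pair,
                          # bound to $pair
    func(1, a => 7);      # bare: a => 7 is a named argument, and an
                          # error here, since func has no $a parameter
                          # and no slurpy hash to catch it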

http://groups.google.com/groups

The object sigil

Luke Palmer wants a sigil to explicitly disambiguate between a container and the thing contained (short-circuiting any language-level delegation) and proposed & as the appropriate sigil. I'm not entirely sure that you need anything more than \ and some carefully deployed parentheses. I also appear to have missed something, as there seemed to be an assumption that in code like this


    for 1..Inf -> $x { ... }

then $x wouldn't be a simple scalar, but an iterator, allowing you to write $x.next in the body of the loop, which I have to confess is news to me.

http://groups.google.com/groups

Acknowledgements, Announcements and Apologies

Thanks to everyone for everything. Particular thanks to whoever chose both my talks for YAPC this year; I really should get around to writing them soon. And fixing the massive bug in the module that one of the talks is all about...

If you've appreciated this summary, please consider one or more of the following options:

  • Send money to the Perl Foundation at http://donate.perl-foundation.org/ and help support the ongoing development of Perl.
  • Get involved in the Perl 6 process. The mailing lists are open to all. http://dev.perl.org/perl6/ and http://www.parrotcode.org/ are good starting points with links to the appropriate mailing lists.

CGI::Kwiki

This article is about a new Perl module called CGI::Kwiki. With this module you can create a Wiki Web site in less than a minute. Now that's quick. Or more appropriately, ``That's Kwik!''

If you've not heard of a Wiki, it's a Web site that allows you to add and edit pages directly from your browser. Generally, every page on the site has a link or button that will let you edit the page that you are reading. When you edit a page, the raw contents of that page come up in a text edit area in your browser. You can make any changes you want. When you hit the SAVE button, the changes become live.

To create a new page, you just create a link on the current page to a page that didn't exist before. Then when you follow the new link, you are allowed to edit the new page.

Knowledge of HTML is not a prerequisite for using a Wiki. It's not even a requisite, because the raw Wiki contents that you edit are not presented as HTML. Wikis use a much more natural markup that resembles the messages posted in Usenet news groups. An example can speak for itself:


    == A Page Header for a Sample Wiki Page ==

    Here's a list of some WikiFormattingCodes:
    * Lines that begin '* ' form a bulleted list
    * Asterisks might be used to mean *bold* text
    * Links like http://www.perl.com work automatically

The only markup that should require further explanation is the text WikiFormattingCodes. Capitalized words that are mushed together form a link to another page on the Wiki site.

A Wiki is simply a Web site that is easy for ordinary people to edit. So where did the Wiki idea come from and why is it important?

Ward's Wiki Wisdom

I've been dabbling in the world of Wiki for less than a year. Rather than answer that question myself, I decided to ask the inventor of the Wiki. Now by pure coincidence, Ward Cunningham lives but a few miles from my house and well within my telephone area code. I decided to drop him a line and find out his innermost feelings on his creation:

Brian: Yes, hello. May I speak to Mr. Ward Cunningham?

Ward: Who is this?

Brian: This is Brian Ingerson from Perl.com. I have a few questions.

Ward: Perl?! That's not me! Wall is to blame. Call him.

Brian: No. Wait. It's about the Wiki.

Ward: Ah, yes. The Wiki. Well let's get to business.

Brian: Why did you invent the Wiki?

Ward: Wiki had a predecessor that was a hyper-card stack. I wrote it to explore hypertext. I wanted to try recording something that was ragged, something that wouldn't fit into columns. I had this pet theory that programming ideas were spread by people working together. I set out to chart the flow of ideas through my company (then Tektronix). This turned out to be more fun than I ever would have imagined.

When we were really trying to capture a programmer's experience in software patterns, I remembered that stack and set out to do it over with the technology of the moment, the World Wide Web. This was 1994. I wrote Wiki to support and enlarge the community writing software patterns.

Brian: What do you see as Wiki's most-positive contribution to the world?

Ward: Back in 1994, the Web was a pretty wonderful place, with lots of people putting up stuff just because they thought someone else would find it interesting or useful. Wiki preserves that feeling in a place that has become too much of a shopping mall. It reminds people that sometimes to work together you have to trust each other more than you have any reason to.

Brian: Are you concerned that there are so many different Wiki implementations?

Ward: I was concerned once. I wish everyone used my markup instead of inventing their own. But that didn't happen. Now I realize that the implementations have done more to spread the idea than I ever could with my one version. That is the way it is with really simple things.

Brian: What programming language is your Wiki written in?

Ward: Um, ... Perl.

Brian: Tell me about that.

click

Wikis Wikis Everywhere

Just in case you didn't visualize the tongue entering and exiting my cheek, Ward does not have anything against Perl. To the contrary, he does almost all his open-source development with it, including his Wiki software. Try visiting his Wiki site, http://c2.com/cgi/wiki, for an excellent introduction to Wiki.

As was pointed out, there are many, many implementations that have sprung forth since the Wiki was invented, and many of those were written in Perl. That's because a Wiki is fairly easy to implement, and everyone seems to want to do it slightly differently.

Most of these implementations are just simple CGI scripts at heart. Even though they may have gathered dozens of special features over the years, they are really just ad hoc programs that are not particularly modularized or designed for extensibility.

One notable exception is the CGI::Wiki module by Kate ``Kake'' Pugh. This relatively new CPAN distribution is designed to be a Wiki framework. The various bits of functionality are encapsulated into class modules that can be extended by end users. As far as I know, this project is the first attempt in Perl to modularize the Wiki concept. It's about time!

The second attempt is a completely different module called CGI::Kwiki, the subject of this article. When I evaluated CGI::Wiki, I found it a little too heavy for my needs. It had about a dozen prerequisite modules and required an SQL database. CGI::Kwiki by comparison requires no extra modules besides those that come with Perl, and stores its Web pages as plain text files.

I find this preferable, because I can install a new Kwiki in seconds (literally) and I have the full arsenal of Unix commands at my disposal for manipulating the content. In fact, the default search facility for CGI::Kwiki is just a method call that invokes the Unix command grep.

Another compelling aspect of CGI::Kwiki is that every last bit of it is extensible, and extending it is trivial. About the only thing you can't easily change is the fact that it is written in Perl.

Because of this, I have probably set up more than a dozen Kwiki sites in the past month, and customized each one according to my needs. In this article, I'll show you how to do the same thing.

The Kwikest Way to Start

So just how easy is it to install a Kwiki? Well, that depends on how many of the basics you already have in place. You need a Web server and Perl, of course. You also need to have the CGI::Kwiki module installed from CPAN. That's about it.

For the sake of a specific example, let's say that you are running the Apache Web server (version 1.3.x) and that /home/johnny/public_html/cgi-bin/ is a CGI-enabled directory. With that setup in place, you can issue the following commands to create a new Kwiki:


    cd /home/johnny/public_html/cgi-bin/
    mkdir my-kwiki
    cd my-kwiki
    kwiki-install

Done! Your Kwiki is installed and ready for action. You should be able to point your Web browser at:


    http://your-domain/~johnny/cgi-bin/my-kwiki/index.cgi

and begin your wiki adventure.

At this point, if you do an ls command inside the my-kwiki directory, then you should see two files (index.cgi and config.yaml). index.cgi is just a point of execution for the CGI::Kwiki class modules, and config.yaml is little more than a list of which class modules are being used. You should also see a directory called database, where all your Kwiki pages are stored as individual plain text files.
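
For flavour, here's a stripped-down sketch of what a config.yaml might contain; only the two entries this article plays with later are shown, and the real file lists more:

    formatter_class: CGI::Kwiki::Formatter
    database_class: CGI::Kwiki::Database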

These files will become important later as we explore how to customize Kwiki to your personal needs or whims.

If you are having trouble configuring Apache for CGI, then here is the basic httpd.conf section that I use for my personal Kwikis:


    Alias /kwiki/ /home/ingy/kwiki/
    <Directory /home/ingy/kwiki/>
        Order allow,deny
        Allow from all
        Options ExecCGI FollowSymLinks Indexes
        AddHandler cgi-script .cgi
        DirectoryIndex index.cgi
    </Directory>

This allows me to connect with this URL:


    http://localhost/kwiki/

Using Your Kwiki

When you first visit your newly installed Kwiki, you'll notice that there are a number of default pages already installed. Most notable is the one called HomePage, because that's the one you'll see first. This page requests that you change it as soon as possible. Go ahead and give it a try. Click the EDIT button.

You should see the text of HomePage laid out in Kwiki format inside an editable text area. Make some changes and click the SAVE button. The first thing you'll probably want to know is exactly how all the little Kwiki markup characters work.

KwikiFormattingRules

CGI::Kwiki has a set of default formatting rules that reflect my favorites from other Wikis. Some are from WardsWiki, some from MoinMoin, some from UseMod. All of them are customizable. More on that shortly. For now, let's go over the basics.

The first thing to learn is how to create a link. A link to another page on the site is made by squishing two or more words together in CamelCase. If the page doesn't exist yet, then that's OK. Clicking on it will allow you to create the new page from scratch. This is how Wikis grow.

You can also create an external link by simply starting some text with http:. Like http://c2.com/cgi/wiki, the original Wiki Web site. Sometimes you want an internal link that isn't CamelCase. Just put the link text inside square brackets. If you want the link to be external, then add the http: component inside the brackets:


    [check_this_out]
    [check this out http://checked.out]

The second most common formatting rule I use is preformatted text. This is used for things like Perl code examples. Text that is preformatted is automatically immune to further Wiki processing. To mark text as preformatted, you just indent it. This is similar to the approach that POD takes:


        sub backwards_string {
            return join '', reverse split '', shift;
        }

One of the FormattingRules that I personally like is the ability to create HTML tables. You do it like this (if you're a bowler):


    | Player | 1   | 2   | 3   |
    | Marv   | 8-1 | X   | 9-/ |
    | Sally  | X   | X   | 8-1 |
    | Ingy   | 5-2 | 6-0 | 7-0 |
    | Big Al | 0-1 | 5-\ | X   |

(The people I bowl with usually get tired after three frames)

Tables are made by separating cells with vertical bar (or pipe) characters. Many times I need to put multiline text inside the cells. Kwiki accomplishes this by allowing a Here-Document style syntax:


    | yaml | perl | python |
    | <<end_yaml | <<end_perl | {'foo':'bar','bar':[42]} |
    ---
    foo: bar
    bar:
      - 42
    end_yaml
    {
      foo => 'bar',
      bar =>
        [ 42 ]
    }
    end_perl

Kwiki has a fairly rich set of default formatting rules. You'll find an exhaustive list of all the rules right inside your new Kwiki. The page is called KwikiFormattingRules. To find this page (and every other page on your Kwiki) click the RecentChanges link at the top of the current page.

KustomizingKwiki

To those of you familiar with the Wiki world, this has all been fairly pedestrian stuff so far. Here's where I think things get interesting. As I stated before, every last part of the Kwiki software is changeable, customizable and extensible. Best of all, it's easy to do.

CGI::Kwiki is made up of more than a dozen class modules. Each class is responsible for a specific piece of the overall Kwiki behavior. To change something about a particular class, you just subclass it with a module of your own.

Some of the more important CGI::Kwiki classes are:

  • CGI::Kwiki::Formatter - turns raw Kwiki markup into HTML
  • CGI::Kwiki::Database - reads and writes the page files in the database directory

These are the two classes we'll be subclassing shortly.

Kwiki knows what classes to use by looking in its config file. So if you want to subclass something, then the first thing to do is change the config.yaml entry to point to your new class. Let's start with an easy one.

A Kwik and Dirty Tweak

Kwiki will turn a word or phrase inside *asterisks* into bold text. This is similar to the way you might do it in a text e-mail. But WardsWiki uses '''triple quotes''' for bolding. Let's change your Kwiki to do it like Ward does.

First, create a file called MyFormatter.pm. You can put it right inside your Kwiki installation directory, and Kwiki will find it. The contents of the file should look like this:


    package MyFormatter;
    use base 'CGI::Kwiki::Formatter';

    # WardsWiki-style bolding: '''text''' becomes <b>text</b>
    sub bold {
        my ($self, $text) = @_;
        $text =~ s#'''(.*?)'''#<b>$1</b>#g;
        return $text;
    }

    1;

Now, change the config.yaml file to use this line:


    formatter_class: MyFormatter

The Kwiki formatting engine will now call your subroutine with pieces of text that are eligible to contain bold formatting. Sections of text that are already preformatted code will not be passed to your bold() method. And as you can see, MyFormatter is a subclass of CGI::Kwiki::Formatter, so all the other formatting behaviors remain intact.

Kwiki's Formatting Engine

Let's look under the hood at CGI::Kwiki's hotrod formatting engine. You'll need to be familiar with it to make any serious formatting changes. Conceptually, it's rather simple. It works like this:

  • The text starts out as one big string.
  • There is a list of formatting routines that are applied in a certain order.
  • The string is passed to the first formatting routine. This routine may change the original text. It may also break the text into a number of substrings. It then returns the strings it has created and manipulated.
  • Each of the substrings is run through the next formatting routine in line.
  • Sometimes, a formatting routine will want to make sure that no further routines touch a particular substring. It can do this by returning a hard reference to that string.
  • After all the substrings have been passed through every routine, they are joined back together to form one long string.

The specific routines and their order of execution are determined by another method called process_order(). The process_order method just returns a list of method names in the order they should be called. The default process_order method is:


    sub process_order {
        return qw(
            function
            table code header_1 header_2 header_3 
            escape_html
            lists comment horizontal_line
            paragraph 
            named_http_link no_http_link http_link
            no_wiki_link wiki_link force_wiki_link
            bold italic underscore
        );
    }
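
To make the flow concrete, here's my sketch of the dispatch loop described above (an illustration of those six steps, not the actual CGI::Kwiki source):

    sub process {
        my ($self, $text) = @_;
        my @chunks = ($text);
        for my $method ($self->process_order) {
            # each routine sees every chunk that's still plain text;
            # hard references mark chunks that are finished
            @chunks = map { ref($_) ? $_ : $self->$method($_) } @chunks;
        }
        # unwrap the protected chunks and glue everything back together
        return join '', map { ref($_) ? $$_ : $_ } @chunks;
    }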

The best way to get a good feel for how to do things is to look over the CGI::Kwiki::Formatter module itself.

KontentKontrol

The biggest fear that many people have of setting up a Wiki site is that someone will come along and destroy all their pages. This happens from time to time, but in general people just don't do it. It's really not that cool of a trick to pull off. Someone could even write a program to destroy a Wiki, but if they were that smart, hopefully they'd be mature enough not to do it.

As of this writing, CGI::Kwiki doesn't do anything to protect your data. But remember, it's just code. Let's extend your code to do a simple backup every time a page is written.

Possibly the simplest way to back up files on Unix is to use RCS. Let's make the Kwiki perform an RCS checkin every time it saves a page.

This time we need to extend the database class. Change the config file like so:


    database_class: MyDatabase

Then write a file called MyDatabase.pm that looks like:


    package MyDatabase;
    use base 'CGI::Kwiki::Database';

    sub store {
        my $self = shift;
        my ($file) = @_;
        $self->SUPER::store(@_);
        # check the saved page into RCS, quietly, keeping the lock
        system(qq{ci -q -l -m"saved" database/$file backup/$file,v});
    }

    1;

Note: Be sure to add a backup directory that the CGI program can write to:


    mkdir backup
    chmod 777 backup

In this case, the store method calls its parent method to handle the actual database store, but then invokes an extra RCS command to back up the changes to the file.

Hopefully these examples will give you an idea of how to go about making other types of modifications to CGI::Kwiki. If you make a whole set of cohesive and generally useful extensions, then please consider putting them on CPAN as a module distribution.

A Kwiki in Every Pot

The classic use for a Wiki site is to provide a multi-user forum for some topic of interest. In this context, Wiki is a great collaboration tool. People can add new ideas, and revise old ones. The Wiki serves as both an archive and a news site. Most Wikis provide a search mechanism and a RecentChanges facility.

But I think this only scratches the surface of Wiki usage possibilities. Since a Kwiki is so easy to create, I now find myself doing it all the time. It's almost like I'm creating a new Wiki for every little thing I set out to do. Here are a few examples:

  • Personal Planning

    I have a personal wiki for keeping track of my projects. I keep it on my laptop.

  • Module Development

    Every Perl module I write these days has its own Kwiki in the directory. I use them mainly for creating Test::FIT testing tables. (See Test::FIT on CPAN). But I can also use it for project notes and documentation. Since I can extend the Kwiki, I can make it export the pages to POD if I want.

  • Autobiowiki

    I am seriously considering writing the stories of my life in a Wiki. If I can get others to do the same, then the Wikis could be linked using Ward's SisterSite mechanism. This would create one big story. (See http://c2.com/cgi/wiki)

  • Project Collaboration

    For my bigger projects I like to create a user community based around a Wiki. Using Test::FIT I can actually get my users to write failing tests for my projects. And they can help write documentation, report bugs, share recipes, etc. (See http://fit.freepan.org and http://yaml.freepan.org)

Conclusion

One final point of interest: this entire article was written in Wiki format. I needed to submit it to my editor in POD format, which he in turn formatted into the HTML you are reading now. I accomplished this by simply using an extension of CGI::Kwiki::Formatter that produces POD instead of HTML!
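
As a taste of how such a formatter might begin, here's a sketch of my own (not the actual program from the URL below):

    package PodFormatter;
    use base 'CGI::Kwiki::Formatter';

    # One rule overridden as a flavour: *bold* becomes POD's B<...>.
    # A real POD exporter would override most of the process_order
    # methods the same way.
    sub bold {
        my ($self, $text) = @_;
        $text =~ s/\*(.*?)\*/B<$1>/g;
        return $text;
    }

    1;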

NOTE: The raw content of this article along with the formatter program can be found at http://www.freepan.org/ingy/articles/kwiki/

Editor's note: http://www.kwiki.org has been created as the official kwiki home page.

About the Author

Brian Ingerson has been programming for more than 20 years, and hacking Perl for five of those. He is dedicated to improving the overall quality of scripting languages, including Perl, Python and Ruby. He currently hails from Portland, Ore., the very location of this year's O'Reilly Open Source Convention. How convenient!

This week on Perl 6, week ending 2003-05-11

PEPPERPOT 1: Ooh look! It's another Perl 6 summary!

PEPPERPOT 2: Well I never did! What's in it this week?

PP 1: Ooh, I don't know. Shall we have a look?

PP 2: Ooh yes.

Following a convoluted animation sequence involving surreality, fig leaves, classical statuary, and a cardboard cutout of an orange Leon Brocard, the camera pulls back to reveal a large bearded man, sat in a leather chair with a huge Apple PowerBook on his lap. PIERS looks from the screen to the camera and says

PIERS: I never wanted to be a summary writer. I wanted to be a lumberjack, leaping from tree to tree as they float down the mighty rivers of Briti... Ah... *ahem*.

PIERS comes to a stammering stop as he realises that quoting Python is so 20 years ago and that this year all the cool London.pm kids are quoting Buffy the Vampire Slayer instead.

Anyhoo.

As you may have gathered from the above introduction, it's been fairly quiet this week. However, design, development and discussion continues. As usual we'll start with perl6-internals.

Long option Processing

Towards the end of April, Luke Palmer had posted a patch implementing a long option processing subsystem for Parrot. After wondering if getopt_long was a standard function (it isn't), Steve Fink applied the patch.

http://groups.google.com/groups

Excessive memory usage?

Leo Tötsch responded to Peter Gibbs' question of the week before about Parrot's apparently excessive memory use for a simple program. Leo and Peter batted the issue back and forth a few times in the process of tracking it down. I think they got to the root of the problem, but I'm not sure they've worked out a fix just yet.

http://groups.google.com/groups

NCI and handling of generic buffers of stuff

Last week Clinton A. Pierce had wondered about allocating blocks of 'generic' memory within Parrot for use as targets for native functions (in his case for calling Win32 functions). Piers Cawley wondered if it wouldn't be possible to implement a scratch buffer PMC to manage such blocks of memory. Dan pointed out that this was exactly what the 'unmanagedstruct' PMC type was intended for, but Clint wasn't sure that this did quite what he wanted and asked for more advice.

Later in the week, Clint announced that he'd got UnManagedStruct PMCs doing some useful work, including allocating arbitrarily sized buffers from Parrot assembly. People were impressed (in fact, Dan went so far as to call Clint his hero in his 'Squawks of the Parrot' weblog). However, Leo Tötsch pointed out that a good deal of the work that Clint had done on UnManagedStruct probably belonged in ManagedStruct instead. Apparently Clint (and Dan) hadn't noticed that ManagedStruct existed. Hopefully, now that it's been pointed out, things will become a little more logical.

http://groups.google.com/groups

http://groups.google.com/groups

http://www.sidhe.org/~dan/blog/archives/000186.html

Calling convention changes

After Dan's last set of changes to the Parrot calling conventions, Brent Dax wondered why we still had a user stack now that it was no longer being used for argument passing. Klaas-Jan Stol reckoned it was still useful and would make targeting Parrot easier. Luke Palmer wondered if the user stack wouldn't make it harder to implement an exception system.

Interestingly, Dan seems to be toying with having Parrot use Continuation Passing Style function calls at some point in the future. Who knows, maybe the next final calling conventions will mandate CPS (I confess, I rather like the idea myself...)

http://groups.google.com/groups

http://www.sidhe.org/~dan/blog/archives/000187.html

Building Parrot on Windows

Getting Parrot working well in a Win32 environment seems to have been a theme over the last few weeks. This week Cal Henderson popped up with some problems getting things to link. The problem was tracked down quite quickly, and Cal ended up offering to do regular Windows builds for the tinderbox system, though it looks like that's not yet as easy as it could be. Apparently the standard Win32 make targets don't quite build enough. I'm sure someone's working on fixing this; getting Win32 into the tinderbox will be really handy.

http://groups.google.com/groups

IMCC vs. Parrot assembler

Zellyn Hunter wondered how to choose between IMCC and the native Parrot assembler. (Zellyn also presented me with a dilemma about which personal pronoun to use, which I solved by going and looking at his website. Unless 'Leah' turns out to be a chap's name, I think I'm safe.) He also asked about the state of the documentation within CVS, and wondered which documents were still valid and which (if any) had been caught out by the march of development, suggesting a small meta document with information on the validity of the various docs (which begs the question of how one knows that the metadoc itself is up to date).

The answer to the first question was ``Well, it depends'', with the rider that, unless you enjoyed doing your own register allocation, IMCC looked like the saner choice. Jerome Quelin pointed out that the Ook! compiler targets PASM, which seems to be in agreement with the whole sane/unsane thing.

http://groups.google.com/groups

More on stack walking

David Robins read Dan's blog entry about stack walking and had a few questions about the details. He wanted to know what was done about false positives (i.e., how do you know that something that looks like a pointer really is one), and what was done about object destruction. Dan replied that, in one of the Parrot GC system's few conservative choices, Parrot took the view that if something in the system stack looks like a pointer then it's safest to assume that it is a pointer and proceed on that basis. He also noted that, if a language required a specific order of object destruction, then it was down to the language to handle that.

The usual spectre of timely closure of IO handles was raised again -- Perl 5 guarantees that objects get destroyed (and therefore handles get closed) as soon as they go out of scope. This is somewhat harder to guarantee in a system that doesn't use reference counting for garbage collection. Graham Barr commented that ``Anyone who depends on GC for the closing of file handles asks for all they get'', which is certainly one way of approaching the issue, but it does rather fly in the face of the Perl 5 Way.

http://groups.google.com/groups

http://www.sidhe.org/~dan/blog/archives/000174.html -- Dan explains stack walking

PIO work

Jürgen Bömmels continued his sterling work on the Parrot IO system (PIO), fixing up the problems with buffering, double frees and the like that were mentioned last week.

sysinfo op

Dan added a sysinfo op which allows a running program to at least get some information about the machine it's running on. Things are apparently somewhat spotty for various architectures that Dan didn't have immediate access to, but there's a framework in place now and patches are welcome.

http://groups.google.com/groups

Meanwhile in perl6-language

The language list was again busier than the internals list this week, but only just. Maybe we're all waiting with bated breath for the next exegesis. Or maybe everyone's exhausted. It can't last though.

bool Type

Discussion of the hypothetical/non-existent bool type continued. I tried really hard to care.

Coroutine calling convention

Luke Palmer continued to argue that coroutines were a useless/dangerous bit of syntactic sugar and that we'd be better off using explicit iterator objects. Meanwhile, others discussed the various options for calling semantics in the case where coroutines continued to look like coroutines instead of iterators. Damian Conway proposed another form of coroutine calling convention: that yield return the 'new' arguments with which you 'resumed' the coroutine. This gives the programmer the option to handle different arguments however she likes. Piers Cawley liked the idea and posted a few snippets of code implementing these semantics using continuations. He also suggested that it might be handy to be able to declare 'Blocklike' subroutines that were 'return transparent'. I think the continuations scared people off, because there has been no comment, even on the Blocklike subs idea.
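
Perl 5 has no continuations to play with, but the iterator half of the argument is easy to counterfeit with a closure:

    # Not a coroutine, just a closure that remembers where it left off.
    sub counter_from {
        my $n = shift;
        return sub { $n++ };    # each call resumes with the saved $n
    }

    my $iter = counter_from(10);
    print $iter->(), "\n" for 1 .. 3;    # prints 10, 11, 12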

http://groups.google.com/groups

http://groups.google.com/groups -- Damian's proposal

Include macro

Uri Guttman, may his name be forever mispronounced, responded to last week's summary (instead of the original thread, bad Uri!) by pointing out that the simple-minded include macro referred to in that summary was inefficient, and proposed an optimization. This turned into a discussion of whether file reads should be 'slurpy' by default (Michael Lazzaro thought they should be; almost everyone else disagreed with him).

Then Damian popped up with his proposed Grand Unification of HERE docs, POD, and general file slurping, which looks very cool, but unfortunately, the unary << operator that Damian proposes to hang all this off seems to clash with Larry's proposed use of <<list of words>> as a replacement for qw/.../. The idea was well received, and people started casting around for a sensible operator and adding other neat ideas to go with it. Damian won the 'evil, evil, evil' prize when, after <<< had been proposed as the file slurp operator, he came up with the idea of vectorizing it, which would lead to such delights as

    my($stdout, $stderr) = >><<<<< open 'grep foo * |';

Or, for the avoidance of headaches:

    my($stdout, $stderr) = >> <<< << open 'grep foo * |';

(I hate to think how one would write that in a POD C<...> escape).

This then turned into a discussion of whether <<...>> made sense as a replacement for qw<...>, especially given the perceived utility of Damian's proposal.

Towards the end of the thread, Michael Lazzaro withdrew his contention that slurpy reads should be the default.

http://groups.google.com/groups

http://groups.google.com/groups -- Conway's Unified Here Theory

Acknowledgements, Announcements and Apologies

The camera pulls back from observing a computer screen on which the words ``The camera pulls back from...'' are being typed and swings around to focus on the typist who looks up from the screen and addresses the camera.

PIERS: And so, by some extraordinary coincidence, fate, it seemed, had decided that you, my readers, should keep your appointment with the end of this summary. But what of the new technologies that had been spoken of?

AUDIENCE: Where *BLEEP*?

PIERS: In an empty house?

AUDIENCE: When *BLEEP*?

PIERS: In the middle of the night?

AUDIENCE: Best time!

PIERS: What diabolical plan had been shaped by Damian and Larry's crazed imaginations? What indeed? From what had gone before it was clear that this was going to be...

AUDIENCE: a picnic?

PIERS: no picnic.

AUDIENCE: Awwwww!

The lights dim, the screen fades to white and the following words scroll up the screens of browsers the world over:

If you've appreciated this summary, please consider one or more of the following options:

  • Send money to the Perl Foundation at http://donate.perl-foundation.org/ and help support the ongoing development of Perl.
  • Get involved in the Perl 6 process. The mailing lists are open to all. http://dev.perl.org/perl6/ and http://www.parrotcode.org/ are good starting points with links to the appropriate mailing lists.
  • Send feedback, flames, money, photographic and writing commissions, or a better framing device than terrible pastiches of Monty Python and the Rocky Horror Show to p6summarizer@bofh.org.uk. People suggesting an MST3K version of this summary can jolly well do it themselves.

This week on Perl 6, week ending 2003-05-04

Welcome, my friends, to the show that never ends. Yes, another week, another Perl 6 summary, chock full of condensed goodness, Leo Tötsch admiration and a smattering of information about the design and development of Perl 6 and its target virtual machine, Parrot.

A quiet week this week. Even the hotbed of discussion that is perl6-language saw fewer than 100 messages. However, in accordance with tradition, I'll start with perl6-internals, which saw all of 47 messages this week, none of them from Leon Brocard.

External Data Interfaces draft PDD

Discussion of the External Data Interfaces PDD continues. Hopefully we'll see the first 'real' version soon.

PMC Keys

Alberto Simões asked for a good description of PMC Keys. No answer yet.

http://groups.google.com/groups

Long option processing

Luke Palmer sent a patch to do long option parsing. Again, Warnock's Dilemma applies.

http://groups.google.com/groups

Problem with readline

Will Coleda announced that he was dusting off his TCL project and found that it threw lots of bus errors. He tracked the problem down to the readline op. Benjamin Goldberg realised that what was happening was that a file descriptor was being used as a pointer to a FILE data structure. Which is never going to be good. (This would never happen in a language with typed values. Well, it might, but the error would be caught before the bus error.) No fix yet.

http://groups.google.com/groups

Read buffering in PIO

Possibly prompted by Leon Brocard's nudge the week before, Dan Sugalski took another look at Jürgen Bömmels' rejected patch to add read buffering to the Parrot IO subsystem. Apparently there's a problem with a double free introduced by the patch. After some discussion (including a contribution from Melvin Smith, PIO's original author) of how to address the issue, Jürgen submitted another patch.

http://groups.google.com/groups

Excessive memory usage?

Peter Gibbs 'showed off' a short piece of PASM code that managed to use some 60MB of memory and allocate 1.5 million headers. He thought this a little excessive. No comment yet.

http://groups.google.com/groups

Extending pop

Klaas-Jan Stol wondered if it would be useful to have a variant of pop which could pop multiple items off the runtime stack. Dan thought it was a good idea, added a bunch more useful tricks involving stack marks and the like, and asked for volunteers to implement it. Nobody has explicitly stepped up to that particular plate...

http://groups.google.com/groups

Clint Pierce shows off his 'mad NCI skeelz'

Clinton A. Pierce has been playing with NCI on Win32 and has now got his Parrot BASIC calling Win32 functions natively. ``Mua-hahahaha'', as he so eloquently put it. He's now looking for a way to allocate a generic memory area in PASM for use as the target of a function, which should allow him to make even more Win32 calls without having to write an adaptor library in C first.

http://groups.google.com/groups

Dan changes the calling conventions again.

Dan released the final set of calling conventions again. He thinks he won't have to do this again. Again. The changes are all in PDD03 (docs/pdds/pdd03_calling_conventions.pod in the CVS version of the Parrot distribution). The big change is that we no longer use the stack at all for passing parameters; we use an overflow array instead. The smaller change is that the PDD has been clarified somewhat.

http://groups.google.com/groups

And that about wraps it up for the internals mailing list. However, Dan has been publishing some handy stuff in a new ``What the heck is ...?'' series on his 'Squawks of the Parrot' website.

http://www.sidhe.org/~dan/blog/archives/000174.html - Walking the stack

http://www.sidhe.org/~dan/blog/archives/000178.html - Coroutines

Meanwhile, over in perl6-language

There was lots more about types. And some new stuff too...

``I thought there was no bool type?''

Towards the end of last week, Smylers queried the 'bool' entries in the latest draft of Michael Lazzaro's Type Conversion Matrix. The thing is, Larry has said that there won't be a boolean type. Smylers was not the first person to make this comment, and he probably won't be the last. The stock response is along the lines of ``There's no bool type, but there is bool context'', but according to Synopsis 6 this appears not to be the case anymore.

Then it all got a bit weird. Nobody quite asked ``What is truth anyway?'', but it was touch and go for a while as people discussed what the value of a bool would be in a numeric context. (I had a good deal of sympathy for the view that there should be some kind of warning...) We eventually ended up in a discussion of multistate logic (which, it seems to me, is a candidate for 'something you implement in a module' status). Who knows where we'll end up this week.

http://groups.google.com/groups

Headers

Paul Hodges wondered if it would be possible to write something in Perl 6 that worked in a similar way to the C preprocessor's #include directive, allowing him to push a common set of use statements and other compile-time declarations into a header file, which could then be shared by multiple bits of code. Various people suggested more or less tricky options, but Marco Baringer won the 'simplest thing that could possibly work' prize from your summarizer with:


    macro include ($file) {
      join '', <open $file>;
    }

which does the job very straightforwardly.

http://groups.google.com/groups

Type Conversion Matrix (Take 3)

Michael Lazzaro posted his third attempt at a type conversion matrix. This triggered some discussion on the difference between primitive types (int, float, etc.) and 'full' types (Int, Float). Michael worried that some of the discussions were bloating the primitives, leading him to wonder what the point of using them would be if that happened.

It's apparent from the discussions here that the distinction between storage and value types enables a good deal of magic, but that scares people too.

http://groups.google.com/groups

Property Inheritance

David Wheeler popped up with something of a head-scratcher. He wanted to know whether, when a method is overridden in a subclass, the overriding method inherits the traits (he said properties, but I think he meant traits) of the method it overrides. Luke Palmer thought it probably depends on the trait, but his guess was that traits would be inherited. Which led David to wonder if that meant you could override traits in a child class.

http://groups.google.com/groups

OO-overhead

For some reason, a discussion on structuring OO code in Perl 5 has been running in perl6-language for a while. It's handy for spotting issues, but not exactly on topic. It is to be hoped that Perl 6 will lose a lot of Perl 5's function and method call overhead, though.

Chaining postconditionals

Michael Lazzaro asked for the rationale behind disallowing stuff like:


    return if <expr> for <list>

Short answer: Because Larry said so.

Supporting evidence (from the thread that developed anyway):


    foo $_ if baz for @list unless quux.next() while 1;

or:


    if $X {...} if $Y

http://groups.google.com/groups

Coroutine calling convention

Luke Palmer kicked off some discussion of the various coroutine calling conventions that Dan had discussed on his Squawks of the Parrot website. Luke thought that coroutines should return iterator objects instead of the yielded value. Which would mean they weren't actually coroutines...

http://groups.google.com/groups

http://www.sidhe.org/~dan/blog/archives/000178.html - Dan talks Coroutines


Acknowledgements, Announcements and Apologies

So, another summary draws to a close on a glorious May afternoon. Here's to the next one. Thanks to those people who sent the proof I asked for in the last summary. No thanks to the gits who outnumbered them by sending spam to the same address.

If you've appreciated this summary, please consider one or more of the following options:

  • Send money to the Perl Foundation at http://donate.perl-foundation.org/ and help support the ongoing development of Perl.
  • Get involved in the Perl 6 process. The mailing lists are open to all. http://dev.perl.org/perl6/ and http://www.parrotcode.org/ are good starting points with links to the appropriate mailing lists.

2003 Perl Conferences

The season of Perl conferences is almost upon us! In fact, the first of the YAPCs for this year is in less than a week. So I thought this would be a good time to give a tour of the various conferences and see what's going on at each.

YAPC::Israel

The first of the conferences, starting on Sunday, May 11, is YAPC::Israel. If you were planning to attend this, then please note that that's one day earlier than originally announced. It'll be at Haifa University, at the Caesarea Rothschild Institute.

What's going on? Well, there'll be a variety of presentations in Hebrew and in English: Reuven Lerner will be teaching the five things every Perl programmer should know; Yevgeny Menaker, author of "Programming Perl in the .NET Environment," will be talking about .NET; and there's a strong X Windows theme to the conference, with a Tk tutorial from Ido Trivizki, Ran Eliam presenting a new UI toolkit, and Mikhael Goikhman talking about X11 window managers and Perl.

The usual collection of lightning talks will take place, and special guest Mark-Jason Dominus will be presenting a set of lightning talks all by himself. Mark is also keynoting with a retrospective of the Quiz Of The Week. After the conference, Mark will be giving three one-day training courses. (Not included in the conference.)

All this and more for 150 Shekels. Can't be bad.

YAPC::Canada

Hot on the heels of the Israel conference will be the Canadian YAPC, on May 15-16. This will cost 85 Canadian dollars and includes dinner, a barbecue and CD-ROM proceedings.

YAPC::Canada will be keynoted by Dick Hardt of ActiveState - Dick will be talking about the history of Perl on Windows.

There'll be both free and professional tutorials at this YAPC, with Peter Scott giving tuition on object-oriented Perl. The two free tutorials are Mick Villeneuve's "Yet Another Perl Tutorial" for first-time Perl programmers, and Steve Jenkins' "Introduction to Regular Expressions."

The presentations will be divided into a beginner and an advanced stream; highlights include talks on the Perl debugger, Class::DBI, event loops, extreme testing, and Perl date and time handling.

Perl Whirl

In June, some of us lucky individuals will be talking about Perl on a ship in the waters around Hawaii for the 2003 Perl Whirl. There'll be a strong Perl 6 theme, with four half-day courses on Perl 6. Randal Schwartz will be presenting his packages, objects, references and modules tutorial to complement the new sequel to Learning Perl. MJD will be there, too, giving his famous Red Flags tutorial, and a half-day tutorial on iterators and generators.

There'll be other talks, of course, although the program isn't yet finalized. I shall attend, talking about how you can use Perl 6 right now - in a manner of speaking. ... And, of course, there'll be Hawaii, the Wizards' Cocktail Party, and much more.

YAPC::NA

In mid-June comes the main Yet Another Perl Conference. This year, it's in Boca Raton, Fla., from June 16-18. The cost is $85.

Tutorials include Damian Conway's "Everyday Perl" and Tim Maher's "Minimal Perl," Randal's PROM tutorial again, and Abigail's half-day regex tutorial. Larry and Damian will present a two-hour slot on Perl 6 on the final day.

Other short talks that caught my eye were Helen Cook talking about video manipulation; Schuyler Erle on geocaching; and two talks from Piers Cawley, on refactoring and the Pixie object persistence framework.

OSCon/TPC 7

And then in July we have the main event, The Perl Conference, with the theme of "Embracing and Extending Proprietary Software."

Again, Perl 6 will figure extensively, with talks from Damian on "What's New in Perl 6," the new Perl6::Rules module, Dan Sugalski and Leon Brocard on Parrot, and Allison Randal explaining the Perl 6 design philosophy.

For those of us still using Perl 5, I would recommend Ask Bjoern Hansen and Robert Spier's explanation of the perl.org single-signon process, Andy Wardley talking about what's new in Template Toolkit 3, and the unmissable random talk by MJD.

Of course, there's more than just Perl going on at OSCon - you can drop into sessions on Apache, MySQL and PostgreSQL, Python and Ruby, the XML track and the eclectic "Emerging Topics" track. There will also be keynotes from Weta Digital on animating the "Lord of the Rings" movies, Mitch Kapor on his open-source desktop applications project, and George Dyson on "Von Neumann's Universe."

It's all happening at the Portland Marriott, from July 7-11. Make sure you're there.

YAPC::Europe

Less than two weeks after TPC comes the final event of the conference season, the European YAPC, at the CNAM Conservatoire in Paris.

Details of speakers are not yet confirmed, but we know it'll be July 23-25, and it'll be vast quantities of fun.

And finally ...

In case you're overwhelmed by the number of conferences on this year, I maintain an at-a-glance calendar at iCal Exchange; if you're a user of Apple's iCal, you can subscribe to the Perl Conferences calendar at http://www.icalx.com/public/PerlConf/Perl32Conferences.ics

I hope to see you at one or more of the above!
