August 2005 Archives

This Week in Perl 6, August 17-23, 2005


All--

Welcome to another Monday summary, which hopefully provides some evidence that Mondays can get better. It always feels like writing summaries is an uphill battle, so perhaps I should switch to writing about Perl 6 Language first and Perl 6 Compilers last. Then it will be downhill--maybe next time.

Perl 6 Compilers

More Random Pictures

Autrijus posted links to two more images he had created. This time the images were not about Pugs, but were just kind of cute. He also provided an explanation of one when prompted.

Methods as Functions

Yiyi Hu noticed that it was impossible to use a method of one argument as a function. Autrijus offered Yiyi a commit bit, but also kindly posted the resolution to Perl 6 Language. Thanks, Autrijus!

Methods on Code Blocks

Yiyi Hu discovered that { a b c }.pairs would cause Pugs to spin out of control. Luke Palmer fixed it. Hopefully one of the two of them added a test.

Autrijus' Secret Sauce

Kris Bosland asked a few questions that I had also been wondering about regarding Autrijus' new graphics. Autrijus kindly provided the answers.

Documentation Attack

Kevin Tew has decided the best way for him to delurk is to update documentation for Pugs. Dudley Flanders and chromatic both provided support, information, and suggestions for him.

Neko VM

Nicolas Cannasse announced his release of a high-level, multi-language VM and wondered what others thought of creating a Perl 6-to-Neko compiler. Autrijus and Leo provided a few corrections and comments.

Parrot

The FAQ, She is Gone!

Amias Channer noticed that the FAQ on parrotcode.org was gone. While no one responded, the FAQ appears to be back.

Platform-Specific C Files

Stephen Hill wanted to know where to put a platform-specific C file to provide missing functionality. Leo provided a few friendly pointers.

TclArray.get

Amos Robinson provided an implementation of get for TclArray. Will gratefully applied the patch.

ICU Being Passed Up

Adrian Lambeck wondered if Configure.pl was passing up ICU. Warnock applied, so Adrian took matters into his own hands by providing a possible solution. Jerry Gay offered to take ownership of the problem if no ICU-enabled soul picked it up. There have been no progress reports since then, though.

Java on Parrot

Tim Bunce asked some preliminary questions about running Java with Parrot. I provided preliminary answers, and Nattfodd and Autrijus posted links to related work. The important question of what to call it remained unraised. I vote for "Jot."

gdbmhash.t Failures

Tim Bunce noticed that gdbmhash.t was failing with an unhelpful error message. Andy Dougherty provided a patch that made the error message slightly more helpful. Jerry Gay applied it.

BEGIN Blocks

Leo posted some thoughts and information about BEGIN blocks in Perl 6 and the @IMMEDIATE pragma in PIR. It involved creating constant PMCs and freezing them into the bytecode. Then he made it work.

Amber for Parrot

Citing chatter overheard on its intelligence networks, Parrot raised the terror alert to Amber, or maybe Roger Browne released version 0.2.3 of his "Eiffel-like" scripting language, Amber. I can never keep track of these things.

Tcl parray

Amos Robinson offered to provide an implementation of Tcl's parray, including tests. Will wanted to apply it, but the attachment did not come through.

Parrot Vs. Neko

Nicolas Cannasse wondered why Parrot performed so poorly on the fib benchmark. Leo explained that this benchmark stressed a currently unoptimized portion of Parrot (function calls). He also provided a few pointers on which benchmarks Parrot does well.

Using PMCs from C

Klaas-Jan Stol's Lua compiler uses only PMCs. Thus, he wanted to know how to access these PMCs from NCI functions. Leo provided an answer, but also suggested he look at the new calling conventions, which perform auto-conversion in both directions.

PMC for Reference Counting

Nicholas Clark posted a relatively full analysis of how to generalize the DOD's registration system for further reuse. He also asked for ideas about names. I think the whole thing looks good and that "AddrRegistry" is a good name. Perhaps that has too many vowels--"AddrRgstry" and sometimes "AddrRgstr" might work.

Perl 6 Language

Type Inferencing in Perl 5

Autrijus (while discussing type inference in Perl 6) recalled that there was a Google Summer of Code project on type inferencing in Perl 5. Gary Jackson, the summer coder, provided a more detailed description of his work.

+"hello"

Daniel Brockman wondered if +"hello" still evaluated to a NaN. Larry reasoned that it might, and then went on to speculate about what the extra exception information would do when a Num gets jammed into a num.

Generic Classes

Autrijus found the frequent use of generic classes confusing, as he thought that only roles were type parameterizable. Larry explained that roles could be promoted to classes pretty easily, but that the distinction between them was still useful and meaningful.

GC API

David Formosa posted a revised GC API after the previous discussion. More discussion ensued.

Name Conflicts

Yiyi Hu wondered what would happen if he declared two lexicals with different declarators. Larry answered that it would be a compile-time error.

Parsing Numbers

Ingo Blechschmidt posted a list of different possible ways to write numbers, asking which were valid and which not. Many weighed in, including Larry.

Bindings and Routine Signatures

Luke Palmer noticed that implementing binding as anonymous subroutines and then binding existing variables created delimited continuations, and binding globals, full continuations. While interesting, Warnock applies.

Visibility of $?SELF and $?CLASS

Stevan Little wondered what scopes will have $?SELF and $?CLASS available to them. Larry provided answers.

"Time to Take Her Home Her Dizzy Head is Conscience-Laden"

Amusingly enough, the thread about time has a big gap between July 5 and August 15. The thread also reminded me why I have an analog watch.

Is Params::Validate Necessary?

Dave Rolsky hoped that Params::Validate would no longer be necessary in Perl 6. This led to much discussion of the parameter declaration syntax in Perl 6 and a few suggested changes.

Constants are Dead; Long Live Read-Only!

Apparently is constant is gone and is readonly is here. The discussion contains more than that, but that is my take-away point.

Multidimensional Hyper Ops

Luke Palmer wondered how hyper ops would work on multi-dimensional inputs. The short answer is "recurse when possible, apply when not."

Serializing Code

Yuval Kogman posted an analysis of a new HTML::Prototype module that hinges on serializing code between the various layers of implementation. Many folks thought this was cool and discussion ensued.

Slurpy Hash

Luke Palmer wondered if one could bind a slurpy hash by name. The answer is no.

Making Pairs Less Magical

Luke Palmer wants pairs to be less magical, as their special treatment has caused much confusion of late. Much discussion continues.

Lazy Scalars?

Yiyi Hu wants lazily evaluated scalars. Ingo Blechschmidt, Luke Palmer, and Larry all provided ways to achieve that end. The simplest solution is to create an anonymous closure, it seems.

Using Foreign Languages

Ingo Blechschmidt wondered how to use identifiers from other languages that do not have compatible identifiers. Yuval reasoned that it would be dangerous to try to accommodate them too closely. Perhaps something like the Sinhala "karenawa," which marks the preceding word as being foreign (English specifically), will work?

Symbolic References

Ingo Blechschmidt wondered how to use symbolic references to magic variables such as $?SELF. Larry provided a few answers.

The Usual Footer

To post to any of these mailing lists please subscribe by sending email to perl6-internals-subscribe@perl.org, perl6-language-subscribe@perl.org, or perl6-compiler-subscribe@perl.org. If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send feedback to

Perl Needs Better Tools


Perl is in danger of becoming a fading language--new programmers are learning Java and Python in college, and companies like Google hardly use Perl at all. If you are afraid that Perl may be in danger of becoming irrelevant for medium-to-large projects, then read on.

The Scary Part

I have discussed the future of Perl with managers from companies that currently use it and find that they worry about the future of Perl. One company I spoke with here in San Francisco is rewriting their core application in Java. Another company worries they will not be able to find new Perl programmers down the road. Yet another uses Perl for major projects, but suffers from difficulty in refactoring their extensive code base.

There are many reasons why companies care about the future of Perl. I offer a part of a solution: better tools for Perl can be a major part of keeping Perl relevant and effective as the primary language for medium and large projects.

When measuring the effectiveness of a development environment (people, language, tools, processes, etc.), a key measure is how expensive and painful it is to make changes to existing code. Once a project or system has grown to thousands of lines of code in dozens (or hundreds) of modules, the cost of making changes can escalate to the point where the team is afraid to make any significant change. Excellent tools are one of the ways to avoid this unhappy situation, or at least reduce its impact. Other factors are excellent processes and, of course, excellent people.

21st-Century Integrated Development Environments for Perl

I propose that more high-quality development tools will help keep Perl relevant and alive in medium and large project environments. My focus in this article is on IDEs, or Integrated Development Environments, and primarily those with a graphical interface.

An IDE is an integrated set of tools for programming, combining a source code editor with a variety of other tools into a single package. Common features of modern IDEs include refactoring support, version control, real-time syntax checking, and auto-completion of code while typing.

I want to make it clear right at the outset that a team of highly skilled Perl programmers, using only tools that have been around for years (such as emacs, vi, cvs, and make) can and do build large, sophisticated, and successful projects. I am not worried about those programmers. I am worried about the larger population of programmers with one to five years of experience, and those who have not yet begun to program: the next generation of Perl programmers.

Great tools will not make a bad programmer into a good programmer, but they will certainly make a good programmer better. Unfortunately, the tools for Perl are years behind what is available for other languages, particularly Java.

One powerful example is the lack of graphical IDEs for Perl with excellent support for refactoring. Several IDEs for Java have extensive refactoring support. Only one for Perl, the EPIC plugin for Eclipse, supports even a single refactoring action.

For an example of how good IDEs have inspired at least one Perl developer, see Adam Kennedy's Perl.com article on his new PPI module and Scott Sotka's Devel::Refactor module (used in EPIC).

I acknowledge that a graphical IDE is not the be-all of good tools. Just as some writers reject word processors in favor of typewriters or hand-written manuscripts, some programmers reject graphical IDEs and would refuse a job that required them to use one. Not everyone has (nor should have) the same tool set, and there are things a pencil can do that vi and emacs will never do. That said, IDEs have wide use in businesses doing larger projects, and for many programmers and teams they provide major increases in productivity.

Another important point is that while this article discusses over a dozen specific tools or features, having all the tools in a single package produces the biggest value. An IDE that provides all of these features in a single package that people can easily install, easily extend, and easily maintain across an entire development team has far more value than the sum of its parts.

There is a big win when the features provided by an IDE immediately upon installation include all or almost all of the tools and features discussed here and where the features "know" about each other. For example, it is good if you enter the name of a non-existent subroutine and the real-time syntax checker catches this. It is much better if the code-assist feature then pops up a context menu offering to create a stub for the subroutine or to correct the name to that of an existing similar subroutine or method from another class that is available to the current file. (This is standard behavior for some Java IDEs.)

What Would a 21st-Century Perl Tool Set Contain?

Perl needs a few great IDEs--not just one, but more than one so that people have a diverse set to choose from. Perl deserves and needs a few great IDEs that lead the pack and set the standard for IDEs in other languages.

I am well aware that the dynamic nature of Perl makes it harder to have a program that can read and understand a Perl program, especially a large and complex one, but the difficulty in comprehending a Perl program makes the value of such a tool all the greater, and I have faith that the Perl community can overcome some of the built-in challenges of Perl. Indeed, it is among the greatest strengths of Perl that Perl users can adapt the language to their needs.

A great Perl IDE will contain at least the following, plus other features I haven't thought of. (And, of course, there must be many of those!)

Most of the screen shot examples in this article use the EPIC Perl IDE. At present, it has the largest number of the features on my list (although it certainly doesn't have all of them).

Syntax-Coloring Text Editor

Most of you have probably seen this. It is available under vim, emacs, BBEdit, and TextPad. Just about every decent text editor will colorize source code so that keywords, operators, variables, etc., each have their own color, making it easier to spot syntax errors such as forgetting to close a quote pair.

Real-Time Syntax Checking

Figure 1. Real-time syntax checking

The IDE in Figure 1 shows that line 4 has an error because of the missing ) and that line 5 has an error because there is no declaration of $naame (and use strict is in effect).

A key point here is that the IDE shows these errors right away, before you save and compile the code. (In this example, the EPIC IDE lets you specify how often to run the syntax check, from 0.01 to 10.00 seconds of idle time, or only on demand.)
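
An on-demand check of this kind can be approximated with nothing but the perl binary. What follows is a minimal sketch, not how EPIC does it: check_syntax is a hypothetical helper name, and the sketch assumes a Unix-like shell for the 2>&1 redirection. Note that perl -c still runs BEGIN blocks, so it is not safe on untrusted code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write the editor buffer to a temporary file and
# compile it with "perl -c", which checks syntax without running the program.
sub check_syntax {
    my ($source) = @_;
    my ($fh, $filename) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $source;
    close $fh or die "Cannot close $filename: $!";

    # $^X is the path of the running perl; 2>&1 captures the diagnostics.
    my $messages = `"$^X" -c "$filename" 2>&1`;
    my $ok = ($? == 0) ? 1 : 0;
    return ($ok, $messages);
}

# A buffer with a misspelled variable, similar to the one in Figure 1.
my $buffer = <<'END_CODE';
use strict;
my $name = 'World';
print "Hello, $naame\n";
END_CODE

my ($ok, $messages) = check_syntax($buffer);
print $ok ? "No errors\n" : "Errors found:\n$messages";
```

An editor would call something like this after a period of idle time and map the line numbers in the messages back to markers in the margin.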

As nice as this is, it would be even better if the IDE also offered ways to fix the problem, for example, offering to change $naame to $name. Figure 2 shows an IDE that does exactly that; unfortunately, for Java, not Perl.

Figure 2. Syntax help from the IDE

It would be great if Perl IDEs offered this kind of help.
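
One plausible way to produce such a suggestion is to compare the unknown name against the names already declared and offer the closest one. The sketch below uses a plain edit-distance calculation; suggest_variable, the list of declared names, and the threshold of two edits are all assumptions made for illustration, not a description of any existing IDE.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Classic dynamic-programming edit distance between two strings.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical helper: return the declared name closest to the unknown one,
# or undef when nothing is within two edits.
sub suggest_variable {
    my ($unknown, @declared) = @_;
    my ($best) = sort { edit_distance($unknown, $a) <=> edit_distance($unknown, $b) } @declared;
    return undef unless defined $best;
    return edit_distance($unknown, $best) <= 2 ? $best : undef;
}

my $suggestion = suggest_variable('$naame', '$name', '$age');
print "Did you mean $suggestion?\n" if defined $suggestion;
```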

Version Control Integration

All non-insane large projects use version control software. The most common version control software systems are probably CVS, Perforce, Subversion, and Visual SourceSafe. Figure 3 shows an IDE comparing the local version of a file to an older version from the CVS repository.

Figure 3. Comparing a local file to an older version in CVS

CVS integration is available in many modern code editors, including emacs, vim, and BBEdit, as well as graphical IDEs such as Eclipse and Komodo Pro. Subversion integration is available as a plugin for Eclipse; Komodo Pro supports Perforce and Subversion.

A Code-Assist Editor

Suppose that you have just typed in an object reference and want to call a method on the object, but you are not sure what the method name is. Wouldn't it be nice if the editor popped up a menu listing all of the methods available for that object? It might look something like Figure 4.

Figure 4. Automatic code completion

In this example, the IDE is able to figure out which class the object $q is an instance of and lists the names of the available methods. If you type a p, then the list shows only the method names beginning with p. If you type pa, then the list shows only the param() and parse_params() methods.
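
For objects whose class is already loaded, Perl itself can supply the raw material for such a menu. Here is a rough sketch that walks the symbol table of a class and its parents; methods_of is a hypothetical helper and My::Query is a stand-in class invented for the example. A real editor would also have to work out the class of $q without running the code, which is the hard part.

```perl
use strict;
use warnings;

# Hypothetical helper: list the subroutine names visible through a class
# and its @ISA ancestors, optionally filtered by what the user has typed.
sub methods_of {
    my ($class, $prefix) = @_;
    $prefix = '' unless defined $prefix;

    my (%seen, %methods);
    my @queue = ($class);
    while (defined(my $pkg = shift @queue)) {
        next if $seen{$pkg}++;
        no strict 'refs';
        for my $name (keys %{"${pkg}::"}) {
            $methods{$name} = 1 if defined &{"${pkg}::$name"};
        }
        push @queue, @{"${pkg}::ISA"};
    }

    my @matches = sort grep { index($_, $prefix) == 0 } keys %methods;
    return @matches;
}

# A stand-in class, made up for this example.
package My::Query;
sub new          { return bless {}, shift }
sub header       { }
sub param        { }
sub parse_params { }
sub print_header { }

package main;

my $q = My::Query->new;
print join(', ', methods_of(ref($q), 'pa')), "\n";
```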

Excellent Refactoring Support

The easier it is to do refactoring, the more often people will do it. The following list contains the most common refactorings. Your personal list will probably be a little different. All of these are things you can do "manually," but the idea is to make them into one- or two-click operations so that you will do them much more often. (For an extensive list of refactoring operations, see Martin Fowler's alphabetical list of refactorings.)

Extract Subroutine/Method

The IDE should create a new subroutine using the selected code and replace the selected code with a call to the new subroutine, with the proper parameters. Here's an example of using the Extract Subroutine refactoring from Eclipse/EPIC (which uses the Devel::Refactor module).

First, you select a chunk of code to turn into a new subroutine, and then select Extract Subroutine from a context menu. You then get a dialog box asking for the name of the new subroutine (shown in Figure 5).

Figure 5. Code before Extract Subroutine refactoring

The IDE replaces the selected code with a call to the new subroutine, making reasonable guesses about the parameters and return values (Figure 6). You may need to clean up the result manually.

Figure 6. Code after Extract Subroutine

Figure 7 shows the new subroutine created by the IDE. In this case, it needs no changes, but sometimes you will need to adjust the parameters and/or return value(s).

Figure 7. The new subroutine created by Extract Subroutine

Ideally, the editor should prompt you to replace similar chunks of code with calls to the new subroutine.
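
For readers without the screen shots, here is a small before-and-after illustration of the same refactoring. The code is invented for this example (it is not the code shown in Figures 5 through 7), and the 8% tax rate is an arbitrary assumption.

```perl
use strict;
use warnings;

# Before: the arithmetic is written inline inside the reporting code.
sub invoice_before {
    my @prices = @_;
    my $subtotal = 0;
    $subtotal += $_ for @prices;
    my $total = $subtotal * 1.08;    # hypothetical 8% sales tax
    return sprintf('Total due: %.2f', $total);
}

# After: the selected lines live in their own subroutine...
sub total_with_tax {
    my @prices = @_;
    my $subtotal = 0;
    $subtotal += $_ for @prices;
    return $subtotal * 1.08;
}

# ...and the original location holds a call that passes in the variables
# the selection depended on and receives the value it produced.
sub invoice_after {
    my @prices = @_;
    my $total = total_with_tax(@prices);
    return sprintf('Total due: %.2f', $total);
}

print invoice_before(10, 20, 5.50), "\n";
print invoice_after(10, 20, 5.50), "\n";
```

Both calls print the same line, which is the point: the refactoring changes the structure of the code, not its behavior.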

Rename Subroutine/Method

The IDE should find all the calls to the subroutine throughout your project and offer to change them for you. You should be able to see a preview of all of the places a change could occur, and to accept or reject each one on a case-by-case basis. The action should be undoable.
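
The preview step can be sketched with a simple text search. This is a deliberately naive illustration: find_occurrences and rename_in_text are hypothetical helpers, the project is represented as an in-memory hash of file names to source text, and the regular expression only skips names that carry a sigil. It will be fooled by strings, comments, and symbolic calls, which is exactly why a real implementation would want a proper parser.

```perl
use strict;
use warnings;

# Hypothetical helper: list every line that mentions the subroutine name
# as a bare word (not as part of $name, @name, %name, or a longer word).
sub find_occurrences {
    my ($name, %sources) = @_;
    my @hits;
    for my $file (sort keys %sources) {
        my $line_no = 0;
        for my $line (split /\n/, $sources{$file}) {
            $line_no++;
            push @hits, { file => $file, line => $line_no, text => $line }
                if $line =~ /(?<![\$\@%\w])\Q$name\E(?!\w)/;
        }
    }
    return @hits;
}

# Hypothetical helper: apply the rename to one piece of text.
sub rename_in_text {
    my ($text, $old, $new) = @_;
    $text =~ s/(?<![\$\@%\w])\Q$old\E(?!\w)/$new/g;
    return $text;
}

# A made-up two-file project.
my %project = (
    'lib/Greeter.pm' => <<'END_PM',
sub greet {
    return 'hi';
}
END_PM
    'bin/app.pl' => <<'END_PL',
my $greet = 1;
print greet(), "\n";
END_PL
);

# The preview: one line per place the rename would touch.
for my $hit (find_occurrences('greet', %project)) {
    print "$hit->{file}:$hit->{line}: $hit->{text}\n";
}
```

Accepting or rejecting each hit individually, and keeping the original text around for undo, would sit on top of a list like this one.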

Rename Variable

Like Rename Subroutine, this feature should find all occurrences throughout the project and offer to make the changes for you.

Change Subroutine/Method Signature

The IDE should be able to make reasonable guesses about whether each subroutine or method call is supplying the proper parameters. Partly this is to enable the real-time syntax checking mentioned above, and partly this is to enable you to select a subroutine declaration and tell the IDE you want to refactor it by adding or removing a parameter. The IDE should then prompt you for the change(s) you want to make, do its best to find all of the existing calls to the subroutine, and offer to correct the subroutine calls to supply the new parameters.

Obviously, this is an especially tricky thing to do in Perl, where subroutines fish their parameters out of @_. So the IDE would have to look carefully at how the code uses shift, @_, and $_[] in order to have a reasonable guess about the parameters the subroutine is expecting. In many common cases, though, a Perl IDE could make a reasonable guess about the parameters, such as in the following two examples, so that if you added or removed one, it could immediately prompt you about making corrections throughout the project:

sub doSomething {
    my $gender = shift;
    my $age    = shift;
    # Not too terribly hard to guess that $gender and $age are params
}

sub anotherThing {
    my ($speed,$direction) = @_;
    # No magic needed to guess $speed and $direction are params.
}
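
To show that guesses like these are within reach, here is a rough sketch that recognizes only the two idioms above. guess_params is a hypothetical helper, and the regular expressions are assumptions that will miss anything more creative, such as direct use of $_[0].

```perl
use strict;
use warnings;

# Hypothetical helper: guess parameter names from the body of a subroutine,
# in the order in which the body unpacks them.
sub guess_params {
    my ($body) = @_;
    my @params;
    for my $line (split /\n/, $body) {
        if ($line =~ /\bmy\s+(\$\w+)\s*=\s*shift\b/) {
            # my $x = shift;
            push @params, $1;
        }
        elsif ($line =~ /\bmy\s*\(([^)]*)\)\s*=\s*\@_/) {
            # my ($x, $y) = @_;
            push @params, $1 =~ /([\$\@%]\w+)/g;
        }
    }
    return @params;
}

my $do_something = <<'END_BODY';
    my $gender = shift;
    my $age    = shift;
END_BODY

my $another_thing = <<'END_BODY';
    my ($speed,$direction) = @_;
END_BODY

print join(', ', guess_params($do_something)),  "\n";
print join(', ', guess_params($another_thing)), "\n";
```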

Move Subroutine/Method

This refactoring operation should give you a list or dialog box to choose the destination file in your project. The IDE should allow you to preview all of the changes that it would make to accomplish the move, which will include updating a call to the subroutine/method to use the proper class. At a minimum, the IDE should show you or list all of the calls to the subroutine so you can make the appropriate changes yourself. Ideally, the IDE should make a guess about possible destinations; for example, if $self is a parameter to the method being moved, then the IDE might try assuming the method is an object (instance) method and initially only list destination classes that inherit from the source class, or from which the source class inherits.

Change a Package Name

As with Rename Subroutine and Rename Variable, when changing a package name, the IDE should offer to update all existing references throughout your project.

Tree View and Navigation of Source Files and Resources

Another useful feature of good IDEs is being able to view all of the code for a project, or multiple projects, in a tree format, where you can "fold" and "unfold" the contents of folders. All of the modern graphical IDEs support this, even with multiple projects in different languages.

Being able to view your project in this manner gives you both a high-level overview and the ability to drill down into specific files, and to mix levels of detail by having some folders show their contents and some not.

For example, Figure 8 shows a partial screen shot from ActiveState's Komodo IDE.

Figure 8. Tree view of code in Komodo

Support for Creating and Running Unit Tests

Anyone who has installed Perl modules from CPAN has seen unit tests--these are the various, often copious, tests that run when you execute the make test part of the installation process. The vast majority of CPAN modules include a suite of tests, often using the Test::Harness and/or Test::More modules. A good IDE will make it very easy to both create and run unit tests as you develop your project.

The most basic form of support for unit tests in an IDE is simply to make it easy to execute arbitrary scripts from within the IDE. Create a test.pl for your project and keep adding tests to it or to a t/ subdirectory as you develop, and keep running the script as you make changes. All modern IDEs provide at least this minimal capability.
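
For anyone who has not written one, a test script of this kind can be very small. The sketch below uses Test::More; full_name is a made-up function that stands in for whatever your project actually provides, and in a real project it would live in a module that the test loads.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical function under test, defined inline to keep the example
# self-contained.
sub full_name {
    my ($first, $last) = @_;
    return join ' ', grep { defined($_) && length($_) } $first, $last;
}

is( full_name('Larry', 'Wall'), 'Larry Wall', 'joins first and last name' );
is( full_name('Larry'),         'Larry',      'copes with a missing last name' );
is( full_name(),                '',           'returns an empty string with no arguments' );
```

Saved as a file under t/ and run from the IDE, this prints one ok or not ok line per test.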

A more sophisticated level of support for unit tests might resemble the Java IDE feature for tests written in JUnit, where you can select an existing class file (a .pm file in Perl) and ask the IDE to create a set of stub tests for every subroutine in the file. (See JUnit and the Perl module Test::Unit for more on unit tests.) Furthermore, the IDE should support running a set of tests and giving simple visual feedback on what passed/failed. The standard approach in the JUnit world is to show either a "green bar" (all passed) or "red bar" (something failed) and then allow you to see details on failures. Other nice-to-have features include calculating code-coverage, providing statistical summaries of tests, etc.

Figure 9 shows a successful run of a Java test suite with Eclipse.

Figure 9. A successful JUnit test run

Figure 10 shows the same test run, this time with a failure.

Figure 10. A JUnit test run with a failure

A stack trace of the failure message appears in another part of the window (cropped out here to save space). If you double-click on the test that failed (testInflate), the IDE will open the file (BalloonTest, in this case) and navigate to the test function.

The central idea is that the IDE should make it as painless as possible to add and modify and run tests, so you will do more of it during development.

Language-Specific Help

This is a fairly straightforward idea--the IDE should be able to find and display the appropriate documentation for any keyword in your code, so if you highlight push and ask for help, you should see the push entry from the Perl documentation. If you highlight a method or subroutine or other symbol name from an imported module, the IDE should display the module's documentation for the selected item. Of course, this requires that the documentation be available in a consistent, machine-readable form, which is only sometimes true.

Debugger with Real-Time Display of Results

All modern IDEs offer support for running your code under a debugger, usually with visual display of what's going on, including the state of variables. The Komodo IDE supports debugging Perl that is running either locally or remotely.

Typical support for debugging in an IDE includes the ability to set breakpoints, monitor the state of variables, etc. Basically, the IDE should provide support for all of the features of the debugger itself. Graphical IDEs should provide a visual display of what is going on.

Automatic Code Reformatting

This means automatically or on-demand re-indenting and other reformatting of code. For example, when you cut and paste a chunk of code, the IDE should support reformatting the chunk to match the indentation of its new location. If you change the number of spaces or tabs for each level of indentation, or your convention for the placement of curly braces, then the IDE should support adjusting an entire file or all files in your project.
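
The cut-and-paste case is simple enough to sketch. The following is only an illustration under some assumptions: reindent is a hypothetical helper, indentation is assumed to use spaces rather than tabs, and the relative indentation inside the chunk is preserved as it was.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: strip the indentation shared by every non-blank line
# of a pasted chunk and replace it with the indentation of the new location.
sub reindent {
    my ($chunk, $new_indent) = @_;
    my @lines = split /\n/, $chunk;

    # Find the smallest leading-space count among the non-blank lines.
    my @widths = map { /^( *)/; length($1) } grep { /\S/ } @lines;
    my $common = @widths ? min(@widths) : 0;

    my @out = map { /\S/ ? $new_indent . substr($_, $common) : '' } @lines;
    return join "\n", @out;
}

my $pasted = "        if (\$ready) {\n            launch();\n        }";
print reindent($pasted, '    '), "\n";
```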

Seamless Handling of Multiple Languages

Many large software projects involve multiple languages. This is almost universally true in the case of web applications, where the user interface typically uses HTML, CSS, and JavaScript, and the back end uses one or more of Perl, PHP, Java, Python, Ruby, etc. It is very helpful to have development tools that seamlessly integrate work done in all of the languages. This is becoming quite common. For example, both Komodo and Eclipse support multiple languages.

Automated Building and Testing

This feature can be very basic by making it easy to run an arbitrary script from within the IDE and to see its output. This could be as simple as having the IDE configured to have a one-click way of running the traditional Perl module build-and-test commands:

$ perl Makefile.PL
$ make
$ make test

A more advanced version of this feature might involve having the IDE create stub code to test all of the subroutines in an existing file, or to run all of the scripts in a specified directory under Test::Harness, or to run a set of tests using Test::Unit::TestRunner or Test::Unit::TkTestRunner. (The latter provides a GUI testing framework.)
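
Running every script in a directory under Test::Harness takes only a few lines. This is a minimal sketch: find_test_scripts is a hypothetical helper, and the t directory name follows the usual CPAN convention rather than anything an IDE requires.

```perl
use strict;
use warnings;
use Test::Harness qw(runtests);
use File::Spec;

# Hypothetical helper: collect the *.t files in a directory, sorted by name.
sub find_test_scripts {
    my ($dir) = @_;
    my @scripts = glob(File::Spec->catfile($dir, '*.t'));
    @scripts = sort @scripts;
    return @scripts;
}

my @scripts = find_test_scripts('t');
if (@scripts) {
    # runtests prints a summary and dies if any test script fails.
    runtests(@scripts);
}
else {
    print "No test scripts found in t/\n";
}
```

An IDE could bind a script like this to a single button and turn the summary into a green or red bar.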

Conclusion and Recommendations

While there are many tools for helping Perl development, the current state of the Perl toolbox is still years behind that of other languages--perhaps three to five years behind, when compared to Java tools. While there are several tools for Java that have all the features described above, virtually none for Perl have all of them. On the other hand, things are looking up; they are better now than a year ago. It's possible to close that gap in a year or two.

A couple of obvious areas where improvements could be somewhat easy are adding more features to EPIC and Komodo. EPIC is open source, so there is potentially a wider pool of talent that could contribute. On the other hand, Komodo has a company with money behind it, so people actually get paid to improve it. Hopefully both tools will get better with time.

Another interesting possibility is the development of new IDEs or adding to existing ones by using Adam Kennedy's PPI module, which provides the ability to parse Perl documents into a reasonable abstract syntax tree and to manipulate the elements and re-compose the document. There is a new Perl editor project, VIP, that is in the design stages and is intended to be "pluggable" and to have special features to support pair programming.

Finally, I've gathered a couple of lists of links for related material. The first list below consists of IDEs and graphical editors for Perl, and the second list consists of various related articles and websites. I hope this is all inspirational and helpful.

Current IDEs for Perl

The listed IDEs support Perl. The list is undoubtedly incomplete, but should form a good starting point for anyone wishing to look into this further.

  • Affrus

    Perl only, Mac OS X only. Closed source (and hence not extensible by users). Primarily designed for CGI and standalone scripts. Free demo available. $99 to purchase. (See the Perl.com review of Affrus to learn more.)

  • Eclipse/EPIC

    EPIC is a plugin for the Eclipse platform. Eclipse is open-source and cross platform (Windows/Mac/Linux/Solaris, etc.). Once you have Eclipse installed, install the EPIC plugin from within the Eclipse application using the EPIC update URL. Eclipse supports Java, and with plugins, C/C++, COBOL, Perl, PHP, UML2, Python, Ruby, XML, and more. There is a large and active community around Eclipse.

  • Emacs

    Emacs is the mother of all text-editor/development-environment/adventure-game/all-in-one tools. Expert programmers use it widely and there are numerous enhancements for working with particular languages, including, of course, Perl. Emacs, with CPerlMode, is a richly featured IDE for Perl, albeit a non-GUI IDE (which, for some people, makes it even better). A set of extensions for CPerlMode is available, but you need to join the Yahoo Extreme Perl group to get to them.

  • Komodo

    This runs on Linux, Solaris, and Windows. Free demo; $29.95 for personal and student use, $295 for commercial use. It supports Perl, PHP, Python, Tcl, and XSLT.

  • PAGE

    PAGE runs only on Windows (9x/ME/NT/2000/XP). It is a Rapid Application Development tool for Perl and comes in three versions: Free, Standard ($10), and Enterprise ($50). PAGE provides several "wizards" for creating scripts, modules (packages), web forms, and even database applications.

  • Perl Editor

    This closed source program runs only on Windows (9x/NT/2000/XP). It has a GUI code profiler, and the Pro version has a regular expression tester and built-in web server (for CGI testing, etc.). Perl Editor claims to have the best debugger on the market. It also comes with GUI tools for managing MySQL databases. $69.95 to purchase.

  • vim

    The well-known descendant of vi is a powerful and flexible text editor with many plugins and extensions. Have a look at the vim scripts; for example, vim.sourceforge.net/scripts/script.php?script_id=556 and vim.sourceforge.net/scripts/script.php?script_id=281.

  • visiPerl

    This is a closed source application that runs on Win9x/NT/2000. It handles Perl and HTML and has code templates, being designed for website building. visiPerl includes a built-in web server for testing and an FTP client for code deployment. There is a free demo, or you can purchase it for $59.

Related Topics

This Week in Perl 6, Through August 14, 2005


As you will note from the date in the title, it's been a short week. We're switching back to a midnight Sunday/Monday rollover in order to make life easier for the Perl.com types. So, if I can avoid being distracted too much by the second Ashes test, I'll try to get the summary finished before Monday is out, which should please chromatic.

This Week in perl6-compiler

Another low-volume week in perl6-compiler; probably because, with the high speed of Pugs development, most of the discussion happens on IRC.

Container Model, Pictures, and Questions

Autrijus fielded some questions about, and updated the pictures of, the container model.

Why PXPerl?

Robert (No Surname) asked what were the benefits of PXPerl over the ActiveState distribution. In short, PXPerl comes with Parrot and Pugs, which ActiveState doesn't. If you set your path appropriately, you can continue to use the ActiveState Perl and just rely on PXPerl for Parrot and Pugs.

Hoisting Lexical Declarations

Larry answered some of Autrijus's questions about Perl 6's lexical scoping rules. Apparently what Pugs currently does is close enough to sane to be going on with.

Warnock in Pugsland

Autrijus noted that, in Pugsland, a Warnocked patch usually means that the person who posted the patch simply received a committer bit and didn't mention the fact on the list.

Metamodel Notes

Nathan Gray posted some notes and ASCII art about the metamodel. Autrijus added pointers to further pictures.

Meanwhile, in perl6-internals

Updated intro.pod

Jonathan Worthington posted a rewrite of Parrot's intro.pod document, now with a discussion of PIR. Huzzah!

Test::Builder and Friends on Parrot

Following prompting from Geoff Young and Jeff Horwitz, chromatic has implemented Test::Builder and Test::Builder::Tester in pure Parrot. For his next trick, he intends to port Test::More and Parrot::Test.

Tests are good, m'kay?

How to Add a New Opcode?

Gerd Pokorra asked how to add an opcode to Parrot. Klaas-Jan Stol and Leo gave the answers.

Cleaning Up the Call Opcodes

Leo reposted about cleaning up the various function-calling opcodes to take account of the fact that the calling conventions have changed. He asked for opinions and actually received a couple, which is handy, since he ended up Warnocked last time.

parrot -I

Amir Karger wondered if there was some way of telling Parrot to add directories to its load path. Leo seemed to think it was not that good an idea, and proposed using a relative path in a .include directive.

Dominance Frontiers

Curtis Rawls continued his work on dominance frontiers to improve Parrot's optimizer.

PGE Globber, Empty Strings

Will Coleda reported on trying to match empty strings with PGE's glob implementation. It turned out to be a problem with Data::Escape. Leo fixed it.

Deprecated Opcodes

Leo posted a list of opcodes that are due for the chop (or alteration) soon. If you're doing anything with Parrot, it's probably a good idea to take a look at this list. One of those who did was chromatic, who asked if Leo could give some examples of translating code so as not to use the old forms.

Meanwhile, in perl6-language

Hmm. Eight balls to go with one wicket needed. I think I'll pause for a while.

Damn. Australia have saved the game.

Translating (Or at Least Parsing) Java Interface Definitions

Tim Bunce wondered if anyone had done any work on parsing Java interface declarations and (ideally) translating them to roughly equivalent Perl 6. Apparently, Gaal Yahas has done something along these lines (with Parse::RecDescent for Perl 5), but doesn't own the code. He outlined the approach he took.

Perl 6 Meta Object Protocols and $object.meta.isa(?)

Stevan Little is busy documenting the Perl 6 metamodel that he's implemented in Perl 5 and that Autrijus is busy porting to Haskell. He posted an overview to the list and asked for comment. There then followed lots of discussion. I think I understood some of it.

$object.meta.isa(?) Redux

Stevan split the discussion of $object.meta.isa(?) off from the earlier metamodel thread into a thread of its own and asked for comments once more. Larry commented that "the Apocalypses are primarily intended to be entertaining rather than factual." Also in this thread, Luke let slip that there's now a Set role in Perl 6, which has the enormous advantage of letting us specify argument types in a sensible way without having to overload the junctions.

$obj.meta.add_method('foo' => ???)

Stevan continued discussing the metamodel with a thread about the add_method method. Autrijus was the only person with comments.

Proposed New Traits

Autrijus said that he'd started to write the inferencer and had immediately run into the problem that every type can potentially contain undef. He proposed adding an is defined trait, which would cause a variable to immediately throw an exception if anyone attempted to assign it an undefined value. He also proposed a typed trait, but I was a little less clear on why this would be a good idea. I have to confess that I didn't understand what Larry's reply was driving at, but that's okay, because Autrijus did seem to understand it.

my $pi is constant = 3

Autrijus wondered if an example of the is constant trait shown in Synopsis 6 was a special form or a typo. At least, I think that's what he was asking; I may be wearing my stupid head today, though. Larry thought it was neither. I think. It seems there's more to constancy than meets the eye. (Just ask any married couple.)

Typed Type Variables (my Foo ::x)

Stuart Cook asked about the meaning of type annotations on type variables. Autrijus answered and Thomas Sandlaß agreed with him.

BEGIN {...} and IO

Nicholas Clark commented on an earlier discussion of using IO in BEGIN blocks, pointing out that this was just a specific case of the more general problem of attempting to serialize things to bytecodes that were simply unserializable. I reckon the trick of it will be to do such things in INIT or possibly CHECK blocks (I can never remember which way round those two go).

Generic Classes

Autrijus asked about generic classes, but nobody answered before the end of the summary week. Expect Matt to address this one in the next summary.

Acknowledgements, Adverts, Apologies, and Alliteration

I'm sorry to have to say this, but I don't think I have to apologize for anything this week. WorldCon was fun.

Everything Else

Help Chip!

If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.

Or, you can check out my website. Maybe now I'm back writing stuff I'll start updating it. There are also vaguely pretty photos by me.

Parsing iCal Data


One of the attributes of a killer application is that it does something cool: it allows you to view or organize information in new and interesting ways, saves you time, or helps you win that auction bid. Yet one of the fundamental aggravations of applications in general is that they don't always work well together; typically, you cannot send your data to mixed-and-matched applications unless they were explicitly designed to allow this. One of the great strengths of a language such as Perl is its ability to overcome these differences and act as "glue." As long as you can figure out what the incoming data looks like, and how the outgoing data should look, it is very simple to share data between previously incompatible applications. By simply building a parser between the applications and creating input files for the target application from the former's data, you extend the usefulness of your tools. In a sense, you can create killer applications out of various mundane tools on your system.

A somewhat trivial example of this sort of creation is an application that converts iCalendar data into a directed graph, readable by an application such as GraphViz. This example seems so trivial that you might ask yourself why you would wish to do such a thing. The answer is perhaps equally trivial: aside from the challenge factor, the ability to convert data could provide an alternative (or complement) to Gantt charts in project documentation, map relationships between events, etc. Moreover, by providing a simple way to allow disparate applications to interoperate, you can cumulatively build suites of applications, hopefully allowing for unforeseen advantages in the future.

Returning to the example, say you would like to take an iCal calendar (Figure 1) and turn it into an interesting visualization (Figure 2). How would you do this? Such an ability to convert formats is one step in constructing that killer application.

Figure 1. An iCal calendar

Figure 2. An alternate visualization of the calendar data

Reading the iCalendar Format

RFC 2445 defines the iCalendar format, which Apple's iCal application uses. Each iCalendar file represents an individual calendar and contains at least one block of event data in key:value tuples, starting with a BEGIN:VEVENT tuple and ending with END:VEVENT. Here is an example (with indentation added for readability) of a small iCalendar file containing two events:

BEGIN:VCALENDAR
        CALSCALE:GREGORIAN
        PRODID:-//Apple Computer\, Inc//iCal 2.0//EN
        VERSION:2.0
        BEGIN:VEVENT
                LOCATION:San Francisco
                DTSTAMP:20050618T151130Z
                UID:BDF17182-CA21-4752-8D4F-40A4FE47C90D
                SEQUENCE:8
                URL;VALUE=URI:http://developer.apple.com/wwdc/
                DTSTART;VALUE=DATE:20050606
                SUMMARY:Apple WWDC
                DTEND;VALUE=DATE:20050612
                DESCRIPTION:Lots of sessions.
        END:VEVENT
        BEGIN:VEVENT
                DURATION:PT1H
                LOCATION:Home
                DTSTAMP:20050618T151543Z
                UID:5F88A0EC-AD21-428E-AAAD-005F1B1AB72E
                SEQUENCE:6
                DTSTART;TZID=America/Chicago:20050615T180000
                SUMMARY:Set up File Server
                DESCRIPTION:Music server for the kids.
        END:VEVENT
END:VCALENDAR

There are several possible approaches to parsing the above data in Perl, but perhaps the easiest one is to create a hash of events, modeled after the iCalendar structure. With this approach, a single calendar becomes a hash of hashes with a key:value pair for each event, where the key is the event ID and the value is a hash containing the event data. While it would be just as easy to store the data as an array of hashes, the ability to pull an event by its ID allows greater flexibility and power to manipulate the data. The data for a single event might look like this:

Calendar->EventUID = { 'UID'         => EventUID,
                       'LOCATION'    => EventLocation,
                       'START'       => EventStart,
                       'END'         => EventEnd,
                       'DURATION'    => EventDuration,
                       'DTSTAMP'     => EventDatestamp,
                       'SEQUENCE'    => EventSequence,
                       'SUMMARY'     => EventSummary,
                       'DESCRIPTION' => EventDescription,
                       'URL'         => EventURL };

Note that these keys represent only a subset of all possibilities as defined in RFC 2445. Not every event contains all of the above keys. For example, the first event in my example does not contain DURATION. Further, certain keys (such as SEQUENCE) may be irrelevant for your purposes.
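As a concrete illustration of the hash-of-hashes layout, here is a minimal sketch that fills in the two sample events by hand and then pulls one back out by its UID. The variable name $calendar and the subset of keys shown are illustrative choices only.

```perl
use strict;
use warnings;

# A calendar is a hash keyed by event UID; each value is a hash of event data.
# Only a few keys are filled in; the values come from the sample file above.
my $calendar = {
    'BDF17182-CA21-4752-8D4F-40A4FE47C90D' => {
        'UID'      => 'BDF17182-CA21-4752-8D4F-40A4FE47C90D',
        'LOCATION' => 'San Francisco',
        'START'    => '20050606',
        'END'      => '20050612',
        'SUMMARY'  => 'Apple WWDC',
    },
    '5F88A0EC-AD21-428E-AAAD-005F1B1AB72E' => {
        'UID'      => '5F88A0EC-AD21-428E-AAAD-005F1B1AB72E',
        'LOCATION' => 'Home',
        'START'    => '20050615T180000',
        'DURATION' => 'PT1H',
        'SUMMARY'  => 'Set up File Server',
    },
};

# Pull a single event by its ID.
my $event = $calendar->{'5F88A0EC-AD21-428E-AAAD-005F1B1AB72E'};
print "$event->{'SUMMARY'} at $event->{'LOCATION'}\n";

# Keys that an event lacks simply do not exist, so test before using them.
for my $uid ( sort keys %$calendar ) {
    my $e   = $calendar->{$uid};
    my $end = exists $e->{'END'} ? $e->{'END'} : '(no end date)';
    print "$e->{'SUMMARY'}: $e->{'START'} .. $end\n";
}
```

Keying on the UID is what makes the direct lookup possible; an array of hashes would need a search loop to find the same event.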

With the data structure designed, what's the right way to convert iCalendar data into such a structure? In keeping with the Perl mantra that there is more than one way to do it, many approaches would work, but perhaps the easiest is to match key names, starting a new event block when the parser sees BEGIN:VEVENT and ending it when END:VEVENT appears. Given the large number of possible keys, it may be easiest to use switch-like behavior. Here is an example of how to do this, splitting each key:value pair on the colon character (a semicolon precedes any modifiers to the data):

SWITCH: {
        if ( $_ =~ /BEGIN:VEVENT/ ) {
                ##-----------------------------------------
                ## We have a new event, so start fresh.
                ##-----------------------------------------
                $eventHash = {};
                last SWITCH; }


        if ( $_ =~ /END:VEVENT/ ) {
                ##-----------------------------------------
                ## We hit the event end, so store it.
                ##-----------------------------------------
                $calHash->{$eventHash->{'UID'}} =
                     {
                     'UID'         => $eventHash->{'UID'},
                     'LOCATION'    => $eventHash->{'LOCATION'},
                      #...The rest of our keys...
                     'URL'         => $eventHash->{'URL'}
                     };
                last SWITCH; }


          ## We will split the key:value pair into a list
          ## and grab the value (the second element, index 1).
        if ( $_ =~ /^UID/ ) {
                $eventHash->{'UID'} = ( split ( /:/, $_ ) )[1];
                last SWITCH; }


        if ( $_ =~ /^LOCATION/ ) {
                $eventHash->{'LOCATION'} = ( split ( /:/, $_ ) )[1];
                last SWITCH; }

        ## ...The rest of our key matches...

        if ( $_ =~ /^URL/ ) {
                ## A URL contains colons of its own, so limit the split to two fields.
                $eventHash->{'URL'} = ( split ( /:/, $_, 2 ) )[1];
                last SWITCH; }

} # end switch

While this example does a good job of showing how to fill the data structure, it does a poor job of leveraging the power of Perl. More extensive use of regular expressions, the use of one of the Parse modules on CPAN, or even a bit of recursive programming could make this code more elegant and perhaps even a bit faster. However, these tactics may also make the code a bit harder to read--which is not always bad, unless you are attempting to explain concepts in an article. For further ideas, Teodor Zlatanov has written an article on using Perl parsing modules as well as a real mind-bender on using a functional programming approach in Perl.
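One way the regular-expression route might look is sketched below. It replaces the chain of if blocks with a single pattern and a lookup table. The helper name parse_events, the %key_for table, and the pattern itself are assumptions made for illustration; in particular, the pattern does not handle parameter values that contain a quoted colon.

```perl
use strict;
use warnings;

# Map iCalendar property names to the keys used in our event hash.
# This table is an illustrative subset, not a complete list of properties.
my %key_for = (
    UID         => 'UID',         LOCATION => 'LOCATION',
    DTSTART     => 'START',       DTEND    => 'END',
    DURATION    => 'DURATION',    DTSTAMP  => 'DTSTAMP',
    SEQUENCE    => 'SEQUENCE',    SUMMARY  => 'SUMMARY',
    DESCRIPTION => 'DESCRIPTION', URL      => 'URL',
);

# parse_events() is a hypothetical helper: it takes a list of lines and
# returns a hash reference keyed by event UID.
sub parse_events {
    my @lines = @_;
    my ( %calendar, $event );

    for my $line (@lines) {
        $line =~ s/\r?\n\z//;                  # strip the line ending
        if ( $line =~ /^\s*BEGIN:VEVENT\s*$/ ) {
            $event = {};                       # start a fresh event
        }
        elsif ( $line =~ /^\s*END:VEVENT\s*$/ ) {
            $calendar{ $event->{'UID'} } = $event
                if $event && defined $event->{'UID'};
            undef $event;
        }
        elsif ( $event && $line =~ /^\s*([A-Za-z-]+)(?:;[^:]*)?:(.*)$/ ) {
            # $1 is the property name; $2 is everything after the first
            # colon, so values such as URLs keep their own colons.
            my ( $name, $value ) = ( uc($1), $2 );
            $event->{ $key_for{$name} } = $value if exists $key_for{$name};
        }
    }
    return \%calendar;
}

my @sample = (
    "BEGIN:VEVENT\n",
    "UID:BDF17182-CA21-4752-8D4F-40A4FE47C90D\n",
    "URL;VALUE=URI:http://developer.apple.com/wwdc/\n",
    "DTSTART;VALUE=DATE:20050606\n",
    "SUMMARY:Apple WWDC\n",
    "END:VEVENT\n",
);
my $cal = parse_events(@sample);
print $cal->{'BDF17182-CA21-4752-8D4F-40A4FE47C90D'}{'URL'}, "\n";
```

Adding support for another property then means adding one entry to the table rather than another if block.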

The Dot Specification

Dot (PDF) is a diagramming, or directed-graph, language created by Emden Gansner, Eleftherios Koutsofios, and Stephen North at Bell Labs. There are several implementations of Dot, including GraphViz, WebDot, and Grappa. Interestingly, OmniGraffle, a powerful diagramming tool for Macintosh computers, can read simple Dot files.

Creating Dot Files

In Dot's basic syntax, you describe objects by adding statements within the braces of a digraph {} block. You denote relationships between objects with the -> combination of characters. With this code:

digraph my_first_graph {
  object1 -> object2;
}

your Dot-driven application (such as GraphViz) will display an image something like Figure 3.

Figure 3. A simple graph

The specification describes additional complexity in terms of sub-objects/structures, alternate shapes (the default is an oval), ranking, and more. One additional item worth noting is that Dot recognizes comments in C- and Java-style formats (// and /* */). To help troubleshoot problems (and for good coding practice), I suggest that your parser insert comments into the Dot input file.
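To make the syntax concrete from the Perl side, here is a small sketch that builds the Dot source for a graph as a string, including both comment styles and a non-default node shape. The helper name dot_graph and its argument layout are hypothetical, and the sketch assumes node names contain no double quotes.

```perl
use strict;
use warnings;

# dot_graph() is a hypothetical helper that returns Dot source as a string.
# $name is the graph name; @edges is a list of [ from, to ] pairs.
sub dot_graph {
    my ( $name, @edges ) = @_;

    my $dot = "// Generated by a Perl script\n";           # line comment
    $dot   .= "digraph \"$name\" {\n";
    $dot   .= "   /* default shape for every node */\n";   # block comment
    $dot   .= "   node [ shape = box ];\n\n";

    for my $edge (@edges) {
        my ( $from, $to ) = @$edge;
        # Quote node names so that spaces and dashes are safe.
        $dot .= "   \"$from\" -> \"$to\";\n";
    }

    $dot .= "}\n";
    return $dot;
}

print dot_graph( 'my_first_graph', [ 'object1', 'object2' ] );
```

Returning a string rather than printing directly keeps the helper easy to test and lets the caller decide which filehandle receives the output.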

Consider how you might create a Dot file from the data parsed earlier. Suppose the function that writes the Dot file receives a reference to the filehandle of your Dot input file (the output of your conversion) along with a reference to your parsed data structure. You might then generate your Dot file along these lines:

  ##------------------------------
  ## Name our Dot graph 
  ##------------------------------
  if ( $raw->{'CALNAME'} ) {
      print { $$file } 'digraph "'. $raw->{'CALNAME'} ."\" {\n\n";
   } elsif ( $$raw{'CALID'} ) {
      print { $$file } 'digraph "'. $raw->{'CALID'} ."\" {\n\n";
   } else {
      print { $$file } "digraph unnamed {\n\n";
   }


   ##-----------------------------------------
   ## Some optional rendering info
   ##-----------------------------------------
   print { $$file } '   size     = "10,7";'. "\n".
                    '   compound = true;'  . "\n".
                    '   ratio    = fill;'  . "\n".
                    '   rankdir  = LR;'    . "\n\n";


   ##-----------------------------------------
   ## Generate our Dot data
   ##   we will wrap most data in double-quotes 
   ##   since most Dot interpreters don't like spaces, 
   ##   something allowed in iCal data
   ##-----------------------------------------
   foreach my $key ( keys %$raw ) {
      if ( ref( $raw->{$key} ) eq 'HASH' ) {
         my $block = $raw->{$key};

           ##------------------------------
           ## graphViz doesn't like - in names
           ##------------------------------
         $block->{'UID'} =~ s/-/_/g;

           ##------------------------------
           ## produce list of all unique tasks
           ##------------------------------
         push( @{ $tasks->{$block->{'SUMMARY'}} }, '"'. $block->{'UID'} .'"' );

           ##------------------------------
           ## build record
           ##------------------------------
         my $eventBlock = '"'. $block->{'UID'} .
                          '" [ shape = record, label = "'. $block->{'SUMMARY'} .
                           ' | <START> Start | <END> End ';

         if ( $block->{'DESCRIPTION'} ) {
            $eventBlock .= ' | '. $block->{'DESCRIPTION'};
         }
         $eventBlock .= '"];';

         print { $$file } '   '. $eventBlock ."\n\n";


            ##------------------------------
            ## build relations based upon time
            ##------------------------------
         push( @timeLine,    '"'. $block->{'START'} .'"' );
         print { $$file } '   "'. $block->{'UID'} .'":START  -> "'. $block->{'START'} ."\"\;\n\n";

         if ( $$block{'END'} ) {
            push( @timeLine,    '"'. $block->{'END'} .'"' );
            print { $$file } '   "'. $block->{'UID'} .'":END    -> "'. $block->{'END'} ."\"\;\n\n";
         }

         print { $$file } "\n\n";

      }   ## end if HASH
   }      ## end foreach

      ##------------------------------
      ## tie non-unique tasks
      ##------------------------------
    print { $$file } '   // Create tasks relationships'. "\n\n";
    foreach ( keys %$tasks ) {
       if ( @{ $tasks->{$_} } > 1 ) {
          print { $$file } '   '. join( ' -> ', @{ $tasks->{$_} } ) ."\;\n\n";
       }
    }
    print { $$file } "\n\n";


      ##------------------------------
      ## Render our timeline
      ##------------------------------
    print { $$file } '   // Create timeline relationships'. "\n\n";
    print { $$file } '   '. join( ' -> ', sort( @timeLine )) ."\;\n\n";


      ##------------------------------
      ## Close off dot file
      ##------------------------------
    print { $$file } "}\n";

This code will produce the following Dot file:

digraph unnamed {
   size     = "10,7";
   compound = true;
   ratio    = fill;
   rankdir  = LR;

   "5F88A0EC_AD21_428E_AAAD_005F1B1AB72E" [ shape = record, label = "Set up File Server | <START> Start | <END> End  | Music server for the kids."];

   "5F88A0EC_AD21_428E_AAAD_005F1B1AB72E":START  -> "20050615T180000";

   "BDF17182_CA21_4752_8D4F_40A4FE47C90D" [ shape = record, label = "Apple WWDC | <START> Start | <END> End  | Lots of sessions."];

   "BDF17182_CA21_4752_8D4F_40A4FE47C90D":START  -> "20050606";

   "BDF17182_CA21_4752_8D4F_40A4FE47C90D":END    -> "20050612";

   // Create tasks relationships

   // Create timeline relationships

   "20050606" -> "20050612" -> "20050615T180000";
}

Note that this code uses the record shape, holding individual segments within the larger object. This is slightly more complicated than the default oval that Dot uses.
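One thing to watch with record labels is that the characters { } | < > have structural meaning inside them, so free-form calendar text containing those characters could break the layout. A minimal sketch of one way to guard against that follows. The helper name record_label_escape and the sample summary text are hypothetical, and escaping backslashes as well is a design choice that may not suit data that already carries iCalendar escapes.

```perl
use strict;
use warnings;

# record_label_escape() is a hypothetical helper. In a record label the
# characters { } | < > delimit fields and ports, and a double quote would
# end the quoted string, so each one is prefixed with a backslash.
sub record_label_escape {
    my ($text) = @_;
    $text =~ s/([\\{}|<>"])/\\$1/g;
    return $text;
}

my $summary = record_label_escape('Build <beta> | ship');
my $node    = '"event_1" [ shape = record, label = "' . $summary
            . ' | <START> Start | <END> End" ];';
print "$node\n";
```

Only the free-form text passes through the helper; the field separators and port names that the script adds itself stay unescaped so that Dot still sees them as structure.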

Where to Go from Here

If you are using Apple's iCal application, note that the location and naming scheme of iCalendar files changed between the 1.x and 2.x releases. Previously, iCalendar files went in the ~/Library/Calendars/ directory and had names of the form <calendar name>.ics. Thus, a calendar named Work would have a file Work.ics. However, the 2.x release keeps iCalendar information in the ~/Library/Application Support/iCal/Sources/<calendar name>/ directory as sources.ics.

Other applications that implement the iCalendar specification, such as Mozilla's Calendar extension for Mozilla/Firefox/Thunderbird, may follow a different convention. On a Mac, Firefox stores .ics files in the ~/Library/Application Support/FireFox/Profiles/<profile>/Calendar/ directory, where <profile> is the profile specified in the Firefox profile.ini file. Again, other systems will likely store this information in different locations.

While on the topic of different implementations, bear in mind that, while the key:value specifications are consistent (as long as the application conforms to RFC 2445), the actual .ics file may look slightly different. For example, Firefox lays out that first event from the previous example as:

BEGIN:VEVENT
UID
 :b9794c88-1dd1-11b2-bb51-8a92011a78e8
SUMMARY
 :Apple WWDC
DESCRIPTION
 :Lots of sessions
LOCATION
 :San Francisco
URL
 :http://developer.apple.com/wwdc
STATUS
 :TENTATIVE
CLASS
 :PRIVATE
DTSTART
 ;VALUE=DATE
 :20050606
DTEND
 ;VALUE=DATE
 :20050612
DTSTAMP
 :20050618T191731Z
END:VEVENT

Here, the key:value tuples (plus any data modifiers such as VALUE=DATE) are almost always split across lines. In this case, it would be best to handle this difference when reading in the .ics file, so that the rest of the script can expect data in a generic format. One way to do this is to copy the array representing the .ics file using a finite-state machine. Another method would be to walk the array and join array elements under certain conditions, such as if the first non-white-space character of the current element is a colon or semicolon, or is simply non-alphabetic.
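The second method, walking the array and joining elements, might be sketched as follows. The helper name unfold_lines is hypothetical, and the rule it applies is only the colon-or-semicolon heuristic described above, not a full implementation of the line-folding rules in the iCalendar specification.

```perl
use strict;
use warnings;

# unfold_lines() is a hypothetical helper. It joins any line whose first
# non-white-space character is a colon or semicolon onto the previous line,
# dropping the leading white space, and returns the normalized lines.
sub unfold_lines {
    my @raw = @_;
    my @joined;

    for my $line (@raw) {
        $line =~ s/\r?\n\z//;                  # strip the line ending
        if ( @joined && $line =~ /^\s*([:;].*)$/ ) {
            $joined[-1] .= $1;                 # continuation of previous key
        }
        else {
            push @joined, $line;
        }
    }
    return @joined;
}

my @firefox = (
    "BEGIN:VEVENT\n",
    "SUMMARY\n",
    " :Apple WWDC\n",
    "DTSTART\n",
    " ;VALUE=DATE\n",
    " :20050606\n",
    "END:VEVENT\n",
);
print "$_\n" for unfold_lines(@firefox);
```

After this pass, each property sits on a single line again, so the same key-matching code can handle both the Apple and the Firefox layouts.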

Hopefully, this article will spur you to create a bridge between two of your favorite applications. Good luck, and please remember to share your contributions with the community.

O'Reilly Media, Inc. is rolling out a new syndication mechanism that provides greater control over the content we publish online. You'll notice some improvements immediately, such as better standards compliance, graphical tiles accompanying article descriptions, and enclosure support for podcatching applications. We've tested the new feeds using a variety of popular newsreaders and aggregators, but we realize that there may be a few bumps along the way. If you experience problems, please don't hesitate to send mail to webmaster@oreilly.com. Please include detail about your operating system and reader applications. We also welcome your suggestions. Thank you for your continued support of Perl.com.

The following URLs represent Perl.com's article and weblog content in a variety of popular formats:

Atom 1.0
http://www.oreillynet.com/pub/feed/16
RSS 1.0
http://www.oreillynet.com/pub/feed/16?format=rss1
RSS 2.0
http://www.oreillynet.com/pub/feed/16?format=rss2

We will begin automatically redirecting the existing feeds to the new feeds above, but we recommend that you update your feedreader's subscription settings to ensure continuous and uninterrupted service.

Thanks,
O'Reilly Media, Inc.'s Online Publishing Group



This Week in Perl 6, August 2-9, 2005


All--

Welcome to another summary, brought to you by Chinese food. The attentive among you will notice that this summary is a day late, because I did not feel like doing it yesterday. If only I could do that at work.

Perl 6 Compilers

Pugs Argument Processing

Vadim Konovalov submitted a patch to Pugs affecting @*ARGS processing. In the world of Pugs, this means that he received a committer bit and applied it himself.

Type Inferencing

Autrijus wants to push Perl 6's type inferencing as far as it can go (and maybe a little beyond). To this end, he has been soliciting input from all comers. It looks like he has put a lot of thought and research into it. One day, I expect to be thanking Autrijus for important (if likely difficult to understand) compiler errors and warnings.

WWW::Kontent Release

Brent "Dax" Royal-Gordon announced the release of WWW::Kontent 0.01: "a flexible web content management system written in Perl 6 and executable with Pugs." It looks nifty to me. Maybe we need to fight Ruby on Rails with Perl 6 on Pylons or something. That doesn't quite have the right ring to it, but there has to be something catchy there somewhere.

Array Interpolation

Phil Crow wondered why Pugs would not interpolate his arrays. Ingo Blechschmidt and Patrick explained that @foo does not interpolate, but @foo[] does. I sense a frequently asked question here.

Pugs 6.2.9 Released

Autrijus announced the release of Pugs 6.2.9. It is full of nifty new features, including the ability to lay on hands!

White Space Before Parens

Andrew Shitov wondered why Perl 6 no longer allowed white space between function names and parens. Autrijus explained that it allows print (1+2)*3 to print 9 instead of 3. As someone who just last week explained the peculiarity of Ruby printing 3 in the above situation to a complete novice, I welcome the change.

Container Model Pictures

Autrijus posted a few pretty pictures explaining the compiler model and the container model. While the compiler model was readily understandable to me, the container one wasn't. Fortunately, when prompted, Autrijus provided a great explanation to accompany the diagram.

PxPerl 5.8.7-4

Upon discovering that Pugs released a new version, Grégoire Péan released a new version of PxPerl that includes the new Pugs. I (and many others) thank Grégoire for lowering the entry bar for Perl 6 hacking on Windows.

Hoisting Lexical Declarations

Declaring lexicals mid-block confuses things, especially declaring them mid-statement, as in $x = $x + my $x if $x;. Autrijus proposed hoisting declarations of lexicals to the top of the block. Unfortunately, this can make CALLER:: do funny things. Thus, he suggests outlawing it. Larry agreed.

Parrot

Export LD_LIBRARY_PATH

Bdonlan noticed that Parrot's test suite was not setting LD_LIBRARY_PATH, which makes tests fail. Leo pointed out that most users manually set their LD_LIBRARY_PATH, as Parrot often needs this, but he agreed that the tests should do it just in case.

Improved Argument Processing for ops2c.pl

Tom submitted a patch that improves the command-line argument processing powers of ops2c.pl. Warnock applies.

ANSI Escape Codes in Parrot

Klaas-Jan Stol was having trouble putting special characters like ANSI clear screen and "¥" into strings. Nick pointed out that he needed to be careful with encodings and escapes. In Parrot, \O is an octal escape. In Lua, it is apparently not.

Parrot 0.2.3

Leo announced the release of Parrot 0.2.3, "Serenity," which reminds me, Firefly is coming back soon! I can't wait! Oddly, Google seems to have swallowed his release notice, but not his warnings.

Strange Filename-Based Bug

Michal Wallace found a bug that would disappear if the file was renamed. Leo, with the help of valgrind, provided Michal with a pointer. Michal used that to find a likely culprit and provide a patch, which Leo then refined.

GDBM Hash on MinGW

François Perrad provided a patch fixing gdbmhash on MinGW. Bernhard Schmalhofer applied it.

PyString Link Problem

François Perrad also fixed a link problem with pystring.o. Jonathan Worthington applied that patch.

Filling a Large Data Structure

Amir Karger wanted to know how to fill a large data structure in PIR, other than explicitly. Leo suggested reading it in from a config file.

Helping Perl 6

Rjtucke asked the ever-dangerous question, "How can I help?". Unfortunately, I think he asked it on Google Groups, and thus no one saw it.

PGE Glob Escapes

"PGE Glob Escapes; millions die before it can be rounded up again." Actually, Will Coleda noticed that he could not add a literal * to globs in PGE. Patrick fixed it so he could.

Language Test Requirements

Amir Karger has decided to write a Z-code-to-PIR translator. He wants to integrate its test suite with Parrot's language tests. Unfortunately, it does not use Test::Simple, or even Perl. Thus he wanted to know a good way to integrate it. Will Coleda, Bernhard Schmalhofer, and chromatic all provided suggestions.

mod_parrot 0.3

Adrian Lambeck provided a patch to fix src/call_list.txt for mod_parrot-0.3. chromatic applied it.

Making Makefile a Little too Clean

Patrick noticed that the Parrot build was breaking. Jonathan Worthington narrowed it down to an exact revision number. Leo realized his mistake and fixed it.

Cygwin Status

Bernhard Schmalhofer applied some old patches from Joshua Gatcomb, in the hope of improving Cygwin support. Nick Glencross provided needed Parrot Cygwin test results.

Calling SUPER Methods

Klaas-Jan Stol wondered how to call a specific parent method (possibly bypassing child methods). Leo answered.

Compiling Pugs and Parrot

Adrian Lambeck was having trouble compiling Pugs against Parrot. Leo worked with him to find a solution, although they haven't resolved it yet.

Pure Parrot Test::Builder

chromatic has written pure-Parrot versions of Test::Builder and Test::Builder::Tester. As always, patches are welcome.

Adding a New Opcode

Gerd Pokorra wanted to know how to add a new opcode to Parrot. Klaas-Jan Stol and Leo provided answers.

More Win32 Patches

François Perrad provided several patches for MinGW and Win32. Warnock applies.

Updated intro.pod

Jonathan Worthington posted an updated intro.pod. Autrijus provided a few edits, and Jonathan is planning on committing it.

Comment Fix in pir-mode.el

Jim McKim made the mistake of using Emacs. Fortunately, he counterbalanced that failing with the virtue of submitting a patch to fix an error in pir-mode.el to make the file work better. chromatic applied the patch.

Commit Bit

Curtis Rawls seemed to be having trouble using his newly acquired commit bit. Warnock applies.

Segfault with -E

Tom noticed that parrot -E segfaulted and provided a patch. He was not very confident about the patch.

make test in bc

Amir Karger noticed that make test in bc dies because he does not have antlr installed. Bernhard Schmalhofer said that he would try and fix it up to use the config-test for antlr.

interpreter.c Breakage

Amir Karger noticed that interpreter.c broke during a recent compilation. Leo pointed out that he needed to run make realclean.

MinGW Meets m4

François Perrad provided two patches to make m4 work on MinGW. Warnock applies.

substr Segfault

Will Coleda posted a short PIR test that will segfault in the substr opcode. This led to some discussion of variable-width encodings. Leo explained that substr was a call that would probably force Parrot to rectify variable-width encodings into fixed-width ones (which it does lazily). Then he fixed it (presumably as he had suggested).

Parrot Cygwin Meets Treefrog

Steve "treefrog" posted a patch he needed to get Cygwin testing. I think he may have posted it to Google Groups directly, though. Warnock applies.

Call Opcode Cleanups

Leo attempted to free himself from the horns of Warnock by reposting his suggested call opcode cleanup. Patrick and I voiced our support. More accurately, I voiced support and Patrick indifference.

Perl 6 Language

Complex Control Flow

Nigel Hamilton began speculating that Perl 6 might have an extremely complicated control flow. Then he began to wonder aloud about a form of control flow I can only describe as brain-melting. Luke Palmer suggested that his proposal might best start as a module.

Slurpy Parameters and Flattening

Ingo Blechschmidt's question of the flattening (or not) of slurpy params continued producing some suggestions. Piers seemed somewhat unhappy with earlier answers, but the thread died out.

Does if Topicalize?

Luke Palmer noticed if foo() -> $foo { ... } in an OSCON talk and wondered if if now topicalized. Stuart Cook offered a workaround.

Data Constructors

Luke Palmer posted his thoughts on unifying units and data constructors (as in Haskell or ML). Warnock applies.

Calling Methods on undef

Ingo Blechschmidt wondered what would happen if he called undef.chars or chars undef. Brent "Dax" Royal-Gordon responded that it would return undef in the absence of use fatal. Larry confirmed this behavior.

Reassigning .ref and .meta

Ingo Blechschmidt wondered what would happen if he assigned to .ref or .meta. Luke Palmer figured that it would not be allowed. I think it should cause a large person to come over to your house and kick you. This is probably a good reason I don't write error messages.

Questioning .ref and .meta

Ingo Blechschmidt left a bunch of blanks for people to fill in with respect to .ref and .meta. Luke Palmer apparently segfaulted in the attempt to fill in the blanks.

Subscripting Pairs

Ingo Blechschmidt wondered if one could subscript pairs. Larry declared no.

Perl 6 Test in Parrot 0.2.3

Andrew Shitov was having trouble running Perl 6 under the latest Parrot. Autrijus pointed out that he was trying to run the compiler attempt abandoned in June of 2004 and then pointed him toward Pugs.

Java -> Perl ?

Tim Bunce wondered if any work had started on parsing Java interface definitions and translating them to Perl 6. Warnock applies (which probably means no).

MetaObject Questions

Stevan Little posted some of his thoughts on the MetaObject internals for comment. Many questions ensued, my eyes glazed over, the summarizer punted.

defined and typed Traits

Autrijus mused about how to deal with defined and typed traits in Perl 6. This led Larry to wonder about undef being a class, or a class being undef, or something confusing.

is constant Sugar

Autrijus wondered how is constant would desugar if it were a special form. Larry came up with suggestions, some of which said it desugared and some of which said it didn't.

The Usual Footer

To post to any of these mailing lists please subscribe by sending email to perl6-internals-subscribe@perl.org, perl6-language-subscribe@perl.org, or perl6-compiler-subscribe@perl.org. If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send feedback to

Automated GUI Testing


You use Perl at work. Sometimes you are unhappy because there is one application that always makes you click around and fill in all those input boxes. It's very boring. Why not let Perl do that while you go grab a coffee? Also, maybe you sometimes feel frustrated that you need to start that nice app and wish someone else would type the input for you. Let Perl do that, too.

Why Perl?

Simply put: because you like Perl.

The long story is that there are all sorts of software packages that you may use to automate graphical applications. Are they really good fits for what you want to do?

Windows has many libraries that help you automate such things, but do the applications you use support those automation libraries? Too many do not. Moreover, is this enough for you to say you have tested a certain GUI feature? If not, read on.

What You Need

You need a working installation of Perl, with Perl/Tk included. I recommend ActiveState's ActivePerl. You also need the Win32::GuiTest module. Install it from the CPAN or, ideally, through PPM.

Example Code

Download the tester.pl and the tested.pl programs. They need to both be in the same directory. First run the tested.pl program in order to see the windows it has and how it looks. The program does nothing by itself; it just serves as a "run" application. tester.pl is more interesting. It spawns tested.pl and starts sending it input (mouse moves, mouse clicks, and keystrokes).

I tested these two programs on Windows 2000 Professional and Windows XP Home Edition using ActiveState's distribution of Perl.

The tested.pl program is just a dummy GUI used to demonstrate the examples. It uses Tk, so although it is a Win32 GUI, it isn't a native one. This has the effect that not all of the functions you can use with Win32::GuiTest will work as you would expect them to work against a native Win32 GUI. Fortunately, there are workarounds.

A Few Words About Windows

Graphical user interfaces manage windows. Windows are just reusable objects with which users can interact. Almost all GUIs have more than just one window. I use "window" just as a generic term for any graphical object that an application may produce. This means that "window" is an abstract term after all.

Windows have common elements that you need to consider before writing a program that interacts with a GUI.

  • Each window belongs to a window class (making it possible to search them by class).
  • Windows have an organizational hierarchy; every GUI has at least one root window, and every window may have child windows. Windows form a tree. This makes them searchable (by class or not) in depth: start from a root window and search among its children.
  • Some windows have text attached to them. This is useful to identify windows.
  • Windows have a numeric ID that uniquely identifies them.

This means that you can identify windows by any of their text, class, and parent window attributes. You can also pinpoint a window by its ID.

Finding Windows

When testing a GUI, first make sure the application you want to test has started. To do this, use the Win32::GuiTest exported function named FindWindowLike(). Remember that hierarchy of windows? If you search for an Edit window, you may find it in the wrong place. There can be several different GUIs running that have editor windows. There should be a way to differentiate between these hypothetical editor windows--and the hierarchical organization of windows helps.

First look for the main window of the application, and then descend the hierarchy (that you have to know beforehand) until you reach the desired window.

How can you know the windows hierarchy? There are two main ways. If you have written the GUI yourself or have access to its sources and have enough experience, you may find out what the hierarchy of windows is. Unfortunately, that's quite tricky and prone to error.

Another much simpler way to do this on Windows platforms is to use the free WinSpy++ program. Basically, it allows you to peek at an application's window structure.

When you use WinSpy++ to look at an application's windowing structure, you will notice that every window has a numeric handle, expressed in hex. However, Perl expresses it in decimal. This will come up again in a moment.
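Converting between the two notations needs nothing beyond Perl's built-in hex() and sprintf(). Here is a minimal sketch; the handle value is invented for illustration and does not come from tested.pl:

```perl
use strict;
use warnings;

# Hypothetical handle as WinSpy++ might display it (value invented for illustration).
my $hex_handle = "000A04F2";

# hex() accepts the digits with or without a leading "0x".
my $decimal_handle = hex($hex_handle);

# Going the other way, to compare a Perl-side handle with the WinSpy++ display.
my $back_to_hex = sprintf( "%08X", $decimal_handle );

print "decimal: $decimal_handle, hex: $back_to_hex\n";
```

The decimal value is what you would pass as $window; the sprintf() direction is handy when you print a handle from Perl and want to look it up in WinSpy++.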

The syntax for FindWindowLike is: FindWindowLike($window,$titleregex,$classregex,$childid, $maxlevel). It returns a list of found windows. The parameters are:

  • $window

    This is the (numeric) handle of the parent window to search under (remember the hierarchical organization of windows in a GUI). You may use undef in order to search for all windows.

    $window should be a decimal value, so if you know the window's hex handle (as displayed by WinSpy++) you need to convert it.

  • $titleregex

    This is the most often used parameter. It is a regular expression for FindWindowLike to match against window titles to find the appropriate window(s).

  • $classregex

    This matches against a window class. Suppose that you want to find all buttons in an application. Use the function like this:

    my @windows = FindWindowLike(undef,"","Button");

    Note: if you don't care what the class of the window is, do not omit the $classregex parameter. Instead, use an empty string.

    Currently the FindWindowLike() function does not check if $classregex is undefined, so you will end up with a lot of Perl warnings.

  • $childid

    If you pass this argument, then the function will match all windows with this ID.

  • $maxlevel

    Maximum depth level to match windows.

As you may have noticed, the tested program has a title that matches the string "Tested". Thus, the tester starts by searching windows matching this title:

@windows = FindWindowLike( undef, "Tested", "" );

@windows will contain a list of window IDs that have a title matching the string. The point here is that you probably don't want the tested program to start more than once simultaneously.

if ( @windows > 1 ) {
     print "* The \"tested\" program is started more than once!\n";
     ...
 }

If there is no tested application already running, the program can start it and repeat the procedure, searching for windows that match our criteria (they contain the string "Tested" in their titles). If it's running just once, its ID is $windows[0]. In fact, this is the root window of the application.

There's no point in going further with the program if the GUI hasn't started, so the code checks this:

unless ( @windows ) {
     print "* The program hasn't started!\n";
     exit 1;
 }
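A freshly spawned GUI may need a moment before its window exists, so the "start it and repeat the procedure" step can be wrapped in a small polling loop. The following is only a sketch: wait_for_windows is a hypothetical helper, not part of Win32::GuiTest, and it takes the search as a code reference so that it does not depend on any particular module.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Win32::GuiTest: call $finder repeatedly
# until it returns at least one window or $tries attempts have been made.
sub wait_for_windows {
    my ( $finder, $tries, $delay ) = @_;

    for my $attempt ( 1 .. $tries ) {
        my @found = $finder->();
        return @found if @found;

        # Pause (whole seconds) between attempts, but not after the last one.
        sleep $delay if $delay && $attempt < $tries;
    }
    return;    # empty list: nothing showed up in time
}

# With Win32::GuiTest loaded, the call could look like this:
#   my @windows = wait_for_windows(
#       sub { FindWindowLike( undef, "Tested", "" ) }, 5, 1 );
```

Passing the finder as a closure keeps the retry logic separate from the search criteria, so the same helper can wait for the main window or for a dialog.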

Setting a Specific Window to Foreground

Finding a window is sometimes not enough. Often, you need to send some input to the window. Obviously, the window should be in the foreground. The appropriate functions are SetActiveWindow() and SetForegroundWindow().

Because of the way windows work under Win32, this may be trickier than it seems. Basically, if the caller is not in the foreground, it cannot give another window "focus." MSDN explains this in the documentation of the SetForegroundWindow and SetActiveWindow functions.

While this behavior is easy to explain if you consider that you usually don't want applications that run in background to be able to annoy you (at least) by grabbing focus, there is at least one drawback. If you are running a GUI (perhaps remotely) to which you will send sensitive input for some reason, you may send those secrets to another, possibly malicious, application if the tested application does not have focus!

Another problem is in running tester programs remotely, or at regular intervals. Suppose that your tester program spawns the tested program, then starts sending it events (mouse events and/or keystrokes). If the computer is in a "locked" state, according to Microsoft documentation, no application can be in the foreground. You may have unexpected results.

If the GUI you are automating receives sensitive input (such as passwords), you have to find a means to "isolate" that machine's input/output devices, such as keyboard/mouse/monitor, so that no one unauthorized can peek at what your Perl program is typing in. Good luck.

In my opinion, every time you send input to a GUI, the Win32::GuiTest program should check if the application is in the foreground. If it isn't, it should try to bring it to the front. If it can't do that, it should fail and not continue.

Here's a sample routine that tester.pl uses:

 sub bring_window_to_front {
     my $window  = shift;
     my $success = 1;

     if ( SetActiveWindow($window) ) {
         print "* Successfully set the window id: $window active\n";
     }
     else {
         print "* Could not set the window id: $window active\n";
         $success = 0;
     }
     if ( SetForegroundWindow($window) ) {
         print "* Window id: $window brought to foreground\n";
     }
     else {
         print "* Window id: $window could not be brought to foreground\n";
         $success = 0;
     }

     return $success;
 }

In case you don't want to bring a window to the front but expect it to be there already, use GetForegroundWindow(). That way, you can just compare the return value with a window ID and find out if that window is in front.
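That comparison might be wrapped as follows. This is a sketch under stated assumptions: window_is_in_front is a hypothetical helper, and the foreground lookup is passed in as a code reference so that the example runs without a Windows desktop.

```perl
use strict;
use warnings;

# Hypothetical helper: $get_foreground is a code reference that returns the
# handle of the current foreground window, for example
# \&Win32::GuiTest::GetForegroundWindow when that module is loaded.
sub window_is_in_front {
    my ( $window, $get_foreground ) = @_;
    my $front = $get_foreground->();
    return ( defined $front && $front == $window ) ? 1 : 0;
}

# Stand-in for the real call so the sketch runs anywhere.
my $fake_foreground = sub { return 1234 };

my $status = window_is_in_front( 1234, $fake_foreground )
    ? "in front"
    : "not in front";
print "$status\n";
```

A tester could call such a check right before every SendKeys() and stop if it returns false, which matches the "fail and do not continue" policy suggested earlier.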

Key Pressing

You have found your window and have made sure that it has focus. What next?

It's time to send data to the window. This is the purpose of the SendKeys() function. You can send to an application not only basic keypresses, but combinations of keys too. Here's an example from the tester.pl program:

my @keys = ( "%{F}", "{RIGHT}", "E", );
for my $key (@keys) {
    SendKeys( $key, $pause_between_keypress );
}

The code starts with an array containing the keypresses. Note the format of the three elements. The keypresses are: Alt+F, right arrow, and E. With the application open, this navigates the menu in order to open the editor.

For a full listing of "special" keystrokes or combinations of keys, consult the function's documentation.
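As a small illustration of the key-string format, the sketch below builds modifier combinations from readable names. chord is a hypothetical helper, not a Win32::GuiTest function; it relies only on the SendKeys() convention that % stands for Alt, ^ for Ctrl, and + for Shift.

```perl
use strict;
use warnings;

# Modifier prefixes as used by SendKeys(): % is Alt, ^ is Ctrl, + is Shift.
my %prefix_for = ( alt => '%', ctrl => '^', shift => '+' );

# Hypothetical helper: build a SendKeys() string for a modifier plus one key.
sub chord {
    my ( $modifier, $key ) = @_;
    my $prefix = $prefix_for{ lc $modifier }
        or die "Unknown modifier: $modifier\n";
    return $prefix . '{' . $key . '}';
}

# The same sequence tester.pl sends: Alt+F, right arrow, E.
my @keys = ( chord( alt => 'F' ), '{RIGHT}', 'E' );
print "@keys\n";
```

Each resulting string can be handed to SendKeys() exactly as in the loop above.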

Finding Text in Your Application

You may want to learn how you can "read" text written in GUI windows. Unfortunately, you can't read everything. You can read the text written in the title of windows (useful for identifying a window by its title). You can also read text in Edit class windows; for example, the part of Internet Explorer where you type in a URL, or the list items in a ListBox. There may be other window classes from where you can fetch text; just verify with WinSpy++ whether you can "read" from a window, before writing your program, in order to avoid frustration.

Remember that you can't (at least now) read everything written in a window. Maybe a future version of Win32::GuiTest will provide a means by which to fetch text from a window, no matter what class that window is. In my humble opinion, it would be an awesome feature.

The two functions useful for grabbing text are GetWindowText() and WMGetText(). Both take as a parameter the window ID:

$text = GetWindowText($window);
$text = WMGetText($window);

Pushing Buttons

Pushing buttons can be tricky. The syntax is PushButton($button[,$delay]), and the variable $button can be either the text of the button (its caption) or the button ID. As Piotr Kaluski points out in "Be Careful with PushChildButton," you sometimes want to specify a button ID, but instead the function matches a button having text like the one you used in the regexp. He posted a patch to the perlguitest mailing list.

Also note that when using Tk, as I do in this example, you can't identify buttons by their text--you need to use their IDs (if you know them). With native Win32 applications, you can identify buttons by their text. To check the differences, use WinSpy++ to look at a Tk button's caption and a native Win32 button's caption.

Although PushButton() works fine on native Win32 buttons, I couldn't make it work on my Tk application, so in tester.pl, I use a trick in the push_button() subroutine:

sub push_button {
    my $parent_window_title = shift;
    my @button;
    my @window;

    SendKeys("%{F}");
    SendKeys("O");
    sleep 1;

    @window = FindWindowLike( undef, $parent_window_title, "" );

    if ( !bring_window_to_front( $window[0] ) ) {
        print "* Could not bring to front $window[0]\n";
    }

    @button = FindWindowLike( $window[0], "", "Button" );
    sleep 1;

    print "* Trying to push button id: $button[0]\n";
    PushChildButton( $window[0], $button[0], 0.25 );
    sleep 1;

    click_on_the_middle_of_window( $button[0] );
}

Notice that the function depends on the tested.pl application, as it has hard-coded the way to spawn the Button window (by navigating the menu using keystrokes). It is easy to adapt it to be more flexible and to be less coupled with the rest of the code.

After sending the right combination of keys (Alt+F, O), the code expects that the window containing the Button will pop up. Then it uses FindWindowLike() again, using as a search item the title of the window containing the button (in this case, here). Remember what I said about the windows hierarchy?

Next, it ensures that the Button window has the focus, although this is not entirely necessary at this point. After bringing the window to the front, the code searches for a button in the window (I already know that there's only one button there).

@button = FindWindowLike( $window[0], "", "Button" );

This narrows down the search: "Search for a window of the class Button under the window that has the ID $window[0]." That parent window is the one the code found earlier by its title.

PushChildButton( $window[0], $button[0], 0.25 );

is here just for the sake of example, as it doesn't work for the Tk button. It would work for a native Win32 button.

The trick is that the code can still push it using the mouse! Having the button ID, as returned by FindWindowLike(), the code calls the click_on_the_middle_of_window function.

sub click_on_the_middle_of_window {
    my $window = shift;
 
    print "* Moving the mouse over the window id: $window\n";
 
    my ( $left, $top, $right, $bottom ) = GetWindowRect($window);
 
    MouseMoveAbsPix( ( $right + $left ) / 2, ( $top + $bottom ) / 2 );
 
    sleep(1);
 
    print "* Left Clicking on the window id: $window\n";
    SendMouse("{LeftClick}");
    sleep(1);
}

The function takes a window ID as its parameter, finds the window's rectangle using GetWindowRect(), and then moves the mouse pointer right to the middle of it with MouseMoveAbsPix().

With the pointer over the button, sending LeftClick presses the button.
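The midpoint arithmetic can be checked on its own, without a GUI. In this sketch, rect_center is a hypothetical helper that takes the four values in the order GetWindowRect() returns them; unlike the routine above, it truncates with int() so that odd widths or heights still give whole pixel coordinates.

```perl
use strict;
use warnings;

# Hypothetical helper: center of a rectangle given as left, top, right,
# bottom. int() keeps the pixel coordinates whole.
sub rect_center {
    my ( $left, $top, $right, $bottom ) = @_;
    return ( int( ( $left + $right ) / 2 ), int( ( $top + $bottom ) / 2 ) );
}

# Made-up rectangle with an odd width and height.
my ( $x, $y ) = rect_center( 100, 50, 301, 151 );
print "click at ($x, $y)\n";
```

The two returned values are what you would pass to MouseMoveAbsPix().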

Moving Around with the Mouse

As seen earlier, moving the mouse is straightforward: just use MouseMoveAbsPix(). It takes as parameters the coordinates where you want the pointer to be (horizontal and vertical positions) in pixels.

It is useful to use two other functions in conjunction with it: SendMouse() and GetWindowRect().

SendMouse sends a mouse action to the Desktop. It takes only one parameter: a mouse action such as {LeftDown}, {LeftUp}, or {LeftClick}. For more details, see the function's documentation.

You can also move the mouse wheel using MouseMoveWheel(). It takes a positive or a negative argument, indicating the direction of the motion.

To send an action, you need to know where to send it. Usually you will move the mouse pointer over a window. GetWindowRect() is useful to find the coordinates of a window.

It can be simpler to create a wrapper around these three functions in order to move the mouse pointer over a selected window, and then generate a mouse action, as I did with click_on_the_middle_of_window().

Further Reading

Here are some links you may find useful.

This Week in Perl 6, through August 2, 2005


In case you were wondering, Darwin ports didn't work its magic and I still don't have a working Haskell compiler. Thank Juerd for feather, even if I did have to turn my laptop upside down to read the MOTD. Rot-180: oN hes +snf

This week in perl6-compiler

There were 12 messages in the compiler list this week. Either everyone has decamped to IRC or OSCON, or the compiler's getting mature enough that most of the interesting discussion about it happens in perl6-language as Autrijus and others seek clarification.

Some Thoughts on PIL/Complete Type Inferencing

Autrijus has been doing some thinking on the next version of PIL (The Pugs Intermediate Language), which will be a little less tightly coupled with PIR/Parrot. He outlined his thinking (which he seems to have directed towards being able to do useful things and optimizations with Type information) in this thread.

Definition of Containers

Autrijus announced that he'd checked in the first part of the new PIL run core. In case you were wondering, containers are the things that Perl variables have as values. They're where things like tie magic happens.

Hoisting Variable Declarations

Hands up! How does the scoping of

 
{
   $x = $x + my $x if $x;
   #1   #2      #3    #4
}

work in Perl 6?

In Perl 5, all those $xs refer to the same thing. In Perl 6, #1 and #2 refer to $OUTER::x.

This behavior (lexical scopes really are lexical) makes a compiler writer's head hurt. Autrijus outlined a plan for making it work.

Meanwhile, in perl6-internals

Dominance Frontier

Curtis Rawls had posted a patch adding "dominance frontiers" to IMCC. (I'm afraid I don't know what a dominance frontier is, but it sounds like it might be fun.) This week, he wondered if someone could apply it any time soon, because he had another patch that depended on it.

It turned out that the patch broke a test or two, and Will Coleda, Andy Dougherty, Patrick, and Leo set about helping to track it down. It looks like they have found the issues, and work continues to fix them.

make languages Should Continue After Building a Language Failed

Have you ever looked through the Parrot Makefiles and wondered what the deal is with .dummy? If so, this thread explains everything.

PMC Syntax

Klaas-Jan Stol asked if there's any documentation on the complete syntax for .pmc files when writing PMCs. Apparently there isn't, apart from the source of pmc2c.pl, but Will Coleda and Leo helped Klaas-Jan out.

Embedding ParTcl

Thilo Planz had some problems embedding ParTcl into a PIR application. It mostly worked, but he had a few questions. Will Coleda helped out again.

Compiling Dynamic PMCs

Klaas-Jan had more questions about compiling PMCs--dynamic ones, this time. It appears that the docs he was following didn't quite reflect reality. Leo solved the problem and Klaas-Jan sent in a doc patch. Hurrah!

Parrot Cannot Start up if STDERR or STDOUT is Closed

Michael Schwern pointed out that Parrot won't start if you close either STDOUT or STDERR, eschewing the standard joke response ("Doctor, it hurts when I do this." "Well don't do that, then.") Jerry Gay wrote a test and Leo fixed it.

Accessing Hash with Strings/Keys

Apparently, Klaas-Jan is working on writing a Lua compiler to target Parrot. He's obviously working on it a good deal at the moment. :)

He wanted to know how he could extend the standard Hash PMC to return None if there is no key found. As is traditional in these cases, Leo helped him out. It turns out that part of the problem is that pmc2c.pl isn't that strict in its syntax checking. If anyone reading this has the tuits ...

Does It Cost Anything to Use a Big PMC Everywhere?

In a move guaranteed to gladden at least Dan Sugalski's heart, Amir Karger popped up to say that he's working on getting the Z-machine interpreter working. He wondered if there was any way of dedicating a register to a particular constant in order to avoid copying a global every time he called a sub. Leo helped out.

Super!

Leo announced that he'd added a new Super PMC which will make it easier to call superclass methods.

Lua Project

Klaas-Jan unveiled his project to get the Lua compiler targeting Parrot. It's apparently "far from complete," but hey, it's good to welcome yet another language to the Parrot cage.

Announcing mod_parrot 0.3

Jeff Horwitz announced the release of mod_parrot 0.3, complete with support for all Apache hooks, autogeneration of request_rec methods, and a mod_pugs proof of concept. Crumbs. And there's more. Check out the announcement, download the code, and start making Apache do weird things. Go on, you know you want to.

Meanwhile, in perl6-language

The Use and Abuse of Liskov

Damian and Luke's discussion of the right way to do MMD looks to be finally winding down. It seems Luke's convinced Damian of the righteousness of his cause. (Or at least, if he's wrong, he's wrong in a subtler way than Damian realized.) I don't think there's been a final decision as yet, but we're definitely moving forward.

Slurpy Parameters and Auto-Flattening

Ingo Blechschmidt asked for some clarification of the behavior of slurpy parameters. It's not often I hope that Luke is wrong, but I really hope the answer he gave Ingo isn't the true state of things.

Exposing the Garbage Collector

Bah! I propose a simple, slow, yet powerful feature that is useful to implement a whole bunch of other possible APIs for getting at stuff, and people go and suggest making any one of various heavier APIs the One True API. It's enough to make a person despair.

Ah, apologies, I'm letting personal concerns get in the way of the summary, but what the hell, I'm leaving it.

Messing with the Type Hierarchy

Luke had a few things to say about what happens when you monkey with the type hierarchy, so he said them. The usual suspects joined in, most of them addressing the particular instance that Luke had chosen to illustrate his point, rather than discussing the broader point, but hey, this is perl6-language. That's what happens.

Luke's broader question was, "Should it be possible to write a class that isn't a leaf in the existing hierarchy?" The example that everyone addressed was the idea of writing a Complex class that wedged in between Real and Num in the hierarchy (which, as several people pointed out, isn't necessarily the right way to think about it anyway, hence the discussion).

My gut feeling was that the answer to the general question should be "Yes, but be very, very careful, and don't be surprised if it bites you later."

Elimination of Item|Pair and Any|Junction

The discussion of appropriate default prototypes and the like continued. Autrijus proposed a way of rejigging the type hierarchy to make default argument types a little clearer. I found things getting a little weird, to be honest--there's even talk of eliminating Object as a type name, which seems a little strange.

Execution Platform Object? Gestalt?

Randal proposed that, as the number of possible platforms that Perl 6 can run on proliferates, it'd be really handy if there were some useful global that held knowledge about the platform and its capabilities. He proposed $*OS as a decent place to put it. Larry thought we probably would have something like that, but thought that there might end up being two globals: $*OS and $*VM. The usual "Why don't we call it" thread sprang up, but it seems that the most important upshot is that this particular bike shed will definitely be painted.

The Meaning of returns

The invasion by the rampaging hordes from p6c continued apace. This time, Autrijus started a discussion of returns and its implications for type inferencing.

Lazy List Syntax

Flavio S. Glock wondered how to go about creating a lazy list from an object. Apparently the magic he was missing was prefix:<=>, which is syntactic sugar for calling the .next method on anything that supports iteration, which is nice.

An Idea for Doing pack

David Formosa had an idea about a possible pack API; he outlined it on the list. Yuval Kogman seemed to like it, but there's been nothing from anyone else on the list.

Inferring (Foo of Int).does(Foo of Any)

Autrijus again, this time thinking about the kind of type inferences that Perl aggregate types allow. Once I had my head in the right space, it made a great deal of sense, even if:

(Array of Item).does(Array of Int); # false
(Array of Int).does(Array of Item); # also false!

made my head hurt the first time I read it.

Garbage Collector API

Various people proposed additions to the proposed Garbage Collector API.

$value.confess()

Brent Royal-Gordon had a cunning idea for debugging: having Perl 6 capture the call stack at its point of creation and stashing that in a property called confess, which he could examine in a debugging context to find out where a value came from. As he pointed out, this would be expensive, but useful. He's currently Warnocked, but I get the feeling it should be possible to write an extension to do what he wants without adding anything extra to Perl 6 itself. It might be a little tricky if he wants the call stack to change on mutation, though.

Slurpy is rw Arrays

Having received clarification of the behavior of normal slurpy arrays, Ingo Blechschmidt asked for clarification of the behavior of slurpy is rw arrays. Adriano Ferreira and Thomas Sandlaß seemed to talk sense in reply.

Curious Use of .assuming in S06

Autrijus wondered if code like:

&textfrom := &substr.assuming(:str($text) :len(Inf))

found in Synopsis 6 was a mistake, or if the syntax should be like that. It turns out that the syntax is supposed to be like that. Apparently being able to do without the commas was one of the reasons for making colon pair syntax look like that.

Laziness and IO

In a currently Warnocked post, David Formosa outlined a potential problem with lazy IO.

sub foo ($x) returns ref($x)

In his continuing discussion of the Perl 6 type system and the inferences that you can draw about it, Autrijus posted a discussion of how to declare that a function returns a value with the same type as its argument. He suggested that the best way forward would be to declare something like:

sub identity ($x) returns ref($x) { ... }

and asked for better suggestions. Thomas Sandlaß had suggestions.

&say's Return Value

Gaal Yahas thought that &print and &say should fail on errors and return the printed string on success (but true). Larry thought not. It looks like they'll end up returning a Boolean or throwing an exception.

$arrayref.ref

Ingo continues his ongoing task of getting clarification of the semantics of a whole host of things. This time he wanted to know about the behavior of references. Larry clarified.

Binding Scalars to Aggregates

Next up in Ingo's clarification project was binding scalars to aggregates. (Or did he mean binding aggregates to scalars?) Again, Larry came through with answers. It turns out that there's more to this than meets the eye at first glance. Autrijus's post on containers over in perl6-compiler addresses some of these issues, as well.

Binding Hashes to Arrays?

Ingo asks, "Is it legal to bind a hash to an array, or vice versa?"

Larry answers, "Not at the moment."

Module Init Hooks and Pragmas

Gaal Yahas wondered what function in a module to call when you use or no it.

Warnock applies.

Eliminating &{} and *{}

Autrijus wondered if we really need the & sigil. Warnock applies.

Stringification of Pairs

For some reason, Ingo's shortest question ("How do pairs stringify?") attracted the largest response. Well, at first blush it looks like it did. What actually happened was that Warnock claimed it, but the References: header in Andrew Shitov's post discussed below was a little broken.

zip with ()

Some strange behavior of zip caught out Andrew Shitov. Ingo explained the problem. There was quite a bit of discussion of the various subtleties exposed.

Sometimes I pity the poor swine who's going to have to write Programming Perl 6. It's going to make the current camel look like a slim volume, if we're not careful.

Mutating map and grep

Ingo Blechschmidt wondered if it was true that Perl 6's grep, map, etc., wouldn't allow mutating values in their source array. He wondered if it would be possible to use a pragma to get the old, Perl-5-ish, behavior back. Thomas Sandlaß wondered if simply explicitly declaring the given block's argument as rw wouldn't do the job. There is no word from @Larry yet.

Acknowledgements, Adverts, Apologies, Alliteration, and Conference Envy

Damn. Couldn't think of a word beginning with "a" that means "conference." [Editor's note: attendance?] To all you lucky people in Portland at OSCON, I wish I was there and am a seething mass of envy. Well, not that seething: I'm consoling myself by going to the WorldCon in Glasgow, instead.

Help Chip

geeksunite.org: tell all your friends; this cannot stand.

The Usual Footer

If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl.

Or, you can check out my website, now running on a new engine. There are also vaguely pretty photos by me.

Building a 3D Engine in Perl, Part 4


This article is the fourth in a series aimed at building a full 3D engine in Perl. The first article started with basic program structure and worked up to displaying a simple depth-buffered scene in an OpenGL window. The second article followed with a discussion of time, view animation, SDL events, keyboard handling, and a nice chunk of refactoring. The third article continued with screenshots, movement of the viewpoint, simple OpenGL lighting, and subdivided box faces.

At the end of the last article, the engine was quite slow. This article shows how to locate the performance problem and what to do about it. Then it demonstrates how to apply the same new OpenGL technique a different way to create an on-screen frame rate counter. As usual, you can follow along with the code by downloading the sample code.

SDL_perl Developments

First, there is some good news--Win32 users are no longer left out in the cold. Thanks to Wayne Keenan, SDL_perl 1.x now fully supports OpenGL on Win32, and prebuilt binaries are available. There are more details at the new SDL_perl 1.x page on my site; browse the Subversion repository at svn.openfoundry.org/sdlperl1.

If you'd like to help in the efforts to improve SDL_perl 1.x, please come visit the SDL_perl 1.x page, check out the code and send me comments or patches, or ping me in #sdlperl on irc.freenode.net.

Benchmarking the Engine

As I mentioned in the introduction, when last I left off, the engine pretty much crawled. It's time to figure out why and figure out what to do about it. The right tool for the first job is a profiler, which watches a running program and keeps track of the performance of each part of it. Perl's native profiler is dprofpp, which tracks time spent and call count for every subroutine in the program. Examining these numbers will reveal if the engine spends most of its time in one routine, which will then be the focus for optimization.

It's best if these numbers are relatively repeatable from run to run, making it easy to compare profiles before and after a change. For a rendering engine, the easiest solution is a benchmark mode. In benchmark mode, the engine runs for a set period of time or number of frames, displaying a predefined scene or sequence. I chose to enable benchmark mode with a new setting in init_conf:

benchmark => 1,

The engine already displays a constant scene as long as the user doesn't press any keys; the remaining requirement is to quit after a set period.

In previous articles I've simply hardcoded an out-of-time check into the rendering loop, but this time I opted for a more general approach, using triggered events. Engine events so far have always come from SDL in response to external input, such as key presses and window close events. In contrast, the engine itself produces triggered events in response to changes in the state of the simulated world, such as a player attempting to open a door or attack an enemy.

To gather these events, I added two new lines to the beginning of do_events; the opening lines are now:

sub do_events
{
    my $self = shift;

    my $queue     = $self->process_events;
    my $triggered = $self->triggered_events;
    push @$queue, @$triggered;

After processing the SDL events with process_events and stuffing the resulting commands into the $queue, do_events calls triggered_events to gather commands from any pending internally generated events and adds them to the $queue. triggered_events can be pretty simple for now:

sub triggered_events
{
    my $self = shift;

    my @queue;
    push @queue, 'quit' if $self->{conf}{benchmark} and
                           $self->{world}{time} >= 5;
    return \@queue;
}

This is pretty much a direct translation of the old hardcoded timeout code to the command queue concept. Normally triggered_events simply returns an empty arrayref, indicating no events were triggered, and therefore no commands generated. Benchmark mode adds a quit command to the queue as soon as the world time reaches 5 seconds. Normal command processing in do_events will take care of the rest.
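To see the queue mechanics in isolation, here is a minimal sketch that runs without SDL. The engine hash, the stubbed process_events, and the simplified dispatch loop are stand-ins invented for this illustration, not the engine's real code; only triggered_events is copied from above.

```perl
use strict;
use warnings;

# Stand-in for the engine object; only the fields used below exist.
my $self = {
    conf  => { benchmark => 1 },
    world => { time => 0 },
    state => { done => 0 },
};

# Hypothetical stub: the real process_events translates SDL events.
sub process_events { return [] }

sub triggered_events
{
    my $self = shift;

    my @queue;
    push @queue, 'quit' if $self->{conf}{benchmark} and
                           $self->{world}{time} >= 5;
    return \@queue;
}

# Simplified command dispatch: 'quit' just sets the done flag.
sub do_events
{
    my $self = shift;

    my $queue     = process_events($self);
    my $triggered = triggered_events($self);
    push @$queue, @$triggered;

    foreach my $command (@$queue) {
        $self->{state}{done} = 1 if $command eq 'quit';
    }
    return scalar @$queue;
}

# Advance a fake world clock one second per simulated frame.
my $frames = 0;
until ($self->{state}{done}) {
    $self->{world}{time}++;
    $frames++;
    do_events($self);
}
print "quit after $frames simulated seconds\n";
```

The loop ends on the first pass in which the fake world time reaches 5, because that is the first time triggered_events returns a non-empty queue.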

dprofpp is Your (Obtuse) Friend

With benchmark mode enabled, the engine runs under dprofpp. The first step is to collect the profile data:

dprofpp -Q -p step065

-p step065 tells dprofpp to profile the program named step065, and -Q tells it to quit after collecting the data. dprofpp ran step065, collected the profile data, and stored it in a specially formatted text file named tmon.out in the current directory.

To turn the profile data into human-readable output, I used dprofpp without any arguments. It crunched the collected data for a while and finally produced this:

$ dprofpp
Exporter::Heavy::heavy_export_to_level has 4 unstacked calls in outer
Exporter::export_to_level has -4 unstacked calls in outer
Exporter::export has -12 unstacked calls in outer
Exporter::Heavy::heavy_export has 12 unstacked calls in outer
Total Elapsed Time = 4.838377 Seconds
  User+System Time = 1.498377 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 88.1   1.320  1.320      1   1.3200 1.3200  SDL::SetVideoMode
 38.1   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 16.0   0.240  0.341      8   0.0300 0.0426  SDL::OpenGL::BEGIN
 13.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 11.3   0.170  0.170      1   0.1700 0.1700  DynaLoader::dl_load_file
 9.34   0.140  0.020     12   0.0116 0.0017  Exporter::export
 6.67   0.100  0.100   1001   0.0001 0.0001  SDL::in
 4.00   0.060  0.060      1   0.0600 0.0600  SDL::Init
 3.34   0.050  0.847      8   0.0062 0.1059  main::BEGIN
 2.00   0.030  0.040      5   0.0060 0.0080  SDL::Event::BEGIN
 1.80   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 1.47   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.33   0.020  0.020      1   0.0200 0.0200  warnings::BEGIN
 1.33   0.020  0.020     16   0.0012 0.0012  Exporter::as_heavy
 1.33   0.020  0.209      5   0.0040 0.0418  SDL::BEGIN

There are several problems with this output. The numbers are clearly silly (88 percent of its time spent in SDL::SetVideoMode?), the statistics for the various BEGIN blocks are inconsequential to the task and in the way, and the error messages at the top are rather disconcerting. To fix these issues, dprofpp has the -g option, which tells dprofpp to only display statistics for a particular routine and its descendants:

$ dprofpp -g main::main_loop
Total Elapsed Time = 4.952042 Seconds
  User+System Time = 0.812051 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 70.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 3.32   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate
 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame
 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color
 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin
 0.00       - -0.000      1        -      -  main::action_quit
 0.00       - -0.000      2        -      -  SDL::EventType
 0.00       - -0.000      2        -      -  SDL::Event::type
 0.00       - -0.000      7        -      -  SDL::GetTicks
 0.00       - -0.000      7        -      -  SDL::OpenGL::Clear
 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_NORMALIZE
 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_SPOT_EXPONENT

You may have noticed that I specified main::main_loop instead of just main_loop. dprofpp always uses fully qualified names and will give empty results if you use main_loop without the main:: package qualifier.

In this exclusive times view, the percentages in the first column and the row order depend only on the runtime of each routine, without respect to its children. Using just this view, I might have tried to optimize draw_quad_face somehow, as it appears to be the most expensive routine by a large margin. That's not the best approach, however, as an inclusive view (-I) shows:

$ dprofpp -I -g main::main_loop
Total Elapsed Time = 4.952042 Seconds
  User+System Time = 0.812051 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 100.       -  0.814      7        - 0.1163  main::do_frame
 99.9       -  0.812      1        - 0.8121  main::main_loop
 99.7       -  0.810      7        - 0.1158  main::draw_view
 99.2       -  0.806      7        - 0.1151  main::draw_frame
 98.6   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 95.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate
 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color
 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame
 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin
 0.00       - -0.000      1        -      -  main::action_quit
 0.00       - -0.000      2        -      -  SDL::EventType
 0.00       - -0.000      2        -      -  SDL::Event::type

In this view, draw_quad_face looks even worse, because the first column now includes the time taken by all of the OpenGL calls inside of it, including tens of thousands of glVertex calls. It seems that I should do something to speed it up, but at this point it's not entirely clear how to simplify it or reduce the number of OpenGL calls it makes (other than reducing the subdivision level of each face, which would reduce rendering quality).

Actually, there's a better option. The real problem is that draw_cube dominates the execution time, and draw_quad_face dominates that. How about not calling draw_cube (and therefore draw_quad_face) at all during normal rendering? It seems extremely wasteful to have to tell OpenGL how to render a cube face dozens of times each frame. If only there were a way to tell OpenGL to remember the cube definition once, and just refer to that definition each time the engine needs to draw it.

Display Lists

I expect no one will find it surprising that OpenGL provides exactly this function, with the display lists facility. A display list is a list of OpenGL commands to execute to perform some function. The OpenGL driver stores it (sometimes in a mildly optimized format) and further code refers to it by number. Later, the program can request that OpenGL run the commands in some particular list as many times as desired. Lists can even call other lists; a bicycle model might call a wheel display list twice, and the wheel display list might itself call a spoke display list dozens of times.

I added init_models to create display lists for each shape I want to model:

sub init_models
{
    my $self = shift;

    my %models = (
        cube => \&draw_cube,
    );
    my $count  = keys %models;
    my $base   = glGenLists($count);
    my %display_lists;

    foreach my $model (keys %models) {
        glNewList($base, GL_COMPILE);
        $models{$model}->();
        glEndList;

        $display_lists{$model} = $base++;
    }

    $self->{models}{dls} = \%display_lists;
}

%models associates each model with the code needed to draw it. Because the engine already knows how to draw a cube, I simply reused draw_cube here. The next two lines begin the work of building the display lists. The code first determines how many display lists it needs and then calls glGenLists to allocate them. OpenGL numbers the allocated lists in sequence, returning the first number in the sequence (the list base). For example, if the code had requested four lists, OpenGL might have numbered them 1051, 1052, 1053, and 1054, and would then return 1051 as the list base.

For each defined model, init_models calls glNewList to tell OpenGL that it is ready to compile a new display list at the number $base. OpenGL then prepares to convert any subsequent OpenGL calls to entries in the list, rather than rendering them immediately. If I had chosen GL_COMPILE_AND_EXECUTE instead of GL_COMPILE, OpenGL would perform the rendering and save the calls in the display list at the same time. GL_COMPILE_AND_EXECUTE is useful for on-the-fly caching when code needs active rendering anyway. Because init_models is simply precaching the rendering commands and nothing should render while this occurs, GL_COMPILE is the better choice.

The code then calls the drawing routine, which conveniently submits all of the OpenGL calls needed for the new list. The call to glEndList then tells OpenGL to stop recording entries in the display list and return to normal operation. The model loop then records the display list number used by the current model in the %display_lists hash, and increments $base for the next iteration. After processing all of the models, init_models saves %display_lists into a new structure in the engine object.
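The record-then-replay behavior can be imitated in plain Perl, which may help if you have no OpenGL context handy. In this sketch every fake_* routine is a made-up stand-in (none of them belong to SDL_perl or OpenGL), and the starting list number 1051 is arbitrary; only the shape of the model loop follows init_models.

```perl
use strict;
use warnings;

# Pure-Perl stand-ins that mimic the record/replay idea of display lists.
# None of these are SDL_perl or OpenGL functions.
my $next_list = 1051;    # arbitrary starting number
my %lists;               # list number => recorded commands
my $recording;           # list currently being compiled, if any
my @rendered;            # commands that were "drawn"

sub fake_gen_lists {
    my $count = shift;
    my $base  = $next_list;
    $next_list += $count;
    return $base;
}
sub fake_new_list  { $recording = shift; $lists{$recording} = [] }
sub fake_end_list  { undef $recording }
sub fake_call_list { push @rendered, @{ $lists{ $_[0] } } }

# A drawing command is recorded while compiling, "rendered" otherwise.
sub fake_command {
    my $command = shift;
    if (defined $recording) { push @{ $lists{$recording} }, $command }
    else                    { push @rendered, $command }
}

sub draw_square { fake_command("vertex $_") for 1 .. 4 }

my %models = (square => \&draw_square);
my $base   = fake_gen_lists(scalar keys %models);
my %display_lists;

foreach my $model (sort keys %models) {
    fake_new_list($base);
    $models{$model}->();
    fake_end_list();
    $display_lists{$model} = $base++;
}

# Nothing is rendered during compilation; each call replays four commands.
my $during_compile = @rendered;
fake_call_list($display_lists{square}) for 1 .. 3;
print "compiled: $during_compile rendered, after 3 calls: ",
      scalar(@rendered), "\n";
```

The drawing routine runs only once, during compilation; every later call replays the stored commands without re-entering draw_square.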

init calls init_models just before init_objects:

$self->init_models;
$self->init_objects;

With this initialization in place, the next step was to change draw_view to draw from either a model or a draw routine. To do this, I replaced the $o->{draw}->() call with:

    if ($o->{model}) {
        my $dl = $self->{models}{dls}{$o->{model}};
        glCallList($dl);
    }
    else {
        $o->{draw}->();
    }

If the object has an associated model, draw_view looks up the display list in the hash created by init_models, and then calls the list using glCallList. Otherwise, draw_view falls back to calling the object's draw routine as before. A quick run confirmed that the fallback works and adding init_models didn't break anything, so it was safe to change init_objects to use models instead of draw routines for the cubes. This involved replacement of just three lines--I changed each copy of:

        draw        => \&draw_cube,

to:

        model       => 'cube',

Suddenly, the engine was much faster and more responsive. A dprofpp run confirmed this:

$ dprofpp -Q -p step068

Done.
$ dprofpp -I -g main::main_loop
Total Elapsed Time = 4.053240 Seconds
  User+System Time = 0.973250 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 99.9       -  0.973      1        - 0.9733  main::main_loop
 86.5   0.024  0.842    413   0.0001 0.0020  main::do_frame
 58.1   0.203  0.566    413   0.0005 0.0014  main::draw_view
 56.9   0.016  0.554    413   0.0000 0.0013  main::draw_frame
 20.1   0.196  0.196    413   0.0005 0.0005  SDL::GLSwapBuffers
 19.3       -  0.188    413        - 0.0005  SDL::App::sync
 18.4       -  0.180    413        - 0.0004  main::end_frame
 16.7   0.163  0.163   2891   0.0001 0.0001  SDL::OpenGL::CallList
 9.14   0.028  0.089    413   0.0001 0.0002  main::do_events
 8.53   0.035  0.083    413   0.0001 0.0002  main::prep_frame
 6.68   0.008  0.065    413   0.0000 0.0002  main::process_events
 5.03   0.049  0.049   3304   0.0000 0.0000  SDL::OpenGL::GL_LIGHTING
 4.93   0.002  0.048    413   0.0000 0.0001  SDL::Event::pump
 4.73   0.046  0.046    413   0.0001 0.0001  SDL::PumpEvents
 4.11   0.012  0.040    413   0.0000 0.0001  main::update_time

Note that I had to run dprofpp -Q -p again with the new code before doing the analysis, or dprofpp would have just reused the old tmon.out.

The first thing to note in this report is that previously the engine only managed seven frames (calls to do_frame) before timing out, but now managed 413 in the same time! Second, as intended, main_loop no longer calls draw_cube at all; calls to glCallList have replaced every such call. Because of this, it is no longer necessary to make many thousands of low-level OpenGL calls to draw the scene each frame, with the attendant Perl and XS overhead. Instead, the OpenGL driver handles all of those calls internally, with minimal overhead.

This has the added advantage that it is now feasible to run the engine on one computer and display the window on another, as the OpenGL driver on the displaying computer saves the display lists. Once init_models compiles the display lists, they are loaded into the display driver, and future frames require minimal network traffic to handle glCallList. (Adventurous users running X can do this by logging in locally to the display computer, sshing to the computer that has the engine and SDL_perl on it, and running the program there. If your ssh has X11 forwarding turned on, your reward should be a local window. And there was much rejoicing.)

An FPS Counter

The measurements that dprofpp performs have enough overhead to significantly reduce the engine's apparent performance. (Even old hardware can do better than 80-100 FPS with this simple scene.) The overhead is necessary to get a detailed analysis, but when it comes time to show off, most users want to have a nice frame rate display showing the performance of the engine running as fast as it can.

Making a frame rate display requires the ability to render text in front of the scene. The necessary pieces of that are:

  1. A font containing glyphs for the characters to display (at least 0 through 9).
  2. A font reader to load the font from a file into memory as bitmaps.
  3. A converter from raw bitmaps to a format that OpenGL can readily display.
  4. A way to render the proper bitmaps for a given string.
  5. A way to calculate the current frame rate.

The Numbers Font

There are hundreds of freely available fonts, but most of them are available only in fairly complex font formats such as TrueType and Type 1. Some versions of SDL_perl support these complex font formats, but this support has historically been frustratingly buggy or incomplete.

Given the relatively simple requirement (render a single integer), I chose instead to create a very simple bitmapped font format just for this article. The font file is numbers-7x11.txt in the examples tarball. It begins as follows:

7x11

30
..000..
.0...0.
.0...0.
0.....0
0.....0
0.....0
0.....0
0.....0
.0...0.
.0...0.
..000..

31
...0...
..00...
00.0...
...0...
...0...
...0...
...0...
...0...
...0...
...0...
0000000

The first line indicates the size of each character cell in the font; in this case, seven columns and 11 rows. The remaining chunks each consist of the character's codepoint in hex followed by a bitmap represented as text--. represents a transparent pixel, and 0 represents a rendered pixel. Empty lines separate chunks.

The Font Reader

To read the glyph definitions into bitmaps, I first added read_font_file:

sub read_font_file
{
    my $self = shift;
    my $file = shift;

    open my $defs, '<', $file
        or die "Could not open '$file': $!";
    local $/ = '';

    my $header  = <$defs>;
    chomp($header);
    my ($w, $h) = split /x/ => $header;

    my %bitmaps;
    while (my $def = <$defs>) {
        my ($hex, @rows) = grep /\S/ => split /\n/ => $def;

        @rows = map {tr/.0/01/; pack 'B*' => $_} @rows;

        my $bitmap           = join '' => reverse @rows;
        my $codepoint        = hex $hex;

        $bitmaps{$codepoint} = $bitmap;
    }

    return (\%bitmaps, $w, $h);
}

read_font_file begins by opening the font file for reading. It next requests paragraph slurping mode by setting $/ to ''. In this mode, Perl automatically breaks up the font file at empty lines, with the header first followed by each complete glyph definition as a single chunk. Next, the routine reads the header, chomps it, and splits the cell size definition into width and height.

With the preliminaries out of the way, read_font_file creates a hash to store the finished bitmaps and enters a while loop over the remaining chunks of the font file. Each glyph definition is split into a hex number and an array of bitmap rows; using grep /\S/ => ignores any trailing blank lines.

The next line converts textual rows to real bitstrings. First, each transparent pixel (.) becomes 0, and each rendered pixel (0) turns into a 1. Feeding the resulting binary text string to pack 'B*' converts the binary into an actual bitstring, with the bits packed in starting from the high bit of each byte (as OpenGL prefers). The resulting bitstrings are stored back in @rows.

Because OpenGL prefers bitmaps to start at the bottom and go up, the code reverses @rows before joining to create the finished bitmap. The hex operator converts the hex number to decimal to be the key for the newly created bitmap in the %bitmaps hash.

After parsing the whole font file, the function returns the bitmaps to the caller, along with the cell size metrics.
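To check the conversion by hand, the same steps can run against a single glyph held in a string instead of a file. This is only a sketch: the in-memory filehandle and the final printf are additions for the demonstration, while the parsing lines mirror read_font_file.

```perl
use strict;
use warnings;

# One glyph in the article's font format, held in a string for the demo.
my $font_text = <<'FONT';
7x11

30
..000..
.0...0.
.0...0.
0.....0
0.....0
0.....0
0.....0
0.....0
.0...0.
.0...0.
..000..
FONT

open my $defs, '<', \$font_text
    or die "Could not open in-memory font: $!";
local $/ = '';                      # paragraph mode

my $header = <$defs>;
chomp($header);
my ($w, $h) = split /x/ => $header;

my %bitmaps;
while (my $def = <$defs>) {
    my ($hex, @rows) = grep /\S/ => split /\n/ => $def;
    @rows = map {tr/.0/01/; pack 'B*' => $_} @rows;
    $bitmaps{hex $hex} = join '' => reverse @rows;
}

# Inspect the first byte of the finished bitmap, which is the bottom row.
my $zero = $bitmaps{ord '0'};
printf "%dx%d cell, glyph is %d bytes, bottom row is %s\n",
    $w, $h, length $zero, unpack 'B8', $zero;
```

Each seven-pixel row occupies one byte, with the unused low bit padded with zero, so the bottom row ..000.. becomes the bit pattern 00111000.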

Speaking OpenGL's Language

The bitmaps produced by read_font_file are simply packed bitstrings, in this case 11 bytes long (one byte per seven-pixel row). Before using them to render strings, the engine must first load these bitmaps into OpenGL. This happens in the main init_fonts routine:

sub init_fonts
{
    my $self  = shift;

    my %fonts = (
        numbers => 'numbers-7x11.txt',
    );

    glPixelStore(GL_UNPACK_ALIGNMENT, 1);

    foreach my $font (keys %fonts) {
        my ($bitmaps, $w, $h) = 
            $self->read_font_file($fonts{$font});

        my @cps    = sort {$a <=> $b} keys %$bitmaps;
        my $max_cp = $cps[-1];
        my $base   = glGenLists($max_cp + 1);

        foreach my $codepoint (@cps) {
            glNewList($base + $codepoint, GL_COMPILE);
            glBitmap($w, $h, 0, 0, $w + 2, 0,
                     $bitmaps->{$codepoint});
            glEndList;
        }

        $self->{fonts}{$font}{base} = $base;
    }
}

init_fonts opens with a hash associating each known font with a font file; at the moment, only the numbers font is defined. The real work begins with the glPixelStore call, which tells OpenGL that the rows of all bitmaps are tightly packed (aligned on one-byte boundaries), rather than padded so that each row begins on a two-, four-, or eight-byte boundary.

The main font loop starts by calling read_font_file to load the bitmaps for the current font into memory. The next line sorts the codepoints into @cps, and the following line finds the maximum codepoint by simply taking the last one in @cps.

The glGenLists call allocates display lists for codepoints 0 through $max_cp, which will have numbers from $base through $base + $max_cp. For each codepoint defined by the font, the inner loop uses glNewList to start compiling the appropriate list, glBitmap to load the bitmap into OpenGL, and finally, glEndList to finish compiling the list.

The glBitmap call has six parameters aside from the bitmap data itself ($bitmaps->{$codepoint}). The first two are the width and height of the bitmap in pixels, which read_font_file conveniently provides. The next two define the origin for the bitmap, counted from the lower-left corner. Bitmap fonts use a non-zero origin for several purposes, generally when the glyph extends farther left or below the "normal" lower-left corner. This may be because the glyph has a descender (a part of the glyph that descends below the general line of text, as with the lowercase letters "p" and "y"), or perhaps because the font leans to the left. The simple code in init_fonts assumes none of these special cases apply and sets the origin to (0,0).

The last two parameters are the X and Y increments, the distances that OpenGL should move along the X and Y axes before drawing the next character. Left-to-right languages use fonts with positive X and zero Y increments; right-to-left languages use negative X and zero Y. Top-to-bottom languages use zero X and negative Y. The increments must include both the width/height of the character itself and any additional distance needed to provide proper spacing. In this case, the rendering will be left to right. I wanted two extra pixels for spacing, so I set the X increment to width plus two, and the Y increment to zero.
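The effect of the X increment is easy to work out by hand. The following sketch computes where each glyph of a string would start for a fixed increment; glyph_positions is a hypothetical helper written for this illustration, and the starting position of 10 is just an example value.

```perl
use strict;
use warnings;

# Raster X position at which each glyph of a string would start,
# given a starting position and a fixed X increment per glyph.
sub glyph_positions {
    my ($start_x, $x_increment, $string) = @_;
    my $x = $start_x;
    my @positions;
    foreach my $char (split //, $string) {
        push @positions, $x;
        $x += $x_increment;
    }
    return @positions;
}

my $w  = 7;                                     # cell width of the font
my @xs = glyph_positions(10, $w + 2, '413');    # width plus two spacing pixels
print "glyphs start at x = @xs\n";
```

With a seven-pixel cell and two pixels of spacing, consecutive glyphs start nine pixels apart.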

The last line of the outer loop simply saves the list base for the font to make it available later during rendering.

init calls init_fonts as usual, just after the call to init_time:

$self->init_fonts;

Text Rendering

The hard part is now done: parsing the font file and loading the bitmaps into OpenGL. The new draw_fps routine calculates and renders the frame rate:

sub draw_fps
{
    my $self   = shift;

    my $base   = $self->{fonts}{numbers}{base};
    my $d_time = $self->{world}{d_time} || 0.001;
    my $fps    = int(1 / $d_time);

    glColor(1, 1, 1);
    glRasterPos(10, 10, 0);
    glListBase($base);
    glCallListsScalar($fps);
}

The routine starts by retrieving the list base for the numbers font, retrieving the world time delta for this frame, and calculating the current frames per second as one frame in $d_time seconds. It takes a little care to make sure $d_time is non-zero, even if the engine is running so fast that it renders a frame in less than a millisecond (the precision of SDL time handling); otherwise, the $fps calculation would die with a divide-by-zero error.

The OpenGL section begins by setting the current drawing color to white with a call to glColor. The next line sets the raster position, the window coordinates at which to place the origin of the next bitmap. After rendering each bitmap, the raster position is automatically updated using the bitmap's X and Y increments so that the bitmaps will not overlap each other. In this case, (10, 10, 0) sets the raster position ten pixels up and right from the lower-left corner of the window, with Z=0.

The next two lines together actually call the appropriate display list in our bitmap font for each character in the $fps string. glCallListsScalar breaks the string into individual characters and calls the display list with the same number as the codepoint of the character. For example, for the "5" character (at codepoint 53 decimal), glCallListsScalar calls display list 53. Unfortunately, there's no guarantee that display list 53 actually will display a "5," because the font's list base may not be 0. If the font had a list base of 1500, for example, the code would need to call display list 1500+53=1553 to display the "5."

Rather than make the programmer do this calculation manually every time, OpenGL provides the glListBase function, which sets the list base to use with glCallLists. After the glListBase call above, OpenGL will automatically offset every display list number specified with glCallLists by $base.
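The offset arithmetic is simple enough to sketch in plain Perl. The helper lists_for_string below is hypothetical and only reports which display list numbers would end up being called; it does not touch OpenGL.

```perl
use strict;
use warnings;

# Display list numbers that end up being called for a string,
# once the list base offset has been applied to each codepoint.
sub lists_for_string {
    my ($base, $string) = @_;
    return map { $base + ord($_) } split //, $string;
}

my @lists = lists_for_string(1500, '53');
print "lists called: @lists\n";
```

For the string "53" and a list base of 1500, the lists called are 1553 (for "5", codepoint 53) and 1551 (for "3", codepoint 51).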

You may have noticed that in the code I use glCallListsScalar, but the previous paragraph referred to glCallLists instead. glCallListsScalar is actually an SDL_perl extension (not part of core OpenGL) that provides an alternate calling convention for glCallLists in Perl. Internally, SDL_perl implements both Perl routines using the same underlying C function in OpenGL (glCallLists). SDL_perl provides two different calling conventions because Perl treats a string and an array of numbers as two different things, while C treats them as essentially the same.

If you want to render a string, and all of the characters in the string have codepoints <= 255 decimal (single-byte character sets, and the ASCII subset of most variable-width character sets), you can use glCallListsScalar, and it will do the right thing for you:

glCallListsScalar($string);

If you simply want to render several display lists with a single call, and you're not trying to render a string, use the standard version of glCallLists:

glCallLists(@lists);

If you need to render a string, but it contains characters above codepoint 255, you have to use a more complex workaround:

glCallLists(map ord($_) => split // => $string);

Because the FPS counter merely renders ASCII digits, the first option works fine.

draw_frame now ends with a call to draw_fps, like so:

sub draw_frame
{
    my $self = shift;

    $self->set_projection_3d;
    $self->set_eye_lights;
    $self->set_view_3d;
    $self->set_world_lights;
    $self->draw_view;
    $self->draw_fps;
}

For now, I decided to turn off benchmark mode by changing the config setting in init_conf to 0:

    benchmark => 0,

With the font handling in place, and draw_fps called each frame to display the frame rate in white in the lower-left corner, everything should be grand, as Figure 1 shows.

Figure 1. Drawing the frame rate

Oops. There's no frame rate display. Actually, it's there, just very faint. If you look very carefully (or turn your video card's gamma up very high), you can just make out the frame rate display near the top of the window, above the big white box on the right. There are (at least) two problems--the text is too dark and it's in the wrong place.

The first problem is reminiscent of the dark scene in the last article, after enabling lighting but no lights. Come to think of it, there's not much reason to have lighting enabled just to display stats, but the last object rendered by draw_view left it on. To make sure lighting is off, I added a set_lighting_2d routine, which draw_frame now calls just before calling draw_fps:

sub set_lighting_2d
{
    glDisable(GL_LIGHTING);
}

Figure 2. The unlit frame rate

Figure 2 is much better! With lighting turned off, the frame rate now renders in bright white as intended. The next problem is the incorrect position. Moving and rotating the viewpoint shows that while the digits always face the screen, their apparent position moves around (Figure 3).

Figure 3. A moving frame rate

It turns out that the current modelview and projection matrices transform the raster position set by glRasterPos, just like the coordinates from a glVertex call. That means the digits are positioned using whatever state the 3D rendering left the modelview and projection matrices in.

To get unaltered window coordinates, I need to use an orthographic projection (no foreshortening or other non-linear effects) matching the window dimensions. I also need to set an identity modelview matrix (so that the modelview matrix won't transform the coordinates at all). All of this happens in set_projection_2d, called just before set_lighting_2d in draw_frame:

sub set_projection_2d
{
    my $self = shift;

    my $w    = $self->{conf}{width};
    my $h    = $self->{conf}{height};

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity;
    gluOrtho2D(0, $w, 0, $h);

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity;
}

This routine first gathers the window width and height from the configuration hash. It then switches to the projection matrix (GL_PROJECTION) and restores the identity state before calling gluOrtho2D to create an orthographic projection matching the window dimensions. Finally, it switches back to the modelview matrix (GL_MODELVIEW) and restores its identity state as well. The frame rate now renders at the intended spot near the lower-left corner (Figure 4).
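To see why this projection leaves window coordinates unaltered, the mapping can be followed by hand. The sketch below assumes the viewport covers the whole window starting at pixel 0 and uses a made-up 800x600 window size; ortho_to_ndc and ndc_to_window are hypothetical helpers that reproduce the arithmetic, not OpenGL calls.

```perl
use strict;
use warnings;

# Map a coordinate through an orthographic projection spanning
# [$min, $max] into normalized device coordinates (-1 .. 1).
sub ortho_to_ndc {
    my ($value, $min, $max) = @_;
    return 2 * ($value - $min) / ($max - $min) - 1;
}

# Map normalized device coordinates back to window pixels, assuming
# the viewport starts at 0 and spans $size pixels.
sub ndc_to_window {
    my ($ndc, $size) = @_;
    return ($ndc + 1) * $size / 2;
}

my ($w, $h) = (800, 600);    # hypothetical window size
my ($x, $y) = (10, 10);      # raster position from draw_fps

my $win_x = ndc_to_window(ortho_to_ndc($x, 0, $w), $w);
my $win_y = ndc_to_window(ortho_to_ndc($y, 0, $h), $h);
printf "raster position lands at (%.1f, %.1f)\n", $win_x, $win_y;
```

Because the projection spans exactly 0 to the window width and height, the two mappings cancel out and the raster position comes back as the pixel coordinates that went in.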

Figure 4. The frame rate in the correct position

There is another more subtle rendering problem, however, which you can see by moving the viewpoint forward a bit (Figure 5).

Figure 5. Frame rate depth problems

Notice how the "5" is partially cut off. The problem is that OpenGL compares the depth of the pixels in the thin yellow box to the depth of the pixels in the frame rate display, and finds that some of the pixels in the 5 are farther away than the pixels in the box. In effect, part of the 5 draws inside the box. In fact, moving the viewpoint slightly to the left from this point will make the frame rate disappear altogether, hidden by the near face of the yellow box.

That's not very good behavior from a statistics display that should appear to hover in front of the scene. The solution is to turn off OpenGL's depth testing, using a new line at the end of set_projection_2d:

glDisable(GL_DEPTH_TEST);

With this change, you can move the view anywhere without fear that the frame rate will be cut off or disappear entirely (Figure 6).

Figure 6. Position-independent frame rate

Too Fast

There's yet another problem; this time, one that will require a change to the frame rate calculations. The frame rate shown in the above screenshots is either 333 or 500, but nothing else. On this system, the frames take between two and three milliseconds to render, but because SDL can only provide one-millisecond resolution, the time delta for a single frame will appear to be exactly either .002 second or .003 second. 1/.002=500, and 1/.003=333, so the display is a blur, flashing back and forth between the two possible values.

To get a more representative (and easier-to-read) value, the code must average frame rate over a number of frames. Doing this will allow the total measured time to be long enough to drown out the resolution deficiency of SDL's clock.

The first thing I needed was a routine to initialize the frame rate data to carry over multiple frames:

sub init_fps
{
    my $self = shift;

    $self->{stats}{fps}{cur_fps}    = 0;
    $self->{stats}{fps}{last_frame} = 0;
    $self->{stats}{fps}{last_time}  = $self->{world}{time};
}

The new stats structure in the engine object will hold any statistics that the engine gathers about itself. To calculate FPS, the engine needs to remember the last frame for which it took a timestamp, as well as the timestamp for that frame. Because the engine calculates the frame rate only every few frames, it also saves the last calculated FPS value so that it can render it as needed. The init_fps call, as usual, goes at the end of init:

$self->init_fps;

The new update_fps routine now calculates the frame rate:

sub update_fps
{
    my $self      = shift;

    my $frame     = $self->{state}{frame};
    my $time      = $self->{world}{time};

    my $d_frames  = $frame - $self->{stats}{fps}{last_frame};
    my $d_time    = $time  - $self->{stats}{fps}{last_time};
    $d_time     ||= 0.001;

    if ($d_time >= .2) {
        $self->{stats}{fps}{last_frame} = $frame;
        $self->{stats}{fps}{last_time}  = $time;
        $self->{stats}{fps}{cur_fps}    = int($d_frames / $d_time);
    }
}

update_fps starts by gathering the current frame number and timestamp, and calculating the deltas from the saved values. Again, $d_time must default to 0.001 second to avoid possible divide-by-zero errors later on.

The if statement checks to see if enough time has gone by to result in a reasonably accurate frame rate calculation. If so, it sets the last frame number and timestamp to the current values and the current frame rate to $d_frames / $d_time.
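To watch the routine work outside the engine, it can be driven with a fake clock. The following is a sketch, not part of the engine: the $engine hash is a hypothetical stand-in holding only the fields update_fps touches, the 2.5 ms frame time is an assumption, and the return value is added purely so the demo can count updates.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the engine object, initialized the way
# init_fps would leave it with the clock at zero.
my $engine = {
    state => { frame => 0 },
    world => { time  => 0 },
    stats => { fps => { cur_fps => 0, last_frame => 0, last_time => 0 } },
};

# Same logic as update_fps above; the return value is a demo-only addition.
sub update_fps
{
    my $self      = shift;

    my $frame     = $self->{state}{frame};
    my $time      = $self->{world}{time};

    my $d_frames  = $frame - $self->{stats}{fps}{last_frame};
    my $d_time    = $time  - $self->{stats}{fps}{last_time};
    $d_time     ||= 0.001;

    if ($d_time >= .2) {
        $self->{stats}{fps}{last_frame} = $frame;
        $self->{stats}{fps}{last_time}  = $time;
        $self->{stats}{fps}{cur_fps}    = int($d_frames / $d_time);
        return 1;    # a new FPS value was stored
    }
    return 0;        # not enough time has passed yet
}

my $updates = 0;
for my $frame (1 .. 1000) {
    $engine->{state}{frame} = $frame;
    # Fake clock: 2.5 ms per frame, truncated to whole milliseconds.
    $engine->{world}{time}  = int($frame * 2.5) / 1000;
    $updates += update_fps($engine);
}

print "frames: 1000, FPS updates: $updates, ",
      "last FPS: $engine->{stats}{fps}{cur_fps}\n";
```

With these assumed timings the stored value stays close to 400 rather than jumping between 333 and 500, and it is recomputed only about once every 80 frames instead of on every frame.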

The update_fps call must occur early in the main_loop, but after the engine has determined the new frame number and timestamp. main_loop now looks like this:

sub main_loop
{
    my $self = shift;

    while (not $self->{state}{done}) {
        $self->{state}{frame}++;
        $self->update_time;
        $self->update_fps;
        $self->do_events;
        $self->update_view;
        $self->do_frame;
    }
}

The final change needed to enable the new, more accurate display is in draw_fps: the $d_time lookup goes away, and the $fps calculation turns into a simple retrieval of the current value from the stats structure:

my $fps  = $self->{stats}{fps}{cur_fps};

The more accurate calculation now makes it easy to see the difference between the frame rate for a simple view (Figure 7):

Figure 7. Frame rate for a simple view

and the frame rate for a more complex view (Figure 8).

Figure 8. Frame rate for a complex view

Is the New Display a Bottleneck?

The last thing to do is to check that the shiny new frame rate display is not itself a major bottleneck. The easiest way to do that is to turn benchmark mode back on in init_conf:

    benchmark => 1,

After doing that, I ran the engine under dprofpp again, and then analyzed the results, just as I had earlier:

$ dprofpp -Q -p step075

Done.
$ dprofpp -I -g main::main_loop
Total Elapsed Time = 3.943764 Seconds
  User+System Time = 1.063773 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 100.       -  1.064      1        - 1.0638  main::main_loop
 94.6   0.006  1.007    384   0.0000 0.0026  main::do_frame
 85.2   0.019  0.907    384   0.0000 0.0024  main::draw_frame
 50.7   0.205  0.540    384   0.0005 0.0014  main::draw_view
 16.8   0.073  0.179    384   0.0002 0.0005  main::draw_fps
 15.4   0.095  0.164    384   0.0002 0.0004  main::set_projection_2d
 11.6   0.045  0.124    384   0.0001 0.0003  main::draw_axes
 10.9   0.116  0.116   2688   0.0000 0.0000  SDL::OpenGL::CallList
 8.74   0.013  0.093    384   0.0000 0.0002  main::end_frame
 7.52   0.003  0.080    384   0.0000 0.0002  SDL::App::sync
 7.24   0.077  0.077    384   0.0002 0.0002  SDL::GLSwapBuffers
 4.89   0.052  0.052   3072   0.0000 0.0000  SDL::OpenGL::PopMatrix
 4.70   0.023  0.050    384   0.0001 0.0001  main::update_view
 3.67   0.039  0.039   3456   0.0000 0.0000  SDL::OpenGL::GL_LIGHTING
 3.48   0.037  0.037    384   0.0001 0.0001  SDL::OpenGL::Begin

As it currently stands, draw_view takes about half of main_loop's run time, and set_projection_2d and draw_fps together take about a third. Is that good or bad news?

draw_view is so quick now because I've just optimized it. Now that it's running so fast again, I can afford to add more features and perhaps make a more complex scene, either of which will make draw_view take a larger percentage of the time again. Also, set_projection_2d is necessary for any in-window statistics, debugging, or HUD (heads-up display) anyway, so the time spent there will not go to waste.

That leaves draw_fps, taking about one sixth of main_loop's run time. That's perhaps a bit larger than I'd like, but not large enough to warrant additional effort yet. I'll save my energy for the next set of features.
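The fractions quoted in this analysis can be checked against the CumulS column of the profile. The snippet below is just that arithmetic, with the numbers copied by hand from the dprofpp output; the share helper is a hypothetical name introduced for this illustration.

```perl
use strict;
use warnings;

# Cumulative seconds (CumulS column) copied from the dprofpp output above.
my %cumul = (
    main_loop         => 1.064,
    draw_view         => 0.540,
    draw_fps          => 0.179,
    set_projection_2d => 0.164,
);

# Percentage of main_loop's cumulative time used by the named routines.
sub share
{
    my @names = @_;
    my $total = 0;
    $total += $cumul{$_} for @names;
    return 100 * $total / $cumul{main_loop};
}

printf "%-30s %5.1f%%\n", 'draw_view', share('draw_view');
printf "%-30s %5.1f%%\n", 'draw_fps',  share('draw_fps');
printf "%-30s %5.1f%%\n", 'draw_fps + set_projection_2d',
       share('draw_fps', 'set_projection_2d');
```

The three figures come out at roughly one half, one sixth, and one third of main_loop's time, matching the estimates in the text.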

Conclusion

In this article, I covered several concepts relating to engine performance: adding a benchmark mode; profiling with dprofpp; using display lists to optimize slow, repetitive rendering tasks; and using display lists, bitmapped fonts, and averaging to produce a smooth frame rate display. I also added a stub for a triggered events subsystem, which I'll come back to in a future article.

With these performance improvements, the engine is ready for the next new feature, textured surfaces, which will be the main topic for the next article.

Until then, enjoy yourself and have fun hacking!
