February 2005 Archives

This Fortnight in Perl 6, Feb. 9-22, 2005

Welcome to yet another fortnight summary. Lately, p6l has outstripped p6i in volume. While this used to be the norm, it has become a rare occurrence. Strange. Anyway, this summary would be brought to you by cookies, but I ate them all. So instead this summary is brought to you by the remaining chocolate chips. In other news, Autrijus Tang has just officially received a promotion to first-name-only status in these summaries, based on both his stellar work with Pugs and his highly identifiable name. He now joins the ranks of Larry, Dan, Madonna, and Leo.

Perl 6 Language

do { } while?

David Storrs wanted to know the best way to say do { } while($foo);. Larry told him that s/do/loop/ would suffice.

nest as a Primitive Looping Operation

Timothy Nelson receives credit for resurrecting the oldest thread I have seen brought back recently. Over two years ago, he mentioned a powerful looping structure that allowed for recursion. Now he has found a use for it.

Loop Entry

Joe Gottman wanted to execute a closure on every loop entrance, not upon every iteration. He thought ENTER happened only once ever, but it turns out that it will do what he wants.

pop %hash

Rod Adams wants to be able to pop a key value pair out of a hash. Others wondered what to use it for. Someone mentioned an OrderedHash.

Higher Order Operators

Timothy Nelson wanted meta-operators. Larry gave him the full Unicode character set with which to define them. Tim was happy.

None and Nor Delimiters

Thomas Sandlaß suggested using \ as a none junction delimiter. He then extended this idea to provide a logical nor, \\. Autrijus suggested ! for none. There was some argument about whether nor deserved such Huffmanization. Also, I think that the difference between // and \\ would continually escape me. I have enough trouble writing code to deal with Windows filesystems.

Kudos to Autrijus

Damian proudly welcomed Autrijus to the ranks of the last-nameless-ones. He also lauded Autrijus' amazing work at forcing a lazy language to pull a lazier one. I think we all agree.

Containers vs. Object References

Rod Adams wondered whether there was a litmus test that could determine if something deserved its own sigil. The answer appears to be mostly history. Larry suggested a simplistic way to create new sigils, although it would not provide interpolation. I think a blessed method for defining new sigils which do interpolate and provide some sort of type constraint and context would be really nifty. Also I want a million dollars in small, non-consecutive, unmarked bills. If you have either, please mail it to me.

Printing True

Autrijus wondered about true and false. Are they just 1 and 0? #t and #f? Larry answered bool::true and bool::false, but true and false will suffice when there is no ambiguity.

Quoted => LHS

Juerd wondered if => auto-quotes its left hand side. It does.

"@" x 75 ~ $zap =?= ("@" x 75) ~ $zap

Juerd mistakenly thought that ~ bound more tightly than x. Only unary ~ binds that tightly, so he is safe.

Getting the key|value of a Pair

Steve Peters wondered how he could get the key or value from a pair. It turns out that the .key and .value methods do what he wants until some twisted soul overrides them.

Junctions and Auto-threading

By far the longest topic this week was junctions. Some people worry that their auto-threaded behavior will cause plagues to ravage the earth and novice programmers to go blind. Others feel that without it Perl 6 will be a language suitable only to pond scum and COBOL programmers. While one side believes that auto-threaded repetition of side effects will crash any database that interacts with Perl 6, the other side believes that requiring extra pragmas to unlock their full power will prevent junctions from curing cancer. Either way someone will be unhappy. It looked like the pendulum was swinging towards auto-threading, but its chief proponent will be away next week, so who knows if it can survive undefended. My favorite suggestion in all of this was to make Perl 6 a pure functional language and introduce monads.

Nullary vs. Non-slurpy

Autrijus found it distressing that he had to make an empty signature slurpy to make his quicksort sort quickly. Larry assured him that it was okay. This also led to a question of MMD tie-breaking. Remember to attach an is default if you want your ties broken!

Pairs

Steve Peters had a few more questions about pairs, which made Osfameron notice that he could use pairs like Lisp's dotted pairs. Larry admitted that he could and had hoped that no one would notice.

Octal Strings

Luke Palmer wondered which strings would change radix and to what. The consensus was that something needed to change, but no one is sure what.

Perl 6 Compilers

Pugs 6.0.8

Pugs continues to approach 2 * π with the release of 6.0.5. Later Autrijus decided that he should release more stuff and put out 6.0.8. It has many new features and a much improved test suite. You should contribute tests. All the cool people are.

List of builtins

Autrijus wondered if there was a list of builtins anywhere. Patrick pointed him to his start.

Perl 6 Now

Autrijus wondered if the unit tests for Perl 6 Now were available. Scott Walters told him that they were public domain. Autrijus gleefully sicced a small, ugly dog on them.

Pugs tests

??? proved that he was cool by providing Pugs tests. So did Steve Peters, Benjamin Smith, and Stevan Little. I bet you wish you were cool too.

Pugscode.org

Autrijus registered pugscode.org and populated it.

%*ENV in pugs

Rafael Garcia-Suarez wanted access to %*ENV, so he added it with tests.

Parrot

GNU R

Yves Breitmoser wants to improve the support for GNU R in Perl. Aaron Sherman suggested using Parrot as a back-end for R, which would allow any language that targets Parrot to use R.

make html

Markus Amsler submitted a patch to fix make html. Leo applied it.

Win32 bind(accept(listen()))

Markus Amsler submitted a patch adding bind, listen, and accept on Win32. Leo applied it.

Commit Privileges

Bernhard Schmalhofer received commit privileges in recognition of his many high quality patches. Congrats Bernhard!

Tiny Webserver

Markus Amsler also provided a patch to allow his tiny webserver to run on Linux. Leo applied it.

FreeBSD Build Failure

Adriano Ferreira fixed some problems with the FreeBSD build. Leo applied the patch.

Parrot 0.1.2

Will Coleda continued pushing for a 0.1.2 release. Leo told him that there would be a release after the German Perl Workshop, which should be ending soon.

Linux PPC

Leo attempted to fix the longstanding issues with Linux PPC configure.

Failing Python Tests

Andy Dougherty noticed failures on some of the dynclasses/py*.t tests. He is working with Leo on providing enough information to solve the problem.

PyPy

Christian Tismer posted to the list informing everyone about the PyPy sprint at Washington, DC's PyCon 2005. They are eager to compare notes with other developers.

string_init and ICU data dir

Ron Blaschke noticed that Parrot choked on an empty ICU data dir. Leo fixed it.

Win32 Parrot

Ron Blaschke wants to improve Parrot building on Win32. He has offered to add the necessary PARROT_API macros and do other footwork.

Tail Calls

Bob Rogers wants to make tail calls without knowing the last call's return values. Currently he just lies to IMCC to make it work. Leo pointed him to interpinfo .INTERPINFO_CURRENT_CONT and also suggested adding a tail_call op.

The Usual Footer

Posting via the Google Groups interface does not work. To post to any of these mailing lists please subscribe by sending email to perl6-internals-subscribe@perl.org, perl6-language-subscribe@perl.org, or perl6-compiler-subscribe@perl.org. If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send feedback to ubermatt@gmail.com.

Perl and Mandrakelinux

Perl programmers have a special reason for choosing Mandrakelinux as their desktop operating system. Mandrakelinux uses Perl for dozens of the graphical "value added" utilities included with the distribution, including much of the Mandrakelinux Control Center. I asked Mandrakelinux for an interview with a top Perl contributor and they sent Rafael Garcia-Suarez my way. Besides being heavily involved with Perl at Mandrakesoft, Rafael is also the pumpking for the Perl 5.10 release. Rafael answered my questions about using Perl for GUI programming and how he balances his day job with being pumpking.

O'Reilly Network: Briefly tell us about the Perl work you do for Mandrakelinux.

Rafael: My main responsibility is to maintain and enhance the command-line tool urpmi (and its GUI counterpart rpmdrake), which are the Mandrakelinux equivalent of Debian's apt or Fedora's yum; that is, fetching RPMs and their dependencies and installing or upgrading them.

This job extends to whatever pertains to installing RPMs; that means that I also participate in enhancing Mandrakelinux's installer. All those tools are written in Perl.

Besides this, I also maintain the RPM of perl itself and of a load of CPAN modules for Mandrakelinux.

ORN: Perl is an uncommon choice for graphical programming, yet Mandrakelinux has used Perl for over 50 graphical applications. Many of these tools are specific to Mandrakelinux, adding value to the distribution. What can you tell us about why Mandrakelinux uses Perl for this important role?

Rafael: Not all tools were always written in Perl. However, consistently using the same language allows us to share and reuse libraries across all tools, be it the perl/rpmlib bindings or custom graphical toolboxes. Thus, for example, the OS installer shares code with urpmi and rpmdrake. A scripting language was preferred because of rapidity of development and ease of debugging -- attempts at writing rpmdrake in C were painful, although that was before I was hired by Mandrakesoft. Perl was a natural choice since there were already very good in-house skills for it.

Editor's note: Recently, the Linspire distribution exemplified the use of dynamic languages to bring a graphical application to market quickly. Their Lsongs and Lphoto programs use Python.

ORN: Could you give a specific example of where Perl has made a noticeable difference in shortening development time?

Rafael: I think that using a scripting language in general shortens the development time, notably due to the shorter write code / compile / test / debug cycle. However, perl is particularly useful due to the high number of development modules available on CPAN. For example, running the OS installer under Devel::Trace produces lots of logs, but is tremendously helpful for tracing obscure bugs. You can't do this in C without adding printfs everywhere and recompiling the whole thing.

ORN: What tools does Mandrakelinux use for automated testing of graphical Perl applications?

Rafael: Er, interns?

More seriously, there is no automated testing for GUIs. Automated testing of such applications raises several difficult problems, since they often modify a system's configuration or necessitate some specific hardware (and I'm not even speaking of the OS installer GUI).

Writing more unit tests is definitely something I want to do in the future, however; it would be very useful to have complete sets of regression tests for the urpmi dependency solver, for example.

ORN: What has been the response of the Perl community to Mandrakelinux Perl-based tools, especially in terms of contributing patches back to your organization?

Rafael: The people who send patches for the tools are mostly interested in improving the distribution they use; even if they might belong to the Perl community as well, their point of view is the one of a Mandrakelinux user. That's one of the reasons why the tools have little visibility outside the MDK community.

Another reason is that there never was a strong motivation in Mandrakesoft for splitting the libraries into what is and is not MDK-specific, and for writing clear and comprehensive documentation: both need effort, don't pay off immediately, and are likely to be postponed when you have deadlines.

However, the CVS repository in which the tools' source code is kept is openly accessible; some contributors (i.e. non Mandrakesoft employees) have been granted commit access to it. We now need to make the learning curve softer.

ORN: What advice do you have for Perl programmers interested in contributing to the Perl-based Mandrakelinux utilities? Any helpful hints for getting started?

Rafael: As with any open-source project, if you want to learn how it's done and to contribute, use the latest version available. In the case of Mandrakelinux, that would be "cooker", the development distribution. Subscribe to the mailing list, become familiar with the tools, have a checkout from the CVS repository, get yourself a Bugzilla account and don't be afraid to ask questions. Learning to build RPMs, at least to be able to rebuild the RPMs of the tools, would be helpful too. Those questions are covered in the wiki; a good page to begin with is:

http://qa.mandrakesoft.com/twiki/bin/view/Main/HowTo

ORN: What challenges have you faced maintaining Perl as a core part of the operating system, with so many key utilities depending on it?

Rafael: There are two kinds of challenges: spatial and temporal, I would say.

First, spatially, you have to devise how to split the standard Perl distribution into smaller packages ("perl-base" for the essentials, "perl" for the rest of the modules, "perl-devel" for Perl development tools and "perl-doc" for, well, perldoc itself and the standard documentation.) This split is not arbitrary. When you maintain a core tool like urpmi, which is essential to system administration, you don't want it to require too many Perl modules, or even too many core modules. (The same goes for the installer, which must not take up all the space on the installation CDs.) So perl-base contains the modules used by urpmi, and urpmi doesn't use modules that are in perl but not in perl-base.

ORN: Perl has a reputation for being "slow" when used for graphical programming. How is that addressed in Mandrakelinux applications?

Rafael: I think that Perl doesn't deserve this reputation, only some Perl programs do! The MDK tools use the perl-Gtk2 bindings (mostly for historical reasons; the Qt bindings weren't mature enough when the development of those tools started), and since they're pretty close to the C lib, performance is very acceptable.

Did you know that the game Frozen Bubble, written by a former Mandrakesoft employee, is implemented in Perl? It's not anywhere near slow. Actually, people are often surprised to learn that Frozen Bubble or the MDK tools are written in Perl, since they don't give the impression of slowness generally associated with scripted GUIs.

In fact, it appears that the speed bottlenecks of the MDK tools are, as in other programs, in data processing, not display.

ORN: You've recently become the pumpking for Perl 5.10. How does this interact with your day job, and how do you balance the two positions?

Rafael: I was deeply involved in the development of Perl 5 before that, taking time to review and apply patches and so on. So I was mostly working on it in the evenings and on weekends. I can now work on Perl 5 during my day job, since Mandrakesoft allows its developers to work part time on free projects they like. However, I'd point out that my day job is a bit special, since, contrary to proprietary projects, I'm always in contact with its community of users, via mail, IRC or other internet-based media (and even sometimes in real life.) Thus I'm sometimes led to work during my free time as well... In other words, the frontier between the day job and other open source development is blurred. In both cases things have to be done. The only difference is that for the day job, I have deadlines, and they pay me for it.

ORN: Do you have a favorite Perl module that more people should know about?

Rafael: I don't really know, I learn new modules mostly by hanging around in places where the cool kids discover them before me -- mailing lists, mongers, use.perl. I use B::Concise all the time, but I suspect it's not useful for people who are not familiar with the internals of perl. Also, recently, I found encoding::warnings useful for debugging Unicode-related bugs.

Building a 3D Engine in Perl, Part 3


This article is the third in a series aimed at building a full 3D engine in Perl. The first article started with basic program structure and worked up to displaying a simple depth-buffered scene in an OpenGL window. The second article followed with a discussion of time, view animation, SDL events, keyboard handling, and a nice chunk of refactoring.

Editor's note: see also the next article in the series, profiling your application.

Later in this article, I'll discuss movement of the view position, continue the refactoring work by cleaning up draw_view, and begin to improve the look of our scene using OpenGL lighting and materials. Before I cover that, your feedback to the previous articles has included a couple of common requests: screenshots and help with porting issues. If you're having problems running SDL_Perl and the sample code from these articles on your system, or might be able to help the Mac OS X and Win32 readers, take a look at the next section. Otherwise, skip down to the Screenshots section, where the main article begins.

Known Porting Issues

General

Some versions of SDL_Perl require that the program load SDL::Constants to recognize SDL_QUIT and other constants. As this change should be transparent to other users, I have merged that into the latest version of the sample code, retroactive to the first use of an SDL constant.
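If you maintain your own code based on an older SDL_Perl, the workaround amounts to one extra line next to the other SDL modules. This is a minimal sketch; whether you need it depends entirely on your SDL_Perl version:

use SDL;
use SDL::App;
use SDL::Constants;   # some SDL_Perl versions need this for SDL_QUIT and friends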

FreeBSD

See the suggestions at the beginning of the second article.

Mac OS X

I spent some time researching the porting issues on Mac OS X but am as yet unable to figure out a simple procedure for building SDL_Perl from scratch. Recent emails on the sdl-devel mailing list seem to indicate that Mac OS X builds for recent SDL_Perl sources are problematic right now, but older releases seem to be even worse. There have been some packaging attempts in the past, but none that I have found so far install a fully configured set of SDL_Perl libraries into the system perl. I'm no Mac porting expert, so I appreciate any help on this; please post a comment in this month's article discussion if you have a suggestion or solution.

Slackware

According to comments by Federico (ironfede) in last month's article discussion, Slackware ships with a version of SDL_Perl that requires SDL::Constants. This is not an issue for the current version of the sample code, which I fixed as mentioned above in the General issues paragraph.

Win32

Win32 porting went much the same as Mac OS X porting did. I was quite excited when chromatic pointed me to some old Win32 PPM packages, but sadly they don't include a working version of SDL::OpenGL. Building manually was "interesting" at best, as I have no access to a Microsoft compiler and precious little experience using gcc under Win32. As with the Mac folks, I appreciate any help from the readers. Please post a comment in this month's article discussion if you have a suggestion or solution for your fellows.

Screenshots

Thankfully, screenshots are much easier to handle than porting issues. I'd like the user to be able to take a screenshot whenever desired. The obvious way to accomplish that is to bind the screenshot action to a key; I chose function key F4 at random. First I added it to the bind hash:

        bind   => {
            escape => 'quit',
            f4     => 'screenshot',
            left   => '+yaw_left',
            right  => '+yaw_right',
            tab    => '+look_behind',
        }

The new key must have an action routine, so I altered that lookup hash as well:

    $self->{lookup}{command_action} = {
          quit         => \&action_quit,
          screenshot   => \&action_screenshot,
        '+yaw_left'    => \&action_move,
        '+yaw_right'   => \&action_move,
        '+look_behind' => \&action_move,
    };

I need to wait until drawing completes for the entire scene before I can take a snapshot, but event processing happens before drawing begins. To work around this, I set a state variable marking that the user has requested a screenshot, rather than perform the screenshot immediately:

sub action_screenshot
{
    my $self = shift;

    $self->{state}{need_screenshot} = 1;
}

The code checks this state variable in a new line at the end of end_frame, after the drawing has completed and it has synced the screen with the image written into OpenGL's color buffer:

sub end_frame
{
    my $self = shift;

    $self->{resource}{sdl_app}->sync;
    $self->screenshot if $self->{state}{need_screenshot};
}

The screenshot routine is surprisingly short but dense:

sub screenshot
{
    my $self = shift;

    my $file = 'screenshot.bmp';
    my $w    = $self->{conf}{width};
    my $h    = $self->{conf}{height};

    glReadBuffer(GL_FRONT);
    my $data = glReadPixels(0, 0, $w, $h, GL_BGR,
                            GL_UNSIGNED_BYTE);
    SDL::OpenGL::SaveBMP($file, $w, $h, 24, $data);

    $self->{state}{need_screenshot} = 0;
}

The routine starts by specifying a filename for the screenshot and gathering the width and height of the screen. The real work begins with the call to glReadBuffer. Depending on the OpenGL driver, the hardware, and a number of advanced settings, OpenGL may have provided several color buffers in which to draw and read images. In fact, the default behavior on most systems is to draw onto one buffer, known as the back buffer, and display a separate buffer, known as the front buffer. After completing the drawing for each frame, the SDL::App::sync call moves the image from the back buffer to the front buffer so the user can see it. Behind the scenes, OpenGL generally handles this in one of two different ways, depending on the underlying implementation. Software OpenGL implementations, such as Mesa, copy the data from the back buffer to the front buffer. Hardware-accelerated systems can swap internal pointers so that the back buffer becomes the front buffer and vice versa. As you can imagine, this is much faster.

This extra work brings a great benefit. Without double buffering, as soon as one frame completes, the next frame immediately clears the screen to black and starts drawing again from scratch. Depending on the relative speed difference between the user's monitor and the application, this would probably appear to the user as a flickering, dark, perpetually half-drawn scene. With double buffering, this problem is almost gone. The front buffer shows a solid stable image while all of the drawing is done on the back buffer. Once the drawing completes, it takes at most a few milliseconds to sync up and start displaying the new frame. To the human eye, the animation appears solid, bright, and (hopefully) smooth.

In this case, I want to make sure that I take a screenshot of exactly the same image the user sees, so I tell OpenGL that I want to read the image in the front buffer (GL_FRONT).

At this point, it's safe to read the image data into a Perl buffer in the proper format. The first four arguments to glReadPixels specify the lower-left corner and size of the sub-image to read. The next two arguments together tell OpenGL what format I would like for the data. I specify that I want to read the entire window and that I want the data in the correct format for a BMP file--one unsigned byte for each of the red, green, and blue color channels for each pixel, but in reverse order.

Once I have the data from OpenGL I use the SDL_Perl utility routine SaveBMP to save the image into a file. The arguments are the filename, image width, image height, color depth (24 bits per pixel), and data buffer. Finally, the routine resets the need_screenshot state flag and returns.

At this point you should be able to take a screenshot each time you press the F4 key. Of course, I'd like to show several screenshots during this article as the code progresses. The current code overwrites the previous screenshot file every time I request a new one. Because I number each runnable version of the code, I used a quick workaround resulting in a different screenshot filename for each code step. I first load one of the core Perl modules to strip directories from a path:

use File::Basename;

Then I use the filename of the script itself as part of my screenshot filename:

    my $file = basename($0) . '.bmp';

This may be all you need for your application, or you may want to add some code to number each file uniquely. This code is enough to fix my problem, so I've left the more powerful version as an exercise for the reader.
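If you do want unique filenames, one possible approach is to append an incrementing counter and skip any names already on disk. This is only a sketch, not part of the article's code; next_screenshot_name is a hypothetical helper, and it assumes File::Basename is loaded as above:

sub next_screenshot_name
{
    my $base = basename($0);
    my $num  = 0;

    # Find the first unused numbered filename, e.g. step042-003.bmp
    $num++ while -e sprintf('%s-%03d.bmp', $base, $num);

    return sprintf('%s-%03d.bmp', $base, $num);
}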

Here then is the first screenshot:

The observant reader will notice that this image is not a BMP file; it's a PNG image, which is both much smaller than a BMP and more friendly to web standards. There are many tools available that can perform this conversion. Any good image editor can do it. In this case that's overkill--I instead used the convert program from the ImageMagick suite of utilities:

convert step042.bmp step042.png

Moving the Viewpoint

That view is more than a tad overplayed. The user can't even move the viewpoint to see the back or sides of the scene. It's time to change that. I started by defining some new key bindings:

        bind   => {
            escape => 'quit',
            f4     => 'screenshot',
            a      => '+move_left',
            d      => '+move_right',
            w      => '+move_forward',
            s      => '+move_back',
            left   => '+yaw_left',
            right  => '+yaw_right',
            tab    => '+look_behind',
        }

I then updated the command_action lookup hash to handle these as movement keys:

    $self->{lookup}{command_action} = {
          quit          => \&action_quit,
          screenshot    => \&action_screenshot,
        '+move_left'    => \&action_move,
        '+move_right'   => \&action_move,
        '+move_forward' => \&action_move,
        '+move_back'    => \&action_move,
        '+yaw_left'     => \&action_move,
        '+yaw_right'    => \&action_move,
        '+look_behind'  => \&action_move,
    };

init_view needs to initialize two more velocity components and matching deltas:

    $self->{world}{view} = {
        position    => [6, 2, 10],
        orientation => [0, 0, 1, 0],
        d_yaw       => 0,
        v_yaw       => 0,
        v_forward   => 0,
        v_right     => 0,
        dv_yaw      => 0,
        dv_forward  => 0,
        dv_right    => 0,
    };

action_move needs a new movement speed to match the existing yaw speed and some additions to %move_update:

    my $speed_move       = 5;
    my %move_update      = (
        '+yaw_left'     => [dv_yaw     =>  $speed_yaw ],
        '+yaw_right'    => [dv_yaw     => -$speed_yaw ],
        '+move_right'   => [dv_right   =>  $speed_move],
        '+move_left'    => [dv_right   => -$speed_move],
        '+move_forward' => [dv_forward =>  $speed_move],
        '+move_back'    => [dv_forward => -$speed_move],
        '+look_behind'  => [d_yaw      =>  180        ],
    );

So far, the changes are mostly hash updates instead of procedural code; that's a good sign that the existing code design has some more life left. When conceptually simple changes require significant code modification, especially special cases or repetitive blocks of code, it's time to look for a refactoring opportunity. Thankfully, these changes are in initialization and configuration rather than special cases.

One routine that requires a good bit of new code is update_view. I added these lines to the end:

    $view->{v_right}        += $view->{dv_right};
    $view->{dv_right}        = 0;
    $view->{v_forward}      += $view->{dv_forward};
    $view->{dv_forward}      = 0;

    my $vx                   =  $view->{v_right};
    my $vz                   = -$view->{v_forward};
    $view->{position}[0]    += $vx * $d_time;
    $view->{position}[2]    += $vz * $d_time;

That routine is beginning to look a bit repetitious and has several copies of very similar lines of code, so it goes on the list of places to refactor in the future. There are not yet enough cases to make the best solution obvious, so I'll hold off for a bit.

The new code starts by applying the new velocity deltas in the same way that it updates v_yaw earlier in the routine. It converts the right and forward velocities to velocities along the world axes by noting that the view starts out with "forward" parallel to the negative Z axis and "right" parallel to the positive X axis. It then multiplies the X and Z velocities by the time delta to arrive at a position change, which it adds into the current view position.

This version of the code works fine as long as the user doesn't rotate the view. When the view rotates, "forward" and "right" don't match the new view directions. They still point down the -Z and +X axes respectively, which can prove very disorienting for high rotations. The solution is a bit of trigonometry. The idea is to treat the initial X and Z velocities as components of the total velocity vector, and rotate that vector through the same angle that the user rotated the view:

    my $vx                   =  $view->{v_right};
    my $vz                   = -$view->{v_forward};
    my $angle                = $view->{orientation}[0];
    ($vx, $vz)               = rotate_xz($angle, $vx, $vz);
    $view->{position}[0]    += $vx * $d_time;
    $view->{position}[2]    += $vz * $d_time;

The two middle lines are the new ones. They call rotate_xz to do the vector rotation work and then set $vx and $vz to the returned components of the rotated velocity vector. rotate_xz is:

sub rotate_xz
{
    my ($angle, $x, $z) = @_;

    my $radians = $angle * PI / 180;
    my $cos     = cos($radians);
    my $sin     = sin($radians);
    my $rot_x   =  $cos * $x + $sin * $z;
    my $rot_z   = -$sin * $x + $cos * $z;

    return ($rot_x, $rot_z);
}

After converting the angle from degrees to radians, the code calculates and saves the sine and cosine of the angle. It then calculates the rotated velocity components given the original unrotated components. Finally, it returns the rotated components to the caller.

I'll skip the derivation here (you're welcome), but if you're curious about how and why this calculation performs a rotation, there are numerous books that explain the wonders of vector mathematics in amazing detail. O'Reilly's Physics for Game Developers, by David M. Bourg, includes a high-level discussion of rotation. Charles River Media's Mathematics for 3D Game Programming & Computer Graphics, by Eric Lengyel, includes a deeper discussion though I, for one, have college math flashbacks every time I read it. Speaking of which, any college textbook on linear algebra should include as much detail as you desire.

This code requires a definition for PI, provided by the following line near the top of the program, right after requesting warnings from Perl:

use constant PI => 4 * atan2(1, 1);

The constant module evaluates possibly complex calculations during the compile phase and then converts them into constants at runtime. The above calculation takes advantage of a standard trig identity to derive a value for PI accurate to as many digits as the system can deliver.

update_view now does the right thing, no matter what angle the view is facing. It doesn't take long to find a more interesting view:

Let There Be Lighting!

Okay, so maybe that's not much more interesting, admittedly. This scene needs a little mood lighting instead of the flat colors I've used so far (especially because they make it hard to see the shape of each object clearly). As a first step, I turned on OpenGL's lighting system with a new line at the end of prep_frame:

    glEnable(GL_LIGHTING);

Far from lighting the scene, the view is now almost black. If you look very carefully and your monitor and room lighting are forgiving, you should be able to just make out the objects, which are very dark gray on the black background. In order to see anything, I must enable both GL_LIGHTING and one or more lights to provide light to the scene. Without a light, the objects are dark gray instead of true black because OpenGL, by default, applies a very small amount of light to the entire scene, known as ambient light. To show the objects more brightly, I turned on the first OpenGL light with another new line at the end of prep_frame:

    glEnable(GL_LIGHT0);

Now the objects are brighter, but they're still just gray. When calculating colors with lighting enabled, OpenGL uses a completely different set of parameters from the colors used when lighting is disabled. Together these new parameters make up a material. Complex interactions between the parameters that make up a material can result in very interesting color effects, but in this case, I'm not trying to create a complex effect. I want my objects to have their old colors back without worrying about the full complexity that materials provide. Thankfully, OpenGL provides a way to state that the current material should default to the current color. To do this, I add yet another line to the end of prep_frame:

    glEnable(GL_COLOR_MATERIAL);

At this point, the objects once again have color, but each of the faces is still the same shade rather than appearing to be lit by a single light source somewhere. The problem is that OpenGL does not know whether each face points toward or away from the light and, if so, by how much. The angle between the face and the light determines how much light falls on the surface and, therefore, how bright it should appear. It is possible to calculate the angle of each face in my scene from the location of its vertices, but this is not always the right thing to do (especially when dealing with curved surfaces), so OpenGL does not calculate this internally. Instead, the program needs to do the direction calculations and tell OpenGL the result, known as the normal vector.

Luckily, in draw_cube the faces align with the coordinate axes so that each face points down one of them (positive or negative X, Y, or Z). I don't have to do any calculation here, just tell OpenGL which normal vector to associate with each face:

sub draw_cube
{
    # A simple cube
    my @indices = qw( 4 5 6 7   1 2 6 5   0 1 5 4
                      0 3 2 1   0 4 7 3   2 3 7 6 );
    my @vertices = ([-1, -1, -1], [ 1, -1, -1],
                    [ 1,  1, -1], [-1,  1, -1],
                    [-1, -1,  1], [ 1, -1,  1],
                    [ 1,  1,  1], [-1,  1,  1]);
    my @normals = ([0, 0,  1], [ 1, 0, 0], [0, -1, 0],
                   [0, 0, -1], [-1, 0, 0], [0,  1, 0]);

    glBegin(GL_QUADS);

    foreach my $face (0 .. 5) {
        my $normal = $normals[$face];
        glNormal(@$normal);

        foreach my $vertex (0 .. 3) {
            my $index  = $indices[4 * $face + $vertex];
            my $coords = $vertices[$index];
            glVertex(@$coords);
        }
    }
    glEnd;
}

The new lines are the definition of the @normals array and the two lines at the top of the $face loop that select the correct normal for each face and pass it to OpenGL using glNormal.

The boxes are now shaded reasonably and it's clear that the light is coming from somewhere behind the viewer; the front faces are brighter than the sides. Unfortunately, the axes are now dark again:

I did not specify any normal for the axis lines because the concept doesn't make a whole lot of sense for lines or points. However, with lighting enabled, OpenGL needs a set of normals for every lit object, so it goes back to the current state and uses the most recently defined normal. For the very first frame this is the default normal, which happens to point towards the default first light, but for succeeding frames it will be the last normal set in draw_cube. The latter definitely does not point toward the light, and the axes end up dark.

I'd rather the axis lines didn't take part in lighting calculations at all and kept their original bright colors, regardless of any lighting (or lack thereof) in the scene. To do this, I removed the line that enables GL_LIGHTING in prep_frame and inserted two new lines near the top of draw_view:

sub draw_view
{
    glDisable(GL_LIGHTING);

    draw_axes();

    glEnable(GL_LIGHTING);

Now lighting is off before drawing the axis lines and back on afterward. The axis lines have bright colors again, but rotating the view exposes a new problem. When the view rotates, the direction of the light changes as well:

Because of the way that OpenGL calculates light position and direction, any lights defined before the view is set are fixed to the viewer like the light on a miner's helmet. To fix a light relative to the simulated world, define the light instead after setting the view. I removed the line enabling GL_LIGHT0 in prep_frame and moved it to the new routine set_world_lights:

sub set_world_lights
{
    glEnable(GL_LIGHT0);
}

I then updated draw_frame to call the new routine after setting the view:

sub draw_frame
{
    my $self = shift;

    $self->set_projection_3d;
    $self->set_view_3d;
    $self->set_world_lights;
    $self->draw_view;
}

Unfortunately, this doesn't work. OpenGL only updates its internal state with the light's position and direction when they change explicitly, not when the light is enabled or disabled. I've never set the light's parameters explicitly, so the original default still stands. This issue is easy to fix with another line in set_world_lights:

sub set_world_lights
{
    glLight(GL_LIGHT0, GL_POSITION, 0.0, 0.0, 1.0, 0.0);

    glEnable(GL_LIGHT0);
}

In one of the few OpenGL interface decisions that actively annoys me, the new line sets the direction of the light, not its position. OpenGL defines all lights as one of two types: directional or positional. OpenGL assumes directional lights are very far away so that anywhere in the scene the direction from the light to each object is effectively the same. Positional lights are nearer and OpenGL must calculate the direction from the light to every vertex of every object in the scene independently. As you can imagine, this is much slower, but produces more interesting lighting effects.

The key to choosing between these two types is the last parameter of the glLight call above. If this parameter is 0, the light is directional and the other three coordinates specify the direction from which the light comes. In this case, I've specified that the light should come from the +Z direction. If the last parameter is 1, then OpenGL makes the light positional and uses the other three coordinates to set the light's position within the scene. For now, I'll skip the gory details of what happens when a value other than 0 or 1 is used, but in short, the light will be positional and extra calculations determine the actual position used. Most of the time it's best to ignore that case.
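For quick reference, the two forms differ only in that last parameter. Here is a side-by-side sketch; the light number and the positional coordinates are placeholders, not values the engine uses:

    # Directional: last parameter 0.0; (x, y, z) is the direction the
    # light shines from (here, from the +Z direction).
    glLight(GL_LIGHT0, GL_POSITION, 0.0, 0.0, 1.0, 0.0);

    # Positional: last parameter 1.0; (x, y, z) is the light's location
    # in the scene (here, a point five units up the Y axis).
    glLight(GL_LIGHT0, GL_POSITION, 0.0, 5.0, 0.0, 1.0);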

You may wonder why I explicitly specified 0.0 and 1.0 instead of 0 and 1. This is a workaround for a bug in glLight in some versions of SDL_Perl when it is presented with integer arguments instead of floating-point arguments.

With this line added, the light now stays fixed in the world, even when the user moves and rotates the view:

A Lantern

Of course, sometimes a light connected to the viewer is exactly the intention. For example, perhaps the desired effect is for the player to hold a lantern or flashlight to light dark places. Both of these are localized light sources that light nearby objects quite a bit, but distant objects only a little. The primary difference between them is that a flashlight and certain types of lanterns cast light primarily in one direction, often in a cone. Most lanterns, torches, and similar light sources cast light in all directions (barring shadows from handles, fuel tins, and the like).

Non-directed light is a little simpler to implement, so I'll start with lantern light. I wanted the light rooted at the viewer's position, so I defined the light before setting the view:

sub draw_frame
{
    my $self = shift;

    $self->set_projection_3d;
    $self->set_eye_lights;
    $self->set_view_3d;
    $self->set_world_lights;
    $self->draw_view;
}

I refer to viewer-fixed lights as eye lights because OpenGL refers to the coordinate system it uses for lights as eye coordinates, and a light defined this way as maintaining a particular position "relative to the eye." Here's set_eye_lights:

sub set_eye_lights
{
    glLight(GL_LIGHT1, GL_POSITION, 0.0, 0.0, 1.0, 0.0);

    glEnable(GL_LIGHT1);
}

Here I set the second light exactly the same way I set the first. Note that it doesn't matter that I actually define the second light in my program before the first. Each OpenGL light is independently numbered and always keeps the same number, rather than acting like a stack or queue numbered by order of use.

Sadly, the new code doesn't seem to have any effect at all. In reality, there really is a new light shining on the scene--unlike GL_LIGHT0, which defaults to shining bright white, all of the other lights default to black and provide no new light to the scene. The solution is to set another parameter of the light:

sub set_eye_lights
{
    glLight(GL_LIGHT1, GL_POSITION, 0.0, 0.0, 1.0, 0.0);
    glLight(GL_LIGHT1, GL_DIFFUSE,  1.0, 1.0, 1.0, 1.0);

    glEnable(GL_LIGHT1);
}

The front faces of each object should appear considerably brighter. Moving around the scene shows that the eye light brightens a surface only dimly lit by the world light:

If you watch carefully, however, you'll notice that the lighting varies by the view rotation--not position. I defined the light as directional with the light coming from behind the viewer, rather than positional, with the light coming from the viewer directly. I hinted at the fix earlier--changing the GL_POSITION parameter as follows:

    glLight(GL_LIGHT1, GL_POSITION, 0.0, 0.0, 0.0, 1.0);

The light now comes from (0, 0, 0) in eye coordinates, right at the viewpoint. Moving around and rotating shows that this version has the intended effect.

The simulated lantern still shines as brightly on far-away objects as it does on near ones. A real lantern's light falls off rapidly with distance from the lantern. OpenGL can do this with another setting:

sub set_eye_lights
{
    glLight(GL_LIGHT1, GL_POSITION, 0.0, 0.0, 0.0, 1.0);
    glLight(GL_LIGHT1, GL_DIFFUSE,  1.0, 1.0, 1.0, 1.0);
    glLight(GL_LIGHT1, GL_LINEAR_ATTENUATION, 0.5);

    glEnable(GL_LIGHT1);
}

This call tells OpenGL to include a dimming term in its equations proportional to the distance between the light and the object. Physics-minded readers will point out that physically accurate dimming is proportional to the square of the distance, and OpenGL does allow this using GL_QUADRATIC_ATTENUATION. However, a host of factors (including the lighting equations that OpenGL uses and the non-linear effects of the graphics hardware, monitor, and human eye) make this more accurate dimming look rather odd. Linear dimming turns out to look better in many cases, so that's what I used here. It is also possible to combine different dimming types, so that the dimming appears linear for nearby objects and quadratic for distant ones, which you may find a better tradeoff. The 0.5 setting tells OpenGL how strong the linear dimming effect should be for my scene.

Moving around the scene, you should be able to see the relatively subtle dimming effect in action. Don't be afraid to leave it subtle instead of turning the dimming effect way up. Some moods call for striking lighting effects, while others call for lighting effects that the viewer notices only subconsciously. In some visualization applications, lighting subtlety is a great virtue, allowing the human visual system's amazing processing power to come to grips with a complex scene without being overwhelmed.
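If you want to experiment with combining dimming types as mentioned above, the calls follow the same pattern as the linear one. This is only a sketch; the coefficients here are arbitrary and worth tuning by eye:

    glLight(GL_LIGHT1, GL_CONSTANT_ATTENUATION,  1.0);   # base term (the OpenGL default is 1.0)
    glLight(GL_LIGHT1, GL_LINEAR_ATTENUATION,    0.4);   # dominates at close range
    glLight(GL_LIGHT1, GL_QUADRATIC_ATTENUATION, 0.05);  # takes over at a distance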

A Flashlight

I really happen to like the way a flashlight casts its cone of light, so I converted the omnidirectional light of the lantern to a directed cone. OpenGL refers to this type of light as a spotlight and includes several light parameters to define them. The first change is a new setting in set_eye_lights:

    glLight(GL_LIGHT1, GL_SPOT_CUTOFF, 15.0);

This sets the angle between the center of the light beam and the edges of the cone. OpenGL accepts either 180 degrees (omnidirectional) or any value between 0 and 90 degrees (from a laser beam to a hemisphere of light). In this case, I chose a centerline-to-edge angle of 15 degrees, making a nice 30-degree-wide cone of light.

This change indeed limits the cone of light, but also reveals an ugly artifact. Move to a point just in front of the left front corner of the white cube and rotate the view to pan the light across the yellow box. You'll see the light jump nastily from corner to corner, even disappearing entirely in between. Even when a corner is lit, the shape of the light is not very conelike:

OpenGL's standard lighting model only performs the lighting calculations at each vertex, interpolating the results in between. For models that have many small faces and a resulting high density of vertices, this works relatively well. It breaks down nastily in scenes containing objects with large faces and few vertices, especially when a positional light is close to an object. Spotlights make the problem even more apparent, as they can easily shine between two vertices without lighting either of them; the polygon then appears uniformly dark.

Ode to Rush

Advanced OpenGL functionality paired with recent hardware can solve this problem with per-pixel lighting calculations. Older hardware can fake it with light maps and similar tricks. Rather than using advanced functionality, I'll use a simpler method for improving the lighting, known as subdivisions. (Those of you scratching your heads over the Rush reference can now breathe a collective sigh of relief.) Subdivisions have their own problems, as I'll show later, but those issues explain a lot about the design of graphics APIs, so they're worth a look.

As the name implies, the basic idea is to subdivide each face into many smaller faces, each with its own set of vertices. For curved objects such as spheres and cylinders, this is essential so that nearby objects appear to curve smoothly. For objects with large flat faces, such as boxes and pyramids, this merely has the side effect of forcing the per-vertex lighting calculations to be done many times across each face.

Before I can use subdivided faces, I need to prepare by refactoring draw_cube:

sub draw_cube
{
    # A simple cube
    my @indices = qw( 4 5 6 7   1 2 6 5   0 1 5 4
                      0 3 2 1   0 4 7 3   2 3 7 6 );
    my @vertices = ([-1, -1, -1], [ 1, -1, -1],
                    [ 1,  1, -1], [-1,  1, -1],
                    [-1, -1,  1], [ 1, -1,  1],
                    [ 1,  1,  1], [-1,  1,  1]);
    my @normals = ([0, 0,  1], [ 1, 0, 0], [0, -1, 0],
                   [0, 0, -1], [-1, 0, 0], [0,  1, 0]);

    foreach my $face (0 .. 5) {
        my $normal = $normals[$face];
        my @corners;

        foreach my $vertex (0 .. 3) {
            my $index  = $indices[4 * $face + $vertex];
            my $coords = $vertices[$index];
            push @corners, $coords;
        }
        draw_quad_face(normal    => $normal,
                       corners   => \@corners);
    }
}

Instead of performing the OpenGL calls directly in draw_cube, it now calls draw_quad_face. For each large face it creates a new @corners array filled with the vertex coordinates of the corners of that face. It then passes that array and the face normal to draw_quad_face, defined as follows:

sub draw_quad_face
{
    my %args    = @_;
    my $normal  = $args{normal};
    my $corners = $args{corners};

    glBegin(GL_QUADS);
    glNormal(@$normal);

    foreach my $coords (@$corners) {
        glVertex(@$coords);
    }
    glEnd;
}

This function performs exactly the OpenGL operations that draw_cube used to do. I've also used a different argument-passing style for this routine than I have previously. In this case, I pass named arguments because I know that I will add at least one more argument very soon and that there's a pretty good chance I'll want to add more later. When the arguments to a routine are likely to change over time, and especially when callers might want to specify only a few arguments and allow the rest to take on reasonable defaults, named arguments are usually a better choice. The arguments can either be a hashref or a list stuffed into a hash. This time, I chose the latter method.
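For comparison, the hashref style would look something like this. It is just a sketch of the alternative; the article's code keeps the flat list:

# Caller wraps the named arguments in an anonymous hash...
draw_quad_face({ normal  => $normal,
                 corners => \@corners });

# ...and the routine unpacks a single reference instead of flattening @_.
sub draw_quad_face
{
    my $args    = shift;
    my $normal  = $args->{normal};
    my $corners = $args->{corners};
    # body unchanged from the list-based version
}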

After refactoring comes testing, and a quick run showed that everything worked as expected. Safe in that knowledge, I rewrote draw_quad_face to subdivide each face:

sub draw_quad_face
{
    my %args    = @_;
    my $normal  = $args{normal};
    my $corners = $args{corners};
    my $div     = $args{divisions} || 10;
    my ($a, $b, $c, $d) = @$corners;

    # NOTE: ASSUMES FACE IS A PARALLELOGRAM

    my $s_ab = calc_vector_step($a, $b, $div);
    my $s_ad = calc_vector_step($a, $d, $div);

    glNormal(@$normal);
    for my $strip (0 .. $div - 1) {
        my @v = ($a->[0] + $strip * $s_ab->[0],
                 $a->[1] + $strip * $s_ab->[1],
                 $a->[2] + $strip * $s_ab->[2]);

        glBegin(GL_QUAD_STRIP);
        for my $quad (0 .. $div) {
            glVertex(@v);
            glVertex($v[0] + $s_ab->[0],
                     $v[1] + $s_ab->[1],
                     $v[2] + $s_ab->[2]);

            $v[0] += $s_ad->[0];
            $v[1] += $s_ad->[1];
            $v[2] += $s_ad->[2];
        }
        glEnd;
    }
}

The new routine starts by adding the new optional argument divisions, which defaults to 10. This specifies how many subdivisions the face should have both "down" and "across"; the actual number of sub-faces is the square of this number. For the default 10 divisions, that comes to 100 sub-faces for each large face, so each cube has 600 sub-faces.

The next line labels the corners in counterclockwise order. This puts corner A diagonally across from corner C, with B on one side and D on the other.

As the comment on the next line indicates, I've simplified the math considerably by assuming that the face is at least a parallelogram. With this simplification, I can calculate the steps for one division along sides AB and AD and use these steps to position every sub-face across the entire large face.

I can't just calculate the step as a simple distance to move, because I have no idea which direction each edge is pointing and wouldn't know which way to move for each step. Instead, I calculate the vector difference between the vertices at each end of the edge and divide that by the number of divisions. The code does the same calculation twice, so I've extracted it into a separate routine:

sub calc_vector_step
{
    my ($v1, $v2, $div) = @_;

    return [($v2->[0] - $v1->[0]) / $div,
            ($v2->[1] - $v1->[1]) / $div,
            ($v2->[2] - $v1->[2]) / $div];
}

Returning to draw_quad_face, it stores the vector steps in $s_ab (the step along the AB side) and $s_ad (the step along the AD side). Next it sets the current normal, which for a flat face remains the same across its entirety.

Finally, I can begin to define the sub-faces themselves. I've taken advantage of the OpenGL quad strip primitive to draw the sub-faces as a series of parallel strips extending from the AB edge to the CD edge. For each strip, I first need to calculate the location of its starting vertex. I know this is on the AB edge, so the code starts at A and adds an AB step for each completed strip. For the first strip, this puts the starting vertex at A. For the last strip, the starting vertex will be one step (one strip width) away from B. It initializes the current vertex @v with the starting vertex and will keep it updated as it moves along each strip.

It then begins a strip of quads with glBegin(GL_QUAD_STRIP). To define the strip, I've specified the locations of each pair of vertices across from each other along its length. For each pair, it uses the current vertex and a calculated vertex one step further along the AB direction. The code then moves the current vertex one step along the length of the strip (the AD direction). Once the strip is complete, it ends it with glEnd and loops again for the next strip.

All of this complexity makes quite a visual difference:

It's clear that the light has a definite shape to it, but the lighting is so jagged that it's distracting. One way to fix this is to increase the number of divisions, making smaller sub-faces. This requires a simple addition to the draw_quad_face call in draw_cube:

        draw_quad_face(normal    => $normal,
                       corners   => \@corners,
                       divisions => 30);

The result is quite a bit less jagged:

Unfortunately, the jaggies are smaller but still obviously there--and the closer the viewer is to an object the bigger they appear. There are also nine times as many sub-faces to draw (30/10 squared) and the program now runs considerably slower. If you're lucky enough to have a recent system with fast video hardware and don't notice the slowdown, use 100 or so for the number of divisions. You'll probably see it.

Softening the Edges

Clearly, increasing the number of subdivisions only goes so far to improve the rendering, while simultaneously costing dearly in performance. I'll try a different tack and go back to what I know about a flashlight. Most flashlights cast a beam that is brighter in the center than at the edge. (Some have a dark circle in the very center, but I'm ignoring that for now.) I can take advantage of this to create a more accurate image and also soften the large jaggies considerably. First, I backed out my change to the draw_quad_face call:

        draw_quad_face(normal    => $normal,
                       corners   => \@corners);

Then I changed one spotlight parameter for the flashlight in set_eye_lights and added another:

    glLight(GL_LIGHT1, GL_SPOT_CUTOFF,   30.0);
    glLight(GL_LIGHT1, GL_SPOT_EXPONENT, 80.0);

With the change to GL_SPOT_CUTOFF, I've widened the beam to twice its original angle. At the same time, I've told OpenGL to make it quite a bit dimmer at the edges using GL_SPOT_EXPONENT, hopefully hiding any jaggies. The new parameter has a somewhat confusing name that refers to the details of the equation that determines the strength of the off-center dimming effect. In a theme seen throughout the mathematics of computer graphics, the dimming is a function of the cosine of the angle between the center line and the vertex being lit. In fact, the dimming factor is the cosine raised to the exponent specified by GL_SPOT_EXPONENT. Why use the cosine of the angle? It turns out to be cheap to calculate--cheaper than calculating the angle itself--and also gives a nice smooth effect.
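To get a feel for what the exponent does, here is a small sketch of the per-vertex dimming factor. It is a simplification of OpenGL's full spotlight equation, not code the engine needs, and it relies on the PI constant defined earlier in the program:

sub spot_dimming_factor
{
    my ($angle_degrees, $exponent) = @_;

    my $cos = cos($angle_degrees * PI / 180);
    return 0 if $cos <= 0;     # more than 90 degrees off-center: no light

    # At exponent 80, a vertex 5 degrees off-center gets a factor of
    # roughly 0.74, while one 15 degrees off-center gets only about 0.06.
    return $cos ** $exponent;
}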

With luck, the new beam will appear about the same width to the eye as the old one:

Good enough. The image looks better without the massive performance strain of high subdivision levels.

Refactoring Drawing

There's still something not right, but it will take a few more objects in the scene to show it. draw_view is already a repetitive hardcoded mess and it's been on the "to be refactored" list for a while, so now seems a good time to clean it up before I add to the mess.

draw_view performs a series of transformations and state settings for each object drawn. I want to move to a more data-driven design, with each object in the simulated world represented by a data structure specifying the needed transformations and settings. Eventually, these structures may become full-fledged blessed objects, but I'll start simple for now.

I initialized the data structures in init_objects:

sub init_objects
{
    my $self = shift;

    my @objects = (
        {
            draw        => \&draw_axes,
        },
        {
            lit         => 1,
            color       => [ 1, 1,  1],
            position    => [12, 0, -4],
            scale       => [ 2, 2,  2],
            draw        => \&draw_cube,
        },
        {
            lit         => 1,
            color       => [ 1, 1, 0],
            position    => [ 4, 0, 0],
            orientation => [40, 0, 0, 1],
            scale       => [.2, 1, 2],
            draw        => \&draw_cube,
        },
    );

    $self->{world}{objects} = \@objects;
}

Each hash includes the arguments to the various transformations to apply to it, along with a reference to the routine that actually draws the object and a flag indicating whether the object should be subject to OpenGL lighting. The object array then becomes a new part of the world hash for easy access later.

I called this routine at the end of init as usual:

    $self->init_objects;

I also replaced draw_view with a version that interprets the data into a series of OpenGL calls:

sub draw_view
{
    my $self    = shift;

    my $objects = $self->{world}{objects};

    foreach my $o (@$objects) {
        $o->{lit} ? glEnable (GL_LIGHTING)
                  : glDisable(GL_LIGHTING);

        glColor(@{$o->{color}})        if $o->{color};

        glPushMatrix;

        glTranslate(@{$o->{position}}) if $o->{position};
        glRotate(@{$o->{orientation}}) if $o->{orientation};
        glScale(@{$o->{scale}})        if $o->{scale};

        $o->{draw}->();

        glPopMatrix;
    }
}

The new routine iterates over the world object array, performing each requested operation. It either skips or defaults any unspecified values. First up is the choice to enable or disable GL_LIGHTING, followed by setting the current color if requested. The code next checks for and applies the usual transformations and finally, calls the object draw routine.

For simplicity and robustness, I've unconditionally wrapped the transformations and draw routine in a matrix push/pop pair rather than trying to detect whether they need the push and pop. OpenGL implementations tend to be highly optimized with native code, and any detection I did would be Perl. Chances are good that such an "optimization" would instead slow things down. This way, my code stays cleaner and even a misbehaving draw routine that performed transformations internally without cleaning up afterwards will not affect the next object drawn.

A quick test showed that this refactored version still worked. Now I could add a few more objects to demonstrate the remaining lighting issue. I specified several more boxes programmatically by inserting a new loop before the end of init_objects:

    foreach my $num (1 .. 5) {
        my $scale =   $num * $num / 15;
        my $pos   = - $num * 2;
        push @objects, {
            lit         => 1,
            color       => [ 1, 1,  1],
            position    => [$pos, 2.5, 0],
            orientation => [30, 1, 0, 0],
            scale       => [1, 1, $scale],
            draw        => \&draw_cube,
        };
    }

    $self->{world}{objects} = \@objects;
}

For each box, just two parameters vary: position and Z scale. I chose the position to set each box next to the last, progressing along the -X axis. The scale is set so that the height and width of each box remains the same, but the depths vary from very shallow for the first box to fairly deep for the last.

The loop specifies five boxes in total and begins by calculating the X position and Z scaling (depth) for the current box. The next few lines simply create a new hash for the new box and push it onto the object array.

Finally, there was one last change--the bright world light overwhelms the problematic effect from the flashlight. This is an easy fix; I commented out the line that enables it:

sub set_world_lights
{
    glLight(GL_LIGHT0, GL_POSITION, 0.0, 0.0, 1.0, 0.0);

#     glEnable(GL_LIGHT0);
}

By panning to the left across the scene until the viewpoint is in front of the new boxes, the problem becomes obvious:

The brightness of the lighting varies immensely depending on the depth of the box! This rather unintuitive outcome is an unfortunate side effect of how OpenGL must handle normals. A normal specifies the direction of the surface associated with a vertex. If a rigid object rotates, its surfaces rotate, so all of its normals must rotate as well. OpenGL handles this by transforming normal coordinates as it would vertex coordinates. This runs into trouble with any transformations other than translation and rotation. OpenGL calculations assume that normals are normalized (have unit length). Scaling the normal breaks this assumption and results in the effect seen above.
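A rough illustration in plain Perl (ignoring the details of OpenGL's normal matrix) shows the core of the problem: once a scaling transformation touches a unit-length normal, its length is no longer 1:

    sub vector_length { sqrt($_[0]**2 + $_[1]**2 + $_[2]**2) }

    my @normal = (0, 0, 1);                     # unit normal pointing along +Z
    for my $z_scale (0.1, 1, 2) {
        my @scaled = @normal;
        $scaled[2] *= $z_scale;                 # apply the box's Z scaling to the normal
        printf "Z scale %.1f gives normal length %.2f\n", $z_scale, vector_length(@scaled);
    }

Lighting calculations that assume a length of exactly 1 then produce results that are too dim or too bright, in proportion to the scale.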

To fix this, I told OpenGL that normals may not have unit length and that OpenGL must normalize them before other calculations are performed. This is not the default behavior because of the performance cost of normalizing each vector. An application that can ensure normals are always unit length after transformation can keep the default and run a little faster. I want to allow arbitrary scaling of objects, so I enabled automatic normalization with another line at the end of prep_frame:

    glEnable(GL_NORMALIZE);

That fixed the problem:

With that bug killed, I reenabled the world light by uncommenting the glEnable line in set_world_lights:

sub set_world_lights
{
    glLight(GL_LIGHT0, GL_POSITION, 0.0, 0.0, 1.0, 0.0);

    glEnable(GL_LIGHT0);
}

Conclusion

During this article I've moved pretty quickly, covering screenshots, movement of the viewpoint, the beginnings of lighting in OpenGL, and subdivided faces for the boxes. Along the way, I took the chance to refactor draw_view into a more data-driven design and made the scene a little more interesting.

Unfortunately, these new changes have slowed things down quite a bit. OpenGL has several features that can improve performance considerably. Next time, I'll talk about one of the most powerful of these: display lists. I'll also introduce basic font handling and run with the performance theme by adding an FPS display to the engine.

Until next time, have fun and keep hacking!

This Week in Perl 6, Feb. 1 - 8, 2005

All~

Welcome to yet another summary in which I will undoubtedly confuse two homophones -- probably more than a few this week as I am a little tired. But perhaps the alien on my window or the vampire on my monitor will help straighten it all out.

Perl 6 Language

Auto-threading Generalization

Luke Palmer's thread about auto-threading seems to have wound down without much resolution, or at the very least without a syntax that I like.

Featherweight Perl 6

Autrijus Tang introduced Featherweight Perl 6, a side-effect-free subset of Perl 6. FP6 is the first step on a journey for Pugs to conquer the world.

Value Types vs. Implementation Types

Autrijus Tang became confused by types in his work on FP6. He asked some questions about types on perl6-language, which told him to go to perl6-compiler. perl6-compiler told him that he should really be on perl6-language. Sorry for the runaround. The proper place for questions about language semantics is perl6-language, where he originally posted. Eventually it all settled down and someone even answered his question.

Logic Programming in Perl 6

Ovid asked how logic programming in Perl 6 would look. No answer came, but I suppose I can pick the low hanging fruit: as a limiting case you could always back out the entire Perl 6 grammar and insert that of Prolog.

The Leaning Tower of Numbers

Autrijus Tang asked what sort of number tower Perl 6 had. He was planning (and implemented) that of Scheme until he received an answer otherwise.

Some Quick Questions

Autrijus Tang asked a few questions about operators in specific contexts. He received a few answers, although I don't think they covered everything.

Getopt::Signature

Juerd wants to be able to give his script a general signature that magically parses command-line arguments. Many people thought it was a cool idea that should go in a module rather than in the core. This led to talk about creating a signature object to pass to this module so that it doesn't need macros. Sounds good to me.

S03 Inconsistency

Juerd found a nit to pick. There were no comments about whether to pick the first or second option, though...

Regex Precedence

Nicholas Clark wondered about adding a little operator precedence to regexes so that people could avoid a few common problems. It sounds like a good idea to me.

Pipedreams

Juerd wondered if he could mix = and ==> in a sane way. The answer appears to be no. Once you bring in ==> you should stick with it.

4 < $x < 2 == true?!?

Autrijus Tang wondered how junctions and chained comparators work. He found that under certain semantics, the answers can be very troubling. Larry agreed with Autrijus about the need to avoid unintuitive semantics.

Perl 6 Compiler

Pugs 6.0.0

Autrijus Tang released Pugs 6.0.0 after six days of creation. It has many cool features, whether a testament to the power of Haskell or the caffeine intake of Autrijus. (Although to be fair he skipped more version numbers than Java did in its jump to 5.0.)

Parrot

Gdbmhash Warning

Bernhard Schmalhofer fixed a problem where gdbmhash could make Configure produce a warning. Leo applied it.

PyNCI

Sam and Leo worked out the correct way to create NCI methods for Python. Leo ended by proposing a table to assist Parrot based on the current language, but no answer came for that idea.

Makefile Cleanup

Bernhard Schmalhofer cleaned up some of the makefiles and configure system. Leo applied the patch.

ParrotIOLayer* Const or Not

François Perrad provided a patch making the ParrotIOLayer* const in the API. Leo applied it, but Melvin warned that while this may be safe now, the API intended to allow mutability. I think for the moment it is still in though.

Win32 Parrot

Ron Blaschke helped Parrot back onto its feet in the Windows world.

Latest results

Parrot_load_bytecode Failure?

Ian Joyce wondered what would happen if Parrot_load_bytecode failed. The answer: an exception.

Reading Past EOF

Matt Diephouse found it annoying that reading past EOF gave an unhelpful error message. Leo fixed it.

FreeBSD Build Problems

Will Coleda found a build problem for Parrot on FreeBSD. Adriano Ferreira provided a workaround, but the build system needs to be smarter, too. On a side note, I want a picture of the BSD daemon (pitchfork included) with a pirate hat, an eye patch, and a parrot on its shoulder.

Gmake Requirement?

There was some confusion about what does and does not require gmake on FreeBSD. IMCC does not. ICU does. Fortunately, Dan's string stuff will make ICU optional and Parrot won't require it.

Locate_runtime_file with Absolute Paths

Jeff Horwitz noticed that Parrot_locate_runtime_file could segfault when playing with an absolute path. He put some work in to fix it.

Expected vpm.imc Output Error

Bernhard Schmalhofer fixed a failing benchmark test that just had slightly wrong output expectations. chromatic (whose uncapitalized name does not bother me) applied the patch.

Subversive Parrot

Ron Blaschke asked if there were plans to move Parrot to SVN. Many argued in favor; few argued against. No word from the powers-that-be.

Parrot 0.1.2?

Will Coleda wondered if the time for Parrot 0.1.2 was growing close. Leo pointed out some things that need fixes. Honestly, the real impediment is that we do not yet have a cool code name. I suggest kiwi, because everyone knows they really want to be Parrots.

Making an Array Out of an Undef

Ron Blaschke was surprised when an operation quietly turned an Undef into an array. Leo explained to him that this set of semantics was known and expected.

Solaris Issues

Andy Dougherty noticed that Parrot was failing some tests on Solaris. He tried to provide enough info for people to help him, but no one did.

Linux PPC Issues

Jeff Dik posted a patch that worked around a problem on Linux PPC. chromatic pointed out that there was a more correct patch already in RT. Jeff Dik slapped himself on the forehead and reminded himself to check RT first.

The Usual Footer

Posting via the Google Groups interface does not work. To post to any of these mailing lists, please subscribe by sending email to perl6-internals-subscribe@perl.org, perl6-language-subscribe@perl.org, or perl6-compiler-subscribe@perl.org. If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send feedback to ubermatt@gmail.com.

Perl Code Kata: Testing Databases

Testing code that uses a database can be tricky. The most common solution is to set up a test database with test data and run your tests against this. This, of course, requires bookkeeping code to keep your test database in the proper state for all your tests to run without adversely affecting one another. This can range from dropping and recreating the test database for each test, to a more granular adding and deleting at the row level. Either way, you are introducing non-test code into your tests, which opens up possibilities for contamination. Ultimately, because you have control over the environment in which your tests run, you can manage this despite the occasional headache.

The real fun only starts when you decide that you should release your masterpiece unto the world at large. As any CPAN author will tell you, it is absolutely impossible to control the environment other people will run your code in once you release it. Testing database code in such a hostile environment can be frustrating for both the module developer and the module installer. A common approach is to allow the user to specify the specific database connection information as either environment variables or command-line arguments, skipping the tests unless those variables are present. Another approach is to use the lightweight and very portable SQLite as your test database (of course, testing first that the user has installed SQLite). While these solutions do work, they can often be precarious, and in the end will increase the number of possible installation problems you, as module author, could face.
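For example, the environment-variable approach often looks something like this sketch (the TEST_DSN, TEST_USER, and TEST_PASS names are just illustrative, not any particular convention):

use DBI;
use Test::More;

plan skip_all => 'set TEST_DSN, TEST_USER, and TEST_PASS to run these tests'
    unless $ENV{TEST_DSN};

my $dbh = DBI->connect( $ENV{TEST_DSN}, $ENV{TEST_USER}, $ENV{TEST_PASS} )
    or plan skip_all => "could not connect to the test database: $DBI::errstr";

plan tests => 1;
ok( $dbh->ping, '... connected to the test database' );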

What is a module author to do?

DBD::Mock Testing Kata

This code kata introduces an alternate approach to testing database code, that of using mock-objects, and specifically of using the DBD::Mock mock DBI driver. Before showing off any code, I want to explain the basic philosophy of Mock Objects as well as where DBD::Mock fits in.

What are Mock Objects?

When writing unit tests, it is best to try to isolate what you are testing as much as possible. You want to be sure not only that you are testing just the code in question, but also that a bug or issue in code outside what you are testing will not introduce false negatives in your tests. Unfortunately, this ideal of a completely decoupled design is just an ideal. In real-world practice, code has dependencies that you cannot remove for testing. This is where Mock Objects come in.

Mock Objects are exactly what they sound like; they are "mocked" or "fake" objects. Good polymorphic thought says that you should be able to swap out one object for another object implementing the same interface. Mock Objects take advantage of this by allowing you to substitute the most minimally mocked implementation of an object possible for the real one during testing. This allows you to concentrate on the code being tested without worrying about silly things, such as whether your database is still running or if there is a database available to test against.

Where Does DBD::Mock Fit In?

DBD::Mock is a mock DBI Driver that allows you to test code which uses DBI without needing to worry about the who, what, when, and where of a database. DBD::Mock also helps to reduce the amount of database bookkeeping code by doing away with the database entirely, instead keeping a detailed record of all the actions performed by your code through DBI. Of course, database interaction/communication is not only one way, so DBD::Mock also allows you to seed the driver with mock record sets. DBD::Mock makes it possible to fake most (non-vendor specific) database interaction for the purpose of writing tests. For more detailed documentation I suggest reading the DBD::Mock POD documentation itself.

Sample DBI Code

In the tradition of past Perl Code katas, here is some simplified code to write your tests against. This code should be simple enough to understand, but also complex enough to show the real usefulness of DBD::Mock.

package MyApp::Login;

use DBI;

my $MAX_LOGIN_FAILURES = 3;

sub login {
  my ($dbh, $u, $p) = @_;
  # look for the right username and password
  my ($user_id) = $dbh->selectrow_array(
      "SELECT user_id FROM users WHERE username = '$u' AND password = '$p'"
  );
  # if we find one, then ...
  if ($user_id) {
      # log the event and return success      
      $dbh->do(
          "INSERT INTO event_log (event) VALUES('User $user_id logged in')"
      );
      return 'LOGIN SUCCESSFUL';
  }
  # if we don't find one then ...
  else {
      # see if the username exists ...
      my ($user_id, $login_failures) = $dbh->selectrow_array(
          "SELECT user_id, login_failures FROM users WHERE username = '$u'"
      );
      # if we do have a username, and the password doesn't match, then
      if ($user_id) {
          # if we have not reached the max allowable login failures then 
          if ($login_failures < $MAX_LOGIN_FAILURES) {
              # update the login failures
              $dbh->do(qq{
                  UPDATE users 
                  SET login_failures = (login_failures + 1)
                  WHERE user_id = $user_id
              });
              return 'BAD PASSWORD';                  
          }
          # otherwise ...
          else {
              # we must update the login failures, and lock the account
              $dbh->do(
                  "UPDATE users SET login_failures = (login_failures + 1), " .
                  "locked = 1 WHERE user_id = $user_id"
              );                                                              
              return 'USER ACCOUNT LOCKED';
          }
      }
      else {
          return 'USERNAME NOT FOUND';
      }
  }
}

1;

There are four distinct paths through this code, each one resulting in one of the four return messages: LOGIN SUCCESSFUL, BAD PASSWORD, USER ACCOUNT LOCKED, and USERNAME NOT FOUND. See if you can write enough tests to cover all four paths. Feel free to use Devel::Cover to verify this.

Armed with your knowledge of DBD::Mock, go forth and write tests! The next page describes DBD::Mock in more detail and gives some strategies for writing the appropriate tests. You should spend between 30 and 45 minutes writing the tests before continuing.

Tips, Tricks, and Suggestions

Because DBD::Mock is an implementation of a DBD driver, its usage will be familiar to anyone who has used DBI. Where DBD::Mock is unique is in its ability to mock the database interaction. The following is a short introduction to those features of DBD::Mock.

Fortunately, connecting to the database is the only part of your regular DBI code that needs to be DBD::Mock specific, because DBI chooses the driver based upon the DSN string given to it. Connecting with DBD::Mock looks like this:

my $dbh = DBI->connect('dbi:Mock:', '', '');

Because DBI is not actually connecting to a real database here, you need no database name, username, or password. The next thing to do is to seed the database driver with a result set. Do this through the mock_add_resultset attribute of the $dbh handle.

$dbh->{mock_add_resultset} = [
  [ 'user_id', 'username', 'password' ],
  [ 1, 'stvn', '****' ]
];

DBD::Mock will return this particular result set the next time a statement executes on this $dbh. Note that the first row is the column names, while all subsequent rows are data. Of course, in some cases, this is not specific enough, and so DBD::Mock also allows the binding of a particular SQL statement to a particular result set:

$dbh->{mock_add_resultset} = {
  sql     => "SELECT * FROM user_table WHERE username = 'stvn'",
  results => [[ 'user_id', 'username', 'password' ],
              [ 1, 'stvn', '****' ]]
};

Now whenever the statement SELECT * FROM user_table WHERE username = 'stvn' executes, DBD::Mock will return this result set. You can also use mock_add_resultset to specify the number of rows affected by UPDATE, INSERT, and DELETE statements. For example, here DBI will see the DELETE statement as having deleted three rows of data:

$dbh->{mock_add_resultset} = {
  sql     => "DELETE FROM session_table WHERE active = 0",
  results => [[ 'rows' ], [], [], []]
};

DBD::Mock version 0.18 introduced the DBD::Mock::Session object, which allows the scripting of a session of database interaction -- and DBD::Mock can verify that the session executes properly. Here is an example of DBD::Mock::Session:

$dbh->{mock_session} = DBD::Mock::Session->new('session_reaping' => (
  {
  statement => "UPDATE session_table SET active = 0 WHERE timeout < NOW()",
  results  => [[ 'rows' ], [], [], []]
  },
  {
  statement => "DELETE FROM session_table WHERE active = 0",
  results  => [[ 'rows' ], [], [], []]
  }  
));

The hash reference given for each statement block in the session should look very similar to the values added with mock_add_resultset, with the only difference in the substitution of the word statement for the word sql. DBD::Mock will assure that the first statement run matches the first statement in the session, raising an error (in the manner specified by PrintError or RaiseError) if not. DBD::Mock will then continue through the session until it reaches the last statement, verifying that each statement run matches in the order specified. You can also use regular expression references and code references in the statement slots of DBD::Mock::Session for even more sophisticated comparisons. See the documentation for more details of how those features work.

After you seed a $dbh with result sets, the next step is to run the DBI code that will use those result sets. This is just normal, everyday DBI code, with nothing specific to DBD::Mock.

After all the DBI code runs, it is possible to then go through all the statements that have been executed and examine them using the array of DBD::Mock::StatementTrack objects found in the mock_all_history attribute of your $dbh. Here is a simple example of printing information about each statement run and the bind parameters used:

my $history = $dbh->{mock_all_history};
foreach my $s (@{$history}) {
  print "Statement  : " . $s->statement() . "\n" .
        "bind params: " . (join ', ', @{$s->bound_params()}) . "\n";
}

DBD::Mock::StatementTrack also offers many other bits of statement information. I refer you again to the DBD::Mock POD documentation for more details.

Now, onto the tests.

Solutions

The saying goes of Perl, "there is more than one way to do it", and this is true of DBD::Mock as well. The sample code has four distinct paths through it, and the solutions will use each one to demonstrate a different technique for writing tests with DBD::Mock.

The first example is the LOGIN SUCCESSFUL path. The code uses the array version of mock_add_resultset to seed the $dbh and then examines the mock_all_history to be sure all the statements ran in the correct order.

use Test::More tests => 4;

use MyApp::Login;

my $dbh = DBI->connect('dbi:Mock:', '', '');
 
$dbh->{mock_add_resultset} = [[ 'user_id' ], [ 1 ]];
$dbh->{mock_add_resultset} = [[ 'rows' ], []];

is(MyApp::Login::login($dbh, 'user', '****'), 
   'LOGIN SUCCESSFUL', 
   '... logged in successfully');
 
my $history = $dbh->{mock_all_history};

cmp_ok(@{$history}, '==', 2, '... we ran 2 statements');

is($history->[0]->statement(), 
   "SELECT user_id FROM users WHERE username = 'user' AND password = '****'",
   '... the first statement is correct');

is($history->[1]->statement(), 
   "INSERT INTO event_log (event) VALUES('User 1 logged in')",
   '... the second statement is correct');

This is the simplest and most direct use of DBD::Mock. Simply seed the $dbh with an appropriate number of result sets, run the code, and then test to verify it called the right SQL in the right order. It doesn't come much simpler than that. This approach does have its drawbacks though, the most obvious being that there is no means of associating the SQL directly with the result sets (as would happen in a real database). However, DBD::Mock returns result sets in the order added, so there is an implied sequence of events, verifiable later with mock_all_history.

The next example is the USERNAME NOT FOUND path. The test code uses the hash version of mock_add_resultset to seed the $dbh and the mock_all_history_iterator to check the statements afterwards.

use Test::More tests => 4;

use MyApp::Login;

my $dbh = DBI->connect('dbi:Mock:', '', '');

$dbh->{mock_add_resultset} = {
  sql     => "SELECT user_id FROM users WHERE username = 'user' AND password = '****'",
  results => [[ 'user_id' ], [ undef ]]
};
$dbh->{mock_add_resultset} = {
  sql     => "SELECT user_id, login_failures FROM users WHERE username = 'user'",
  results => [[ 'user_id', 'login_failures' ], [ undef, undef ]]
};

is(MyApp::Login::login($dbh, 'user', '****'), 
  'USERNAME NOT FOUND', 
  '... username is not found');

my $history_iterator = $dbh->{mock_all_history_iterator};

is($history_iterator->next()->statement(), 
   "SELECT user_id FROM users WHERE username = 'user' AND password = '****'",
   '... the first statement is correct');

is($history_iterator->next()->statement(), 
   "SELECT user_id, login_failures FROM users WHERE username = 'user'",
   '... the second statement is correct');

ok(!defined($history_iterator->next()), '... we have no more statements');

This approach allows the association of a specific SQL statement with a specific result set. However, it loses the implied ordering of statements, which is one of the benefits of the array version of mock_add_resultset. You can check the order manually using mock_all_history_iterator (which simply iterates over the array returned by mock_all_history). One of the nice things about using mock_all_history_iterator is that if the need arises to add, delete, or reorder your SQL statements, you don't need to change all the $history array indices in your test. It is also a good idea to check that only the two expected statements ran; do this by exploiting the fact that the iterator returns undef once it has exhausted its contents.

The next example is the USER ACCOUNT LOCKED path. The test code uses the DBD::Mock::Session object to test this path. I recommend setting RaiseError on the $dbh so that DBD::Mock::Session will throw an exception if it runs into an issue.

use Test::More tests => 2;
use Test::Exception;

use MyApp::Login;

my $dbh = DBI->connect('dbi:Mock:', '', '', { RaiseError => 1, PrintError => 0 });

my $lock_user_account = DBD::Mock::Session->new('lock_user_account' => (
  {
      statement => "SELECT user_id FROM users WHERE username = 'user' AND password = '****'",
      results   => [[ 'user_id' ], [ undef ]]
  },
  {
      statement => "SELECT user_id, login_failures FROM users WHERE username = 'user'",
      results   => [[ 'user_id', 'login_failures' ], [ 1, 4 ]]
  },
  {
      statement => "UPDATE users SET login_failures = (login_failures + 1), locked = 1 WHERE user_id = 1",
      results   => [[ 'rows' ], []]
  }
));

$dbh->{mock_session} = $lock_user_account;
my $result;
lives_ok {
    $result = MyApp::Login::login($dbh, 'user', '****')
} '... our session ran smoothly';
is($result, 
   'USER ACCOUNT LOCKED', 
   '... username is found, but the password is wrong, so we lock the user account');

The DBD::Mock::Session approach has several benefits. First, the SQL statements are associated with specific result sets (as with the hash version of mock_add_resultset). Second, there is an explicit ordering of statements (like the array version of mock_add_resultset). DBD::Mock::Session will verify that the session has been followed properly, and raise an error if it is not. The one drawback of this example is the use of static strings to compare the SQL with. However, DBD::Mock::Session can use other things, as illustrated in the next and final example.

The next and final example is the BAD PASSWORD path. The test code demonstrates some of the more complex possibilities of the DBD::Mock::Session object:

use Test::More tests => 2;
use Test::Exception;

use SQL::Parser;
use Data::Dumper;

use MyApp::Login;

my $dbh = DBI->connect('dbi:Mock:', '', '', { RaiseError => 1, PrintError => 0 });

my $bad_password = DBD::Mock::Session->new('bad_password' => (
{
  statement => qr/SELECT user_id FROM users WHERE username = '.*?' AND password = '.*?'/,
  results   => [[ 'user_id' ], [ undef ]]
},
{
  statement => qr/SELECT user_id, login_failures FROM users WHERE username = '.*?'/,
  results   => [[ 'user_id', 'login_failures' ], [ 1, 0 ]]
},
{
  statement => sub { 
      my $parser1 = SQL::Parser->new('ANSI');
      $parser1->parse(shift(@_)); 
      my $parsed_statement1 = $parser1->structure(); 
      delete $parsed_statement1->{original_string};
      
      my $parser2 = SQL::Parser->new('ANSI');
      $parser2->parse("UPDATE users SET login_failures = 
	       (login_failures + 1) WHERE user_id = 1");
      my $parsed_statement2 = $parser2->structure(); 
      delete $parsed_statement2->{original_string};      
      
      return Dumper($parsed_statement2) eq Dumper($parsed_statement1);
  },
  results   => [[ 'rows' ], []]
}
));

$dbh->{mock_session} = $bad_password;

my $result;
lives_ok {
    $result = MyApp::Login::login($dbh, 'user', '****')
} '... our session ran smoothly';
is($result, 'BAD PASSWORD', '... username is found, but the password is wrong');

This approach uses DBD::Mock::Session's more flexible means of performing SQL comparisons. The first and second statements are compared using regular expressions, which alleviates the need to hardcode test data into the statement. The third statement uses a subroutine reference to perform the SQL comparison. As you may have noticed in the sample code, the UPDATE statement for the BAD PASSWORD path uses Perl's qq{} quoting mechanism to format the SQL in a more freeform manner. This can create complexities when trying to verify the SQL using strings or regular expressions. The test here uses SQL::Parser to determine the functional equivalence of the expected statement and the statement actually run in the code.

Conclusion

I hope this kata has illustrated that unit-testing DBI code does not have to be as difficult and dangerous as it might seem. Through the use of Mock Objects in general and specifically the DBD::Mock DBI driver, it is possible to achieve 100% code coverage of your DBI-related code without ever having touched a real database. Here is the Devel::Cover output for the tests above:

 ---------------------------- ------ ------ ------ ------ ------ ------ ------
 File                           stmt branch   cond    sub    pod   time  total
 ---------------------------- ------ ------ ------ ------ ------ ------ ------
 lib/MyApp/Login.pm            100.0  100.0    n/a  100.0    n/a  100.0  100.0
 Total                         100.0  100.0    n/a  100.0    n/a  100.0  100.0
 ---------------------------- ------ ------ ------ ------ ------ ------ ------


This Fortnight in Perl 6, Jan. 19-31, 2005

All~

Welcome to another double feature summary. Sadly, this one was delayed because of an argument that I was/am having with my connection. Fortunately, a generous neighbor has allowed me to use his connection for the time being. So, with that random act of kindness in mind, I bring you this summary.

Perl 6 Language

Perl 6 on E

Rich Morin wondered if Perl 6 would support the features of the language E. Larry told him it would support many of them, and "as a limiting case, you can always back out the entire Perl grammar and install the E grammar in its place." He left this as an exercise for the reader.

Challenge the Palmer ...

Luke Palmer produced answers to Austin Hastings's "How do I" questions. Anyone else with such questions should send them to the list lest Luke's learnedness lose luster.

Refactoring Perl automatically

Matisse Enzer re-re-raised the thread on refactoring Perl. This time he posted a link to EPIC, an Eclipse plugin that uses Devel::Refactor. Unfortunately, I think he is still using the Google Groups interface to post to the list. I repeat, emails posted via Google Groups do not make it to the list itself.

Loops, Bare Blocks, and My Head Hurts

Juerd wondered if last/redo would work outside loops. Larry provided the long answer. The short answer is no, things act basically rationally so that return, next, and last all behave as expected.

Where Without Type

Juerd wondered if he could use a where clause with a type. The answer is yes, as it will be useful for restricting the values admitted to a multi-method. This led me to the evil thought of putting side effects in a where clause on several multi-methods and watching the pain of resolution cause bugs that are very difficult to find. Sometimes I think that my mind looks for nasty ways to write bad code a little too much.

Auto-threading

Luke Palmer posted his musing about auto-threading. I must say it looks powerful enough to blow off your entire lower body if you shoot yourself in the foot.

Perl 6 Compiler

Pugs

Not just ugly dogs anymore, they are also P6 interpreters written in Haskell. It sounds really cool to me.

Parrot

MMD and Meta-Stuff

In a thread also posted to p6l (sorry about that), I attempted to explain how Parrot's object system is already very close to the Common Lisp Object System and why it should become even closer. I am not sure if I succeeded. My ability to express really abstract thoughts without a whiteboard is poor.

RT Clean Up

Will Coleda has put an extensive amount of work into cleaning up Parrot for public presentation with a focus on RT and organization.

argv[0]

Wukk (which is what Will becomes when my fingers slip off a key) wants the name of the invoked executable. Dan upped the ante by offering the full and base name variants of the interpreter, the program, and the invoked thing.

OSCON

Robert Spier put out a call for OSCON proposals.

Read and Readline

Matt Diephouse has been cleaning up read and readline.

Test_main.c

Bloves posted a patch updating test_main.c. Unfortunately, it turns out that this file is obsolete and needs removal.

MinGW Support

François Perrad provided a patch to improve MinGW support. Leo applied it.

Compile Problems

Will Coleda has a failing fresh build. Warnock applies.

NCI Improvements

Bernhard Schmalhofer provided a patch improving NCI stuff. Leo applied it.

VS.NET 2003

Sriram Krishnan fought through a build on VS.NET 2003. He overcame problems and even posted a summary. Unfortunately, he posted all of this via the Google Groups interface, so none of it made it back to the list.

Dynclasses Missing Dependencies

Leo put out a request for fixes to the problem with dynclasses missing dependencies.

Generational GC

Leo has put in the first part of his Generational GC system. It has bugs, but it is also cool.

Preemptive Multi-tasking via Continuations

Hugh Arnold wants to use timers and continuations to implement preemptive multitasking in a single threaded application. He wants to know if Parrot will support it. I think that the answer is yes.

Bound Methods, Attribute Lookup, and Python

Dan, Leo, and Sam have started to go around about how to implement method lookup and currying. Best of luck to all.

lib/Make.pm

Matt Diephouse noticed that lib/Make.pm is out of date and unneeded. He asked for its removal.

Interacting with the Mailing Lists

Someone suggested that I add a description of the preferred modality for interacting with the mailing lists summarized herein. I think this is a good idea and will add it to the standard footer.

The Usual Footer

To post to any of these mailing lists please subscribe by sending email to perl6-internals-subscribe@perl.org, perl6-language-subscribe@perl.org, or perl6-compiler-subscribe@perl.org. If you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send feedback to ubermatt@gmail.com.

Throwing Shapes


Sometimes data processing works better when split into separate processes, which may run on the same machine or on different ones. This is the well-known client-server technique. You can implement it with a known protocol (such as HTTP) or by developing your own, application-specific protocol. Rolling your own means writing constructor and parser procedures for each packet type (request and response). If different packets share the same structure, the constructor and parser can stay the same. Perhaps the simplest solution is to pack key/value pairs into a text block, separated by newline characters or some other delimiter. A binary format with length encoding is another solution.
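As a tiny illustration of that key/value idea (purely illustrative; PerlRC below takes a different route), a packet builder and its matching parser might look like:

  # build a newline-separated key/value packet, terminated by a blank line
  my %request = ( name => 'power', n => 2, p => 8 );
  my $packet  = join( "\n", map { "$_=$request{$_}" } sort keys %request ) . "\n\n";

  # ...and turn it back into a hash on the receiving side
  my %parsed = map { split /=/, $_, 2 } grep { length } split /\n/, $packet;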

In an attempt to simplify this client-server interaction, the Remote Procedure Call (RPC) technique appeared. It tries to map functions inside the client code to their counterparts inside the server. RPC hides all the details between a client function call and the server function's response. This includes argument serialization (to make data appropriate to transfer over the net, also known as marshaling), transport, the server function call, and returning response data back to the client (also serialized). In some implementations, RPC also tries to remove requirements for the client and the server to run on the same operating system or hardware, or to be written in the same programming language.

In the Perl world there are several modules that offer different kinds of RPC, including RPC::Simple, RPC::XML, DCE::RPC, and more.

In this article I'll explain how to use Perl-specific features to develop a compact RPC implementation that I will name Perl-centric Remote Call (PerlRC). As the name suggests, it will run only with Perl clients and servers.

Shape

PerlRC needs to simulate a function call environment that seems familiar to the client. This requires handling the four key properties of a function call:

  • Function name
  • Function arguments
  • Calling context
  • Return data

The design of the Perl language allows generic argument handling, which means that it is possible to handle arguments without knowing their number or types before the function call. There are also ways to discover the calling context. Finally, the caller can handle results in the same way as the called function handles its arguments -- generically, without knowing their details until the function call returns.
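Here is a minimal sketch of those two features, namely generic argument handling through @_ and context discovery through wantarray (this is just an illustration, not part of PerlRC itself):

  sub peek
  {
    my @args = @_;                        # whatever the caller passed in, no signature needed
    return wantarray ? ( 'list', @args )  # caller asked for a list
                     : 'scalar';          # caller asked for a single value
  }

  my @got = peek( 2, 8 );   # ( 'list', 2, 8 )
  my $got = peek( 2, 8 );   # 'scalar'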

With this in mind, the PerlRC code must follow these steps:

  • Creating Transport Containers

    Essentially these are the request and response packets. I'll use hashes for both. Each one will be serialized to a scalar which the code will send to the other side with a trailing newline terminator.

    A request container resembles:

    # request hash
      $req1 = {
                'ARGS' => [          # arguments list
                            2,
                            8
                          ],
                'NAME' => 'power',   # remote function name
                'WANTARRAY' => 0     # calling context
              };
    
      # result hash for scalar context
      $res1 = {
                'RET_SCALAR' => 256  # result scalar
              };
    
      # result hash for array context
      $res2 = {
                'RET_ARRAY' => [     # result array
                                 12,
                                 13,
                                 14,
                                 15,
                                 16,
                                 17,
                                 18,
                               ]
              };
    
      # result hash for error
      $res3 = {
                # error description
                'ERROR' => 'No such function: test'
              };
  • Arguments

    To keep things simple, the first argument of the client-side call will be the name of the remote function to call. The client wrapper must remove this argument from the list before packing the rest for the remote function. The request container holds the remote function name and a separate reference to the argument list.

  • Calling Context Discovery

    Find the calling context with the built-in wantarray function and put this value (0 for scalar and 1 for array context) in the request hash.

  • Transfer Both to the Server

    Serialize the request to a scalar and escape any newline characters inside it. Append the newline terminator and send it to the server.

  • Unpack Request Data

    The server takes the request scalar, removes the trailing newline terminator, and unpacks the request data into a local hash that contains the function name, the calling context, and the argument list.

  • Server-side Function Call

    Find and call the required function in the appropriate context. Take the result data or the error. Create a result container with separate fields for scalar and array results and one field for any error.

  • Pack Result Data

    Serialize the result hash, escape newlines, append a terminating newline, and send the result data to the client.

  • Client Unpack of the Result Data

    When the client receives the result container, remove the trailing newline char. Unescape any newline chars and unpack the data into a local result hash. Depending on the calling context, return to the caller either the scalar or array field from the result hash or die with an error description if such exists.

The implementation uses two modules:

  • Storable handles the serialization of arbitrary data. Serializing data converts it into a string of characters suitable for saving or sending across the network, which can later be unserialized back into its original form. The rest of the article will also refer to this process as packing and unpacking the data.
  • IO::Socket::INET handles the creation of Internet domain sockets.

Both modules are standard in the latest Perl distribution packages.

It is possible to use any serialization module including FreezeThaw, XML::Dumper, or even Data::Dumper + eval() instead of Storable.
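As a quick sanity check of what Storable does here, this sketch packs a request hash into a single scalar with nfreeze() and recovers it with thaw():

  use Storable qw( nfreeze thaw );

  my %request = ( NAME => 'power', ARGS => [ 2, 8 ], WANTARRAY => 0 );
  my $packed  = nfreeze( \%request );         # one (binary) scalar, ready to escape and send
  my %copy    = %{ thaw( $packed ) };         # back to a plain hash on the receiving side
  print "$copy{NAME}( @{ $copy{ARGS} } )\n";  # prints: power( 2 8 )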

Point of No Return

Enough background. Here's the PerlRC implementation of the server:

  use Storable qw( thaw nfreeze );
  use IO::Socket::INET;
  
  # function table, maps caller names to actual server subs
  our %FUNC_MAP = (
                  power => \&power,
                  range => \&range,
                  tree  => \&tree,
                  );                                
  
  # create listen socket
  my $sr = IO::Socket::INET->new( Listen    => 5,
                                  LocalAddr => 'localhost:9999',
                                  ReuseAddr => 1 );
  
  while(4)
    {
    # awaiting connection
    my $cl = $sr->accept() or next; # accept new connection or loop on error
  
    while( my $req = <$cl> ) # read request data, exit loop on empty request
      {
      chomp( $req );
      my $thaw = thaw( r_unescape( $req ) ); # 'unpack' request data (\n unescape)
      my %req = %{ $thaw || {} };            # copy to local hash
      
      my %res;                                # result data
      my $func = $FUNC_MAP{ $req{ 'NAME' } }; # find required function
      if( ! $func ) # check if function exists
        {
        # function name is not found, return error
        $res{ 'ERROR' } = "No such function: " . $req{ 'NAME' };
        }
      else
        {
        # function exists, proceed with execution
        my @args = @{ $req{ 'ARGS' } }; # copy to local arguments hash
        if( $req{ 'WANTARRAY' } )       # depending on the required context...
          {
          my @ret = &$func( @args );    # call function in array context
          $res{ 'RET_ARRAY' } = \@ret;  # return array
          }
        else
          {
          my $ret = &$func( @args );    # call function in scalar context
          $res{ 'RET_SCALAR' } = $ret;  # return scalar
          }  
        }
      
      my $res = r_escape( nfreeze( \%res ) ); # 'pack' result data (\n escape)
      print $cl "$res\n";                     # send result data to the client
      }
    }

The client side is also simple:

  use Storable qw( thaw nfreeze );
  use IO::Socket::INET;
  
  # connect to the server
  my $cl = IO::Socket::INET->new(  PeerAddr => "localhost:9999" ) 
       or die "connect error\n";
  
  # this is interface sub to calling server
  sub r_call
  {
    my %req; # request data
    
    $req{ 'NAME' }      = shift;             # function name to call
    $req{ 'WANTARRAY' } = wantarray ? 1 : 0; # context hint
    $req{ 'ARGS' }      = \@_;               # arguments
    
    my $req = r_escape( nfreeze( \%req ) );  # 'pack' request data (\n escape)
    print $cl "$req\n";                      # send to the server
    my $res = <$cl>;                         # get result line
    chomp( $res );
      
    my $thaw = thaw( r_unescape( $res ) );   # 'unpack' result (\n unescape)
    my %res = %{ $thaw || {} };              # copy result data to local hash
    
    # server error -- break execution!
    die "r_call: server error: $res{'ERROR'}\n" if $res{ 'ERROR' };
    
    # finally return result in the required context
    return wantarray ? @{ $res{ 'RET_ARRAY' } } : $res{ 'RET_SCALAR' };
  }

On both sides there are two very simple functions that escape and unescape newline chars. This is necessary to prevent serialized data that contains newline chars from breaking the chosen packet terminator. (A newline works well there because it interacts well with the readline() operation on the socket.)

  sub r_escape
  {
    my $s = shift;
    # replace all newlines, CR and % with CGI-style encoded sequences
    $s =~ s/([%\r\n])/sprintf("%%%02X", ord($1))/ge;
    return $s;
  }
  
  sub r_unescape
  {
    my $s = shift;
    # convert back escapes to the original chars
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
  }

Waiting In The Wings

That's the client and the server. Now they need to do something useful. Here are some example functions that will live on the server, ready for clients to call:

  =head2 power()
   
   arguments: a number (n) and power (p)
     returns: the number powered (n**p)
     
  =cut
  
  sub power
  {
    my $n = shift;
    my $p = shift;
    return $n**$p;
  }
  
  =head2 range( f, t )
   
   arguments: lower (f) and upper indexes (t)
     returns: array with number elements between the lower and upper indexes
              ( f .. t )
  =cut         
  
  sub range
  {
    my $f = shift;
    my $t = shift;
    return $f .. $t;
  }
  
  =head2 tree()
   
   arguments: none
     returns: in scalar context: hash reference to data tree
              in array  context: hash (array) of data tree
       usage:
              $data = tree(); $data->{ ... }
              %data = tree(); $data{ ... }
  =cut
  
  sub tree
  {
    my $ret = {
              this => 'is test',
              nothing => [ qw( ever goes as planned ) ],
              number_is => 42,
              };
    return wantarray ? %$ret : $ret;
  }

To make these available to clients, the server must have a map of functions. It's easy:

  # function table, maps caller names to actual server subs
  our %FUNC_MAP = (
                  power => \&power,
                  range => \&range,
                  tree  => \&tree,
                  );

That's all of the setup for the server. Now you can start it.

The client side calls functions in this way:

  r_call( 'test',  1, 2, 3, 'opa' );  # this will receive 'not found' error
  my $r = r_call( 'power',  2,  8 );  # $r = 256
  my @a = r_call( 'range', 12, 18 );  # @a = ( 12, 13, 14, 15, 16, 17, 18 )
  my %t = r_call( 'tree' );           # returns data as hash
  my $t = r_call( 'tree' );           # returns data as reference
  
  print( "Tree is:\n" . Dumper( \%t ) );
  # this will print:

  Tree is:
  $VAR1 = {
            'number_is' => 42,
            'nothing' => [
                           'ever',
                           'goes',
                           'as',
                           'planned'
                         ],
            'this' => 'is test'
          };
  
  # and will be the same as 
  print( "Tree is:\n" . Dumper( $t ) );

One Wish

At this point everything works, but as usual, someone will want another feature. Suppose that the server and the client sides each had one wish.

The server-side wish might be a built-in facility to find callable functions, so that the function map can be built automatically.

Automatic map discovery has one major flaw: it makes every function in the current package available to the client. This may not always be desirable. There are simple solutions to the problem. For example, all functions that need external visibility within a package could carry a specific name prefix. The map discovery procedure can then keep only the functions with this prefix and map them externally under their original names (without the prefix).

The following code finds all defined functions in the current namespace (the one that called r_map_discover()) and returns a hash with function-name keys and function-code-reference values:

  sub r_map_discover
  {
    my ( $package ) = caller(); # get the package name of the caller
    my $prefix = shift || '';   # optional prefix (an empty prefix matches everything)
    my %map;

    # disable check for symbolic references
    no strict 'refs';

    # loop over all entries in the caller package's namespace
    while( my ( $k, $v ) = each %{ $package . '::' } ) 
      {
      my $sym = $package . '::' . $k; # construct the full name of each symbol
      next unless $k =~ s/^$prefix//; # keep only entries starting with the prefix (and strip it)
      my $r = *{ $sym }{ 'CODE' };    # take reference to the CODE in the glob
      next unless $r;  # reference is empty, no code under this name, skip
      $map{ $k } = $r; # reference points to CODE, assign it to the map
      }
    return %map;
  }

To use automatic discovery instead of a static function map, write:

  # function table, maps caller names to actual server subs, initially empty
  our %FUNC_MAP;

  # run the automatic discovery function
  %FUNC_MAP = r_map_discover();

Now %FUNC_MAP holds all of the externally visible functions in the current package (namespace). To use the prefix-based filtering instead, the function names on the server need to change to match. Suppose the prefix is x_:

  sub x_power
  {
    ...
  }
  
  sub x_range
  {
    ...
  }

The server will discover only those functions:

  %FUNC_MAP = r_map_discover( 'x_' );

and the client will continue to call functions under their usual names:

  my $r = r_call( 'power',  2,  8 );  # $r = 256
  my @a = r_call( 'range', 12, 18 );  # @a = ( 12, 13, 14, 15, 16, 17, 18 )

That's it for the server's wish. Now it's time to grant the client's wish.

Calling remote functions transparently, without going through r_call(), might be the client's most important wish.

Perl allows the creation of anonymous function references. It's also possible to install that reference in a namespace under a real name. The result is a function created at run-time. If the function definition takes place in a specific lexical context, it will still have access to that context even when called later from outside that context. Those functions are closures and they are one way to avoid using r_call():

  sub r_define_subs
  {
    my ( $package ) = caller(); # get the package name of the caller
    for my $fn ( @_ )           # loop over the specified function names
      {
      my $sym = $package . '::' . $fn;    # construct the full symbol name
      no strict;                          # turn off symbolic refs check
      *$sym = sub { r_call( $fn, @_ ); }; # construct and tie the closure
      use strict;                         # turn the check back on
      }
  }
  
  # define/import 'range' and 'tree' functions in the current package
  r_define_subs( 'range', 'tree' );
  
  # now call them as they are normal functions
  my @a = range( 12, 18 );      # @a = ( 12 .. 18 )
  my %t = tree();               # returns data as a hash

This approach confines the use of r_call() to a single place that client code never sees. Wish granted.

Limits

The biggest limitations of PerlRC relate to serialization.

First of all, both the client and the server must have compatible serialization modules and versions. This is crucial! To avoid problems here, either write your own serialization code or perform some kind of version check. If you perform this check, be sure to do it in plain text, before sending any serialized requests or responses.
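One possible shape for such a check (a hypothetical sketch, not part of the PerlRC code above) is to exchange the Storable version in plain text right after connecting, before any frozen data crosses the wire:

  # client side, right after the connect (using the same $cl socket as the client code)
  print $cl "STORABLE " . Storable->VERSION . "\n";
  chomp( my $reply = <$cl> );
  die "server uses an incompatible Storable version\n" unless $reply eq 'OK';

  # server side, right after accept(), before entering the request loop
  chomp( my $hello = <$cl> );
  if( $hello eq "STORABLE " . Storable->VERSION )
    {
    print $cl "OK\n";
    }
  else
    {
    print $cl "MISMATCH\n";
    close( $cl );
    }

Requiring an exact version match is stricter than Storable usually needs, but it errs on the safe side; a real check could compare only the parts of the version that matter to you.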

Another problem is in what data you can serialize in the argument or result containers. Holding references there to something outside the container may pull in more data than you want, if your serialization follows references, or it may not pull in enough data if your serialization process is very simple. Also, there is no general way to serialize file handles, compiled code, or objects, since those are not really plain data inside the container. In some cases, serializing code and objects may be possible if the serialization module supports such features (as Storable and FreezeThaw do), if you have the required class modules on both sides, and if you trust code from either side.
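As an aside, here is a hedged sketch of Storable's opt-in support for code references; it only makes sense when you trust the other side, because thawing will eval the deparsed source:

  use Storable qw( nfreeze thaw );

  local $Storable::Deparse = 1;   # allow freezing CODE refs via B::Deparse
  local $Storable::Eval    = 1;   # allow thawing them back via eval

  my $frozen = nfreeze( { callback => sub { $_[0] * 2 } } );
  my $copy   = thaw( $frozen );
  print $copy->{callback}->( 21 ), "\n";   # prints 42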

The documentation of the serialization modules explains further limitations and possible workarounds.

Conclusion

There is a bit more work to do on PerlRC before using it in production, but if you need simple RPC or you need to tweak the way RPC deals with data or communication, you may have a better experience writing your own implementation instead of fitting your application around ready-made modules. I hope this text is a good starting point.
