March 2001 Archives

A Simple Gnome Panel Applet


Table of Contents

Program overview
Initialization
The callbacks
Conclusion

Gnome is the desktop environment of choice on my home Linux system because it is feature-packed and user friendly. Gnome is also flexible, and thanks to the Gtk-Perl module and associated desktop toolkit bindings, I can use my favorite programming language to further customize and extend my Gnome environment.

This article shows how a useful Gnome tool can be built in an afternoon. It is also an example of some common techniques to employ when doing this sort of GUI programming, including widget creation, signal handling, timers, and event loops. It also covers some Perl basics, in review. So, read on and you may be inspired to write some Gnome applications of your own.

Download the source code referred to in this article.

Gnome

On a Gnome desktop, the panel contains a variety of buttons and other widgets which launch applications, display menus, and perform other common functions. It's standard desktop fare, like the Microsoft Windows Start menu.

An applet is a particular kind of Gnome application, which resides within and operates on the panel itself. The Gnome distribution comes with several of these including a variety of clocks, the Game of Life, and system resource utilization monitors. The Gtk-Perl module enables a Perl programmer to create custom Gnome panel applets, and many other GUI-related applications.

The Gnome panel applet we'll build finds the local host's default TCP/IP gateway and displays the gateway's status on a button in the applet. When the button is in an "off" position the gateway is not polled (Figure 1).

Fig. 1: Gnome Gateway in the "off" position.

When the button is on, the gateway is polled at scheduled intervals and the button's label is updated with the result, either response or non-response (Figures 2a and 2b).

Fig. 2a: Gnome Gateway in the "on" position.

Fig. 2b: Gnome Gateway not available.

This diagnostic may be used to regularly and unobtrusively report the status of the local network relative to the machine's default gateway. Check the button's label to see how things are faring on the upstream network.

The applet uses the system commands netstat and ping familiar to Unix users, and system and network admins.

Program overview

The top-level code in the script is lines 1 through 21. Lines 23 through 69 define subroutines. One subroutine is called by our code (fetch_gateway), but two others are "callbacks" (check_gateway and reset_state). A callback is a subroutine that will be called by the Gnome code when something happens -- for example, a timer expires or a button gets clicked. Now, let's see what we can learn about how the application works.

Initialization

Line 3 indicates that we'll be using the Gnome module. Gnome.pm is distributed with the Gtk-Perl package. The program discussed here was developed against Gtk-Perl version 0.7003. Gnome.pm requires separate installation: download and unpack Gtk-Perl, install it, and then change directories into the GdkImlib and Gnome distributions and install them too. If you want to develop panel applets (as we're doing here) you'll need to append the build option --with-panel to the end of the usual perl Makefile.PL portion of the install process:

  perl Makefile.PL --with-panel

Although Gnome.pm hasn't made it to Version 1.0 yet I've found it to be stable. The biggest problem is the lack of documentation.

Lines 5 and 6 initialize a new AppletWidget object in $a. This object is the container for all the doodads that will be part of our applet. Line 8 creates a label for use when our button is in the "off" position.

Line 10 creates a timer. The prototype for the Gtk->timeout_add function is:

  Gtk->timeout_add( $interval, \&function, @function_data );

Here our interval is 20000 (this value is in milliseconds, so the timer will go off every 20 seconds), and the function to be called when the timer goes off is check_gateway. We could use the third parameter to pass some data into the check_gateway function if it were appropriate to do so. In this case it isn't necessary.

Line 11 creates a new ToggleButton object in $button, and labels it "off".

Line 12 registers the other callback in this application. This one will be called when a particular "signal" occurs within Gnome. These aren't normal Unix signals like SIGINT and SIGCHLD, but GUI events. In this case, the event is "clicked". The ToggleButton widget also has the signals "pressed", "released", "enter", and "leave", each of which is emitted in response to either actions of the mouse pointer or a function call, such as $button->pressed(). In our application, Gnome will call reset_state() whenever ToggleButton $button is clicked.

Line 14 sets the size of the button to be a square with sides of 50 pixels. This is a good fit for the default Gnome panel. Gnome uses global theme and style information to determine how the button is to be drawn: line, color, shadow, etc. Line 15 calls the button's show method, indicating that we're finished setting its attributes and that it is ready for display. Line 16 adds the button to the applet. Technically, we've packed the ToggleButton widget into the AppletWidget container by invoking the AppletWidget's add method on the ToggleButton. The ToggleButton is now a "child" of its container. At line 17 the applet is allowed to be visible by calling its show method as well. A widget's children will not be displayed until the parent's show method is invoked.

Line 19 calls the fetch_gateway routine to gather in the local host's default TCP/IP gateway. In that subroutine, `netstat -rn` captures the local routing table, returning all addresses in dotted quad notation. The test in line 30 gets triggered by the IP address 0.0.0.0, which indicates the default gateway. When it matches, the gateway's name is looked up from the IP address, and stored in the variable $hostname. Finally, we translate this value to lowercase, which will look better when we finally display it on the button. Then fetch_gateway returns.
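The parsing step can be sketched without running netstat at all. This is not the article's fetch_gateway, only a minimal illustration: the sub name default_gateway_from and the sample routing table (in the Linux layout) are assumptions, and the hostname lookup is left out so the sketch runs offline.

```perl
use strict;
use warnings;

# Return the gateway address of the default route (destination 0.0.0.0)
# found in `netstat -rn`-style text, or undef if there is none.
sub default_gateway_from {
    my ($routing_table) = @_;
    for my $line (split /\n/, $routing_table) {
        my ($destination, $gateway) = split ' ', $line;
        next unless defined $gateway;
        return $gateway if $destination eq '0.0.0.0';
    }
    return;
}

# Sample output in the Linux layout (an assumption; other systems differ).
my $sample = <<'END';
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
0.0.0.0         192.168.1.1     0.0.0.0         UG        0 0          0 eth0
END

my $gateway = default_gateway_from($sample);
print "default gateway: $gateway\n";
```

Matching on the destination column rather than on any occurrence of 0.0.0.0 matters here, because directly connected networks also show 0.0.0.0 in the gateway column.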

At line 21, we're ready to hand off to the gtk_main event loop, which is responsible for drawing the application on the screen and managing user interaction. At this point, our only interface with the application will be through signal handling and callback functions. GTK (the GIMP Toolkit, on which Gnome is built) is event driven: once we enter gtk_main the application stays put until an event occurs (caught via a signal) and the associated callback function is invoked. Therefore, we should have completed all of our application setup beforehand.

The callbacks

Now let's examine the two callback functions: reset_state, associated with catching a "clicked" signal on our button (line 12); and check_gateway, associated with the timer (line 10).

At line 38, we query the state of the toggle button by invoking its get_active method. This returns 0 if the button is off and 1 if it is on. By default, the ToggleButton widget has one child -- its own label. We label the button with the contents of $off_label if it is in the OFF position or the string "Wait" if it is in the ON position (figure 3) because we know that check_gateway is going to be called soon (within the next 20 seconds). Part of check_gateway's job is to update the button with the status of our TCP/IP gateway, the whole point of this applet.

Fig. 3: Gnome Gateway waiting.

Line 53 stores either the value in $hostname or the string "gateway", depending upon whether or not $hostname is longer than eight characters (lines 47-52). This is the number of characters that will fit comfortably on one line within the button's label.

If the button is in the ON position (line 55), we go ahead and attempt to ping the gateway with a single ICMP packet, redirecting STDOUT and STDERR to the round-file (/dev/null). We don't care about the output of the ping command, only its return value, so we execute the command via system(). Ping will return 0 upon success (gateway alive) and something else if it fails (probably because the gateway is dead), so we check the result and update the button's label appropriately in lines 59-66.
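The return-value check can be tried in isolation. In this minimal sketch the running perl ($^X) stands in for ping so that nothing touches the network; the sub name command_succeeds is an assumption, and the list form of system() used here bypasses the shell, so it does not reproduce the output redirection described above.

```perl
use strict;
use warnings;

# Run a command and report success the way the applet treats ping:
# system() returns 0 when the command exits successfully.
sub command_succeeds {
    my (@command) = @_;
    return system(@command) == 0;
}

# $^X is the running perl; it stands in for ping so the sketch runs offline.
my $alive = command_succeeds($^X, '-e', 'exit 0');
my $dead  = command_succeeds($^X, '-e', 'exit 1');

print 'first command: ',  ($alive ? 'ok' : 'failed'), "\n";
print 'second command: ', ($dead  ? 'ok' : 'failed'), "\n";
```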

We want check_gateway to continue to be called, every 20 seconds, until the application terminates. So, at line 68 the function returns a true value. A true return value will allow further execution of a timer's callback function while a false return value forces the opposite behavior.
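The return-value contract is easy to see in a toy model. The sketch below is not Gtk code: run_toy_timer is a hypothetical stand-in that calls a callback once per "tick" and stops as soon as the callback returns false, which is the behaviour described above for timer callbacks.

```perl
use strict;
use warnings;

# Toy stand-in for a toolkit timer: call $callback once per "tick" and
# stop as soon as it returns false. No real waiting happens here.
sub run_toy_timer {
    my ($max_ticks, $callback) = @_;
    my $calls = 0;
    for (1 .. $max_ticks) {
        $calls++;
        last unless $callback->();
    }
    return $calls;
}

# A callback that always returns true keeps being called, like check_gateway.
my $keeps_running = run_toy_timer(5, sub { return 1 });

# A callback that eventually returns false cancels its own timer.
my $remaining   = 2;
my $stops_early = run_toy_timer(5, sub { return --$remaining > 0 });

print "always true: $keeps_running calls\n";
print "turns false: $stops_early calls\n";
```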

Note that 20 seconds is enough time to allow the call to ping to time out and return a value to the application. It is also an appropriate level of resolution for this kind of discovery activity: If information about the status of my gateway is at most 20 seconds old, then I'm happy. Your mileage may vary, of course.

Conclusion

It's easy to write Gnome applets in Perl. This simple example showed you the basic elements of Gnome programming, including the event model and callbacks. Go forth and hack your own applet!

This Week on p5p 2001/03/26



Notes

You can subscribe to an email version of this summary by sending an empty message to perl5-porters-digest-subscribe@netthink.co.uk.

Please send corrections and additions to perl-thisweek-YYYYMM@simon-cozens.org where YYYYMM is the current year and month. Changes and additions to the perl5-porters biographies are particularly welcome.

I looked at 263 messages in total this week.

glob(), Cwd, etc.

Benjamin Stugars started running amok over anything related to directories. Firstly, he looked at POSIX::getcwd and made it call the getcwd system call if available instead of doing `pwd`, avoiding a fork.

Next, he implemented Cwd in XS; however, this can't replace the current Cwd module, because miniperl needs to use it to install things. Instead, he had Cwd::fastcwd bootstrap the XS module.

Then he turned to glob, finding that, firstly, it gives a different sorting order (ASCII rather than case-insensitive alphabetical) from the old implementation which merely called out to csh. Gurusamy Sarathy fixed this up by adding an option for csh compatibility. Ben put together a test suite for it.

He also found that glob in a scalar context returns the last file found, rather than the number of files found, but this was deemed to be a feature.

Ben also cleaned up the docs for Cwd.pm, and made Encode work under warnings and stricture. Wow. A good week's work there; thanks, Ben.

use Errno is broken

Tim Jenness and I discovered a bug in h2xs. When you make an XS module, one of the things you might want to do is make a set of constants available to Perl, so that Perl programmers can call functions with the appropriate flags or whatever. The way h2xs lets you do this is by defining an autoloaded subroutine that calls the XS function constant. If constant returns a value, that's used. If it doesn't know anything about the constant in question, it sets errno to EINVAL. The autoload sub then checks whether $! contains "Invalid" (not very I18N-friendly) or whether $!{EINVAL} is set.

%! is a magical hash; it maps error constant names, like ENOBRANE, EBADF and so on, to true or false values, depending on the value of errno. This lets you do error checking both portably and in a locale-independent manner.
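A minimal sketch of checking %! after a failed system call, assuming a path that does not exist (the path is made up for illustration). Errno is loaded explicitly so the example does not depend on it being pulled in automatically.

```perl
use strict;
use warnings;
use Errno;    # makes %! available explicitly

# Hypothetical path that should not exist on any sane system.
my $missing = '/no/such/directory/no-such-file';

my $reason = 'opened';
if (!open(my $fh, '<', $missing)) {
    # Test the symbolic name rather than matching the message text in $!.
    $reason = $!{ENOENT} ? 'missing' : 'other error';
}
print "open result: $reason\n";
```

Testing $!{ENOENT} avoids both the numeric value of the constant, which varies between systems, and the message text, which varies with the locale.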

It's supposed to be activated by use Errno, but the stub modules provided by h2xs didn't actually use the module; I sent in a patch to make it use Errno.

However, Sarathy revealed that using %! in a Perl program should cause Errno to be used automagically - unfortunately, it didn't, so more hacking is required.

Lexical Warnings

Mark-Jason Dominus found that lexical warnings in some cases aren't really lexical. Specifically, a module turning warnings on can trigger a "variable used once" warning in the main program. He correctly pointed out that:

The documentation in warnings.pm says:

        the pragma setting will not leak across files (via `use',
        `require' or `do').

If it doesn't keep this promise, then there is little benefit to be had over using $^W

Paul Marquess, the lexical warning pumpking, said that this was a design decision, but couldn't remember why. After some hints from Sarathy, he started working on a patch. The new rules are to be:

A variable will be checked for the "use once" warnings if:
  1. It is in the scope of a use warnings 'once'
  2. It isn't in the scope of the warnings pragma at all AND $^W is set.

Otherwise it won't be checked at all.

Scalar repeat bug

Dominus found another interesting bug: scalar and the repeat operator x don't play nice together. For instance:

    print scalar ((1,2,3)x4) # Prints "123333"

Robin Houston came to the rescue again with this typically brilliant analysis:

Really it's a perl bug, and another that goes back at least to the earliest perl I have available (5.00404).

    perl -e 'print -((1,2)x2)'

will print 1-22. What happens is roughly:

  • the values (1,2) are put onto the stack
  • pp_repeat sees that it's in a scalar context, so it changes the 2 on the top of the stack to 22. The 1 remains beneath.
  • pp_negate negates the top of the stack, giving "-22".
  • so now the list (1, -22) is on the stack, and is printed.

The patch below makes pp_repeat drop the extraneous values, if its context is scalar but the OPpREPEAT_DOLIST flag is set.

open() trickery

Nick Ing-Simmons came bearing gifts: specifically, funky new forms of open. So far he has implemented the list form of pipes, so that:

    $pid = open($fh, "-|", tac => $file);

will run tac $file and pipe the output to the filehandle.
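A minimal sketch of the list form as it works in released perls on Unix-like systems. The running perl ($^X) stands in for tac so the example does not depend on an external program being installed; that substitution is an assumption made for portability.

```perl
use strict;
use warnings;

# List form of a pipe open: the command and its arguments are separate
# list elements, so no shell is involved. $^X (the running perl) stands
# in for an external program such as tac.
my $pid = open(my $fh, '-|', $^X, '-e', 'print "line $_\n" for 1 .. 3')
    or die "cannot start child: $!";

my @lines = <$fh>;
close($fh) or die "child failed: $! $?";

print "read ", scalar(@lines), " lines from pid $pid\n";
```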

Next up came duplicating filehandles and file descriptors. His examples:

     open(my $dup,"<&",$fh) # can now duplicate anonymous handles.
     open(my $num,"<&",42)  # Integers are considered file descriptors.

And also, something I know a lot of people have always wanted:

    open(my $fh, "<", \$string)

which reads from the string as if it was a file on disk.
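A short sketch of reading from an in-memory handle, using made-up sample data:

```perl
use strict;
use warnings;

my $string = "alpha\nbeta\ngamma\n";

# Opening a reference to a scalar reads from the string itself.
open(my $fh, '<', \$string) or die "cannot open in-memory handle: $!";

my @words;
while (my $line = <$fh>) {
    chomp $line;
    push @words, $line;
}
close($fh);

print join(',', @words), "\n";
```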

But no, he didn't stop there!

    open(my $tempfile, "+<", undef);

is going to give you an anonymous temporary file on systems that support it. He also suggested opening different files on the read and write halves of a filehandle. That's to say, you could copy files like this:

    open(my $fh, "<", $read_from, ">", $write_to);
    while (<$fh>) {
        print $fh $_;
    }

By this time I was positively squealing with excitement.

Russ Allbery suggested a brilliantly devious way of doing IO layers at the Perl level: allow

    open (my $fh, "<", \&coderef);

which would call a subroutine every time more data was needed.

Net::Ping

Colin McMillen had some suggestions (and, heavens above, an implementation) for some changes he wanted to make to Net::Ping. I'll quote his summary here, and you can read the details in his mail.

  1. Removal of alarm() function.
  2. Incorrect returns of false removed from TCP ping
  3. Removal of unneeded warning in the module's POD
  4. Creation of a new "external" protocol
  5. Creation of a new "auto" protocol
  6. Documentation update
  7. Change in the default ping protocol
  8. Allowing for user-specified port in TCP ping
  9. UDP ping fixes?

There was some discussion of fine points; specifically, whether one should necessarily return false when, for instance, a ping may be blocked by a firewall. Abigail came up with a neat solution using an overloaded value for the response; in the end, however, an object-oriented approach was taken. Sarathy suggested that on Windows, one can use system 1, ... to spawn a child process.

Colin took away the discussion and came back with an excellent set of patches, which got applied.

New modules in core

Jarkko sneaked in a few more modules to the core: Digest::MD5 and MIME::Base64 are in there now. There were a couple of teething problems as the tests for MIME::QuotedPrint were marked as binary files by Perforce, and Nick spotted an ASCII-ism in the code while working on yet another piece of PerlIO trickery. (He wanted

    use MIME::QuotedPrint;
    binmode(\*STDOUT,":Object(MIME::QuotedPrint)");
    print "Just my 2¢ on the MIME stuff \n",scalar( '_' x 80),"\n";

to produce

    Just my 2=A2 on the MIME stuff=20
    ___________________________________________________________________________=
    _____

. Uhm, yum, I think.)

YAPC Registration

Rich Lafferty announced that registration for the Third North American Yet Another Perl Conference was open. YAPC::America::North will be held at McGill University, Montreal, Quebec from Wednesday June 13th to Friday June 15th.

Don't delay, register today!

Various

I called for pack() recipes, since I'm trying to write a tutorial about the underused functions pack() and unpack(). If you have any neat things you do with pack(), please send them to me.

Peter Prymmer got 5.6.1 ready to go on OS/390, with updates to the documentation and test suites.

Chris Nandor got some more Mac portability patches in.

Radu Greab provided something I don't quite understand to work around socket brokenness in Linux.

Paul Johnson came up with some patches for mingw32, mainly to bring it closer towards Borland C-ness (or, alternatively, further away from VC++). Nick I-S queried this, as he thought mingw32 was trying to be more VC++ compatible, but it seemed to be the right thing.

Jarkko documented how to use Third Degree and Pixie, two memory leak and profiling tools, in perlhack.pod.

Tim Jenness went through the typemap file and found some oddities in it; as well as fixing them up, he came up with an XS module to test them.

He also found that naughty, naughty Compaq shipped the Perl library separately from the Perl binary with Digital U... uhm, I mean Tru64 v5.1. Unfortunately, perl -V needs Config.pm, which is in the "optional subset" containing the rest of the Perl library. Oops.

Until next week I remain, your humble and obedient servant,


Simon Cozens

DBI is OK


Table of Contents

Is Table Mutation a Big Problem?

Making Queries Easier

Placeholders

Binding Columns

Modules Built on DBI

A recent article on Perl.com recommended that most Perl programs use the DBIx::Recordset module as the standard database interface. The examples cast an unfavorable light on DBI (which DBIx::Recordset uses internally). While choosing an interface involves trade-offs, the venerable DBI module, used properly, is a fine choice. This response attempts to clear up some misconceptions and to demonstrate a few features that make DBI, by itself, powerful and attractive.

Since its inception in 1992, DBI has matured into a powerful and flexible module. It runs well on many platforms, includes drivers for most popular database systems and even supports flat files and virtual databases. Its popularity means it is well-tested and well-supported, and its modular design provides a consistent interface across the supported backends.

Is Table Mutation a Big Problem?

The previous article described an occurrence called "table mutation," where the structure of a table changes. Mr. Brannon says that the DBI does not handle this gracefully. One type of mutation is field reordering. For example, a table named 'user' may originally be:

        name    char(25)
        email   char(25)
        phone   char(11)

At some point, the developers will discover that the 'name' field is a poor primary key. If a second "John Smith" registers, the indexing scheme will fail. To ensure uniqueness, the coders could add an 'id' field using an auto-incremented integer or a globally unique identifier built into the database. The updated table might resemble:

        id      bigint
        name    char(25)
        email   char(25)
        phone   char(11)

Mr. Brannon contends that all code that generates a SQL statement that resolves to 'SELECT * FROM user' will break with this change. He is correct. Any database request that assumes, but does not specify, the order of results is susceptible.

The situation is not as bad as it seems. First, DBI's fetchrow_hashref() method returns results keyed on field names. Provided the existing field names do not change, code using this approach will continue to work. Unfortunately, this is less efficient than other fetching methods.

More importantly, explicitly specifying the desired fields leads to clearer and more secure code. It is easier to understand the purpose of code that selects the 'name' and 'email' fields from the 'user' table above than code that assumes an order of results that may change between databases. This can also improve performance by eliminating unnecessary data from a request. (The less work the database must do, the better. Why retrieve fields that won't be used?)

From a pragmatic approach, the example program will need to change when the 'id' field is added. The accessor code must use the new indexing approach. Whether the accessors continue to function in the face of this change is irrelevant -- someone must update the code!

The same arguments apply to destructive mutations, where someone deletes a field from a table. While less likely than adding a field, this can occur during prototyping. (Anyone who deletes a field in a production system will need an approved change plan, an extremely good excuse or a recent CV.) A change of this magnitude represents a change in business rules or program internals. Any system that can handle this reliably, without programmer intervention, is a candidate for Turing testing. It is false laziness to assume otherwise.

Various classes of programs will handle this differently. My preference is to die immediately, noisily alerting the maintainer. Other applications and problem domains might prefer to insert or to store potentially tainted data for cleansing later. It's even possible to store metadata as the first row of a table or to examine the table structure before inserting data.

Given the hopefully rare occurrence of these mutations and the wide range of options in handling them, the DBI does not enforce one solution over another. Contrary to the explanation in the prior article, this is not a failing of the DBI. (See a November 2000 PerlMonks discussion at http://www.perlmonks.org/index.pl?node_id=43748 for more detail.)

Making Queries Easier

Another of Mr. Brannon's disappointments with the DBI is that it provides no generalized mechanism to generate SQL statements automatically. This allows savvy users to write intricate queries by hand, while database neophytes can use modules to create their statements for them. The rest of us can choose between these approaches.

SQL statements are plain text, easily manipulated with Perl. An example from the previous article created an INSERT statement with multiple fields and a hash containing insertable data. Where the example was tedious and hard to maintain, a bit of editing makes it general and powerful enough to become a wrapper function. Luckily, the source hash keys correspond to the destination database fields. It takes only a few lines of code and two clever idioms to produce a sane and generalized function to insert data into a table.

        my $table = 'uregisternew';
        my @fields = qw( country firstname lastname userid password address1 city 
                state province zippostal email phone favorites remaddr gender income 
                dob occupation age );
        my $fields = join(', ', @fields);
        my $values = join(', ', map { $dbh->quote($_) } @formdata{@fields});
        my $sql = "INSERT into $table ($fields) values ($values)";
        my $sth = $dbh->prepare($sql);
        $sth->execute();
        $sth->finish();

We'll assume that %formdata has been declared and contains data already. We've already created a database handle, stored in $dbh, and it has the RaiseError attribute set. The first two lines declare the database table to use and the fields into which to insert data. These could just as well come from function arguments.

The join() lines transform lists of fields and values into string snippets used in the SQL statement. The map block simply runs each value through the DBI's quote() method, quoting special characters appropriately. Don't quote the fields, as they'll be treated as literals and will be returned directly. (Be sure to check the DBD module for your chosen database for other notes regarding quote().)

The only tricky construct is @formdata{@fields}. This odd fellow is known as a hash slice. Just as you can access a single value with a scalar ($formdata{$key}), you can access a list of values with a list of keys. Not only does this reduce the code that builds $values, using the same list in @fields ensures that the field names and the values appear in the same order.
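The hash slice can be tried on its own, without a database. The form data below is made up for illustration; the extra key is there to show that the slice picks out only the requested fields.

```perl
use strict;
use warnings;

# Hypothetical form data; the extra key shows that a slice picks only
# the requested fields, in the requested order.
my %formdata = (
    email => 'jsmith@example.com',
    name  => 'John Smith',
    phone => '5551234',
    junk  => 'ignored',
);
my @fields = qw( name email phone );

my @values = @formdata{@fields};    # hash slice: a list of values

for my $i (0 .. $#fields) {
    print "$fields[$i] = $values[$i]\n";
}
```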

Placeholders

A relational database must parse each new statement, preparing the query. (This occurs when a program calls the prepare() method.) High-end systems often run a query analyzer to choose the most efficient path. Because many queries are repeated, some databases cache prepared queries.

DBI can take advantage of this with placeholders (also known as 'bind values'). This is especially handy when inserting multiple rows. Instead of interpolating each new row into a unique statement and forcing the database to prepare a new statement each time, adding placeholders to an INSERT statement allows us to prepare the statement once, looping around the execute() method.

        my $fields = join(', ', @fields);
        my $places = join(', ', ('?') x @fields);
        my $sql = "INSERT into $table ($fields) values ($places)";
        my $sth = $dbh->prepare($sql);
        $sth->execute(@formdata{@fields});
        $sth->finish();

Given @fields containing 'name', 'phone', and 'email', the generated statement will be:

        INSERT into users (name, phone, email) values (?, ?, ?)

Each time we call execute() on the statement handle, we need to pass the appropriate values in the correct order. Again, a hash slice comes in handy. Note that DBI automatically quotes values with this technique.

This example only inserts one row, but it could easily be adapted to loop over a data source, repeatedly calling execute(). While it takes slightly more code than interpolating values into a statement and calling do(), the code is much more robust. Additionally, preparing the statement only once confers a substantial performance benefit. Best of all, it's not limited to INSERT statements. Consult the DBI documentation for more details.
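The string-building half of this technique runs without a database, so it can be checked in isolation. In this sketch the rows are made up, and no DBI call is made: the comment marks where a single prepare() and repeated execute() calls would go.

```perl
use strict;
use warnings;

my $table  = 'users';
my @fields = qw( name phone email );

# Built once, whatever the number of rows.
my $places = join(', ', ('?') x @fields);
my $sql    = "INSERT into $table (" . join(', ', @fields) . ") values ($places)";

# Hypothetical rows; with DBI, each list below would be passed to
# $sth->execute() after a single $dbh->prepare($sql).
my @rows = (
    { name => 'Ann', phone => '5550001', email => 'ann@example.com' },
    { name => 'Bob', phone => '5550002', email => 'bob@example.com' },
);
my @bind_lists = map { [ @{$_}{@fields} ] } @rows;

print "$sql\n";
print join(' | ', @$_), "\n" for @bind_lists;
```

Because both the column list and every bind list are derived from the same @fields array, the values cannot drift out of step with the placeholders.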

Binding Columns

In a similar vein, the DBI also supports a supremely useful feature called 'binding columns.' Instead of returning a list of row elements, the DBI stores the values in bound scalars. This is very fast, as it avoids copying returned values, and can simplify code greatly. From the programmer's side, it resembles placeholders, but it is a function of the DBI, not the underlying database.

Binding columns is best illustrated by an example. Here, we loop through all rows of the user table, displaying names and e-mail addresses:

        my $sql = "SELECT name, email FROM users";
        my $sth = $dbh->prepare($sql);
        $sth->execute();
        my ($name, $email);
        $sth->bind_columns(\$name, \$email);
        while ($sth->fetch()) {
                print "$name <$email>\n";
        }
        $sth->finish();

With each call to fetch(), $name and $email will be updated with the appropriate values for the current row. This code does have the flaw of depending on field ordering hardcoded in the SQL statement. Instead of giving up on this flexibility and speed, we'll use the list-based approach with a hash slice:

        my $table = 'users';
        my @fields = qw( name email );
        my %results;
        my $fields = join(', ', @fields);
        my $sth = $dbh->prepare("SELECT $fields FROM $table");
        $sth->execute();
        @results{@fields} = ();
        $sth->bind_columns(map { \$results{$_} } @fields);
        while ($sth->fetch()) {
                print "$results{name} <$results{email}>\n";
        }
        $sth->finish();

It only takes two lines of magic to bind hash values to the result set. After declaring the hash, we slice %results with @fields to initialize the keys we'll use. Their initial value (undef) doesn't matter, as it is only necessary that they exist. The map block in the bind_columns() call creates a reference to the hash value associated with each key in @fields. (This is the only required step of the example, but the value initialization in the previous line makes it more clear.)
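What those two lines set up can be seen without a database. In this sketch a plain assignment through the references stands in for what fetch() does with bound columns; the row data is made up.

```perl
use strict;
use warnings;

my @fields = qw( name email );
my %results;
@results{@fields} = ();                        # create the keys
my @bound = map { \$results{$_} } @fields;     # one reference per field

# Stand-in for what fetch() does with bound columns: write each column
# of the current row through the matching reference.
my @row = ('Ann', 'ann@example.com');
${ $bound[$_] } = $row[$_] for 0 .. $#row;

print "$results{name} <$results{email}>\n";
```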

If we only display names and addresses, this is no improvement over binding simple lexicals. The real power comes with more complicated tasks. This technique may be used in a function:

        sub bind_hash {
                my ($table, @fields) = @_;
                my $sql = 'SELECT ' . join(', ', @fields) . " FROM $table";
                my $sth = $dbh->prepare($sql);
                $sth->execute();
                my %results;
                @results{@fields} = ();
                $sth->bind_columns(map { \$results{$_} } @fields);
                return (\%results, sub { $sth->fetch() });
        }

Calling code could resemble:

        my ($res, $fetch) = bind_hash('users', qw( name email ));
        while ($fetch->()) {
                print "$res->{name} <$res->{email}>\n";
        }

Other options include passing in references to populate or returning an object that has a fetch() method of its own.

Modules Built on DBI

The decision to use one module over another depends on many factors. For certain classes of applications, the nuts and bolts of the underlying database structure are less important than ease of use or rapid development. Some coders may prefer a higher level of abstraction to hide tedious details for simple requirements. The drawbacks are lessened flexibility and slower access.

It is up to the programmer to analyze each situation, choosing the appropriate approach. Perl itself encourages this. As mentioned above, DBI does not enforce any behavior of SQL statement generation or data retrieval. When the techniques presented here are too onerous and using a module such as Tangram or DBIx::Recordset makes the job easier and more enjoyable, do not be afraid to use them. Conversely, a bit of planning ahead and abstraction can provide the flexibility needed for many other applications. There is no single best solution, but Perl and the CPAN provide many workable options, including the DBI.

chromatic is the author of Modern Perl. In his spare time, he has been working on helping novices understand stocks and investing.

This Week on p5p 2001/03/19



Notes

There were just under 300 messages this week.

5.6.1-TRIAL3 is out there

Just before going to press, Sarathy pushed trial 3 out of the door; get it from $CPAN/authors/id/G/GS/GSAR/perl-5.6.1-TRIAL3.tar.gz.

Please test it as widely as possible, run your favourite bugs through it, and so on. If you're not one of the smoke testers, now would be a good time to start; subscribe to daily-build@perl.org and get yourself a copy of SmokingJacket - that would be a great help for us.

Robin Houston vs. goto

Why anyone wants to maintain goto is quite beyond me, but Robin Houston's been doing it rather well this week. The first bug he fixed was to do with lexical variables losing their values when a goto happens in their scope:

    for ($i=1; $i<=1; $i++) {
        my $var='val';
        print "before \$var='$var'\n";
        goto jump;
        jump:
        print  "after \$var='$var'\n";
    }

Here's his analysis:

Wow! This is a venerable bug, dating at least back to 5.00404.

When a for(;;) is compiled, there's no nextstate immediately before the enterloop, and so when pp_goto does this:

            case CXt_LOOP:
                gotoprobe = cx->blk_oldcop->op_sibling;
                break;

gotoprobe ends up pointing into a dead end.

That means that the label doesn't get found until the next context up is searched; and so the loop context is left and re-entered when the goto is executed.

Next, he fixed a bug which meant that

    eval { goto foo; };
    die "OK\n";
    foreach $i (1,2,3) {
        foo: die "Inconclusive\n";
    }

would not get caught by the eval. This was because goto failed to pop the stack frames as it left an eval.

For his hat-trick, he made goto work in nested eval structures.

my $foo if 0

One of the (many) perl5-porters dirty secrets is that if you say

    sub marine {
        my $scope = "up" if 0;
        return $scope++;
    }
    print marine(),"\n" for 0..10;

you get the lexical variable $scope increasing from 0 to 10, instead of being reinitialised every time. In effect, you get a static variable. Very nice.
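For comparison, the same counter can be had in a supported way by closing over a lexical declared in an enclosing block. This is only a sketch: the enclosing block and the name marine_counter are illustrative choices, not something proposed on the list.

```perl
use strict;
use warnings;

# The lexical lives in the enclosing block; the named sub closes over it,
# so its value persists from one call to the next.
{
    my $scope = 0;

    sub marine_counter {
        return $scope++;
    }
}

my @seen = map { marine_counter() } 0 .. 10;
print "@seen\n";
```

Unlike the accidental version, nothing here depends on what the optimiser does with a dead conditional.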

Of course, this my ... if 0 behaviour is completely undocumented and you'd be foolish to rely on it, but it generates no warnings and passes use strict happily. It's come about completely by accident, since the optimiser optimises away the initialisation. David Madison claimed it was a bug, but most people seem to think of it as an unexploded feature. Jarkko will be adding lexical warnings to mark it as deprecated.

Johan also pointed out that open with more than 3 arguments doesn't currently pass the later arguments on to the program being called: his spectacular example was

    open (FH, "-|", "deleteallfiles", "-tempfilesonly")

which deletes all files, not just the temp files. Nick said it was on his list of things to do, and called for patches.
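The intended behaviour of the list form can be sketched as follows. This assumes a perl in which list-form pipe opens are implemented (a later release than the ones discussed here) and a Unix-like platform. The child is simply the running perl, found via $^X, echoing its arguments, chosen so that the example is harmless.

```perl
use strict;
use warnings;

# Run a child with an explicit argument list (no shell involved) and read
# back what the child saw in @ARGV. The '--' stops the child perl from
# treating '-tempfilesonly' as one of its own switches.
my @cmd = ($^X, '-e', 'print join("|", @ARGV)', '--', '-tempfilesonly', 'a b');

open(my $fh, '-|', @cmd) or die "cannot run child: $!";
my $seen = do { local $/; <$fh> };
close($fh) or die "child failed: $?";

print "child received: $seen\n";
```

Because no shell is involved, the argument containing a space arrives as a single element.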

More POD Nits

Michael Stevens produced some more POD patches this week, and the discussion about how strict podchecker can be continued from last week, spurred on by Jarkko sending Michael's patches to the maintainers of core modules. Ilya, Sarathy and others grumbled about podchecker's insistence on

    =over 4

instead of just `plain'

    =over

Michael patched podchecker to remove that warning, although Jarkko was impressed that perl5-porters had managed to degenerate to squabbling over 2 bytes of documentation. A new low. Not content, Ilya managed to get it down to 1 byte, by grouching about

    L<foo
     bar>

being an error where

    L<foo bar>

was not. Michael patched podchecker to remove that warning too, which has the nice result that you can reformat paragraphs in your POD (with Alt-Q or gqap, depending on religion) without worrying about it breaking the semantics.

More on the reset bug

Jarkko reported that Sarathy's patch to fix the reset bug of a couple of weeks ago still produced erratic results on various platforms. Alan found the heap corruption in Purify, and reported that "this bug is unchanged since the last discussion of it on the list a couple of weeks back." (But noted that the anon sub leak is finally fixed!) Jarkko pointed out that this was because Sarathy's patch wasn't applied to the repository, and that the problem was that the linked list of pattern match operators was getting corrupted. That's to say, when you have a regular expression like /.a\d/ it contains a list of nodes: any character, a literal a, any digit. For some reason, when the regular expression was being freed from memory, one of the nodes, let's say the literal a, was pointing not to an "any digit" node, but to West hyperspace.

Radu Greab zeroed in on the problem: the regular expression was being cleared twice. He provided the beginnings of a patch; Sarathy suggested it was on the right lines, but wanted to reference count regular expression nodes. His version of the patch caused all sorts of problems, and Radu came up with something else to change the regular expression node list into a doubly-linked list. That was still imperfect, and Sarathy wanted to know why his patch caused all sorts of problems. Jarkko unfortunately ran out of time to do some debugging on this.

So it's still out there, and a very, very weird bug it is too. Something is going very wrong with the freeing of PMOPs. We've only seen it on Linux and Tru64 when the -DDEBUGGING and -g flags are on, so far. Stranger and stranger.

Distributive arrow operator

David Lloyd suggested that the arrow operator should be distributive, so that:

    ($foo, $bar)->baz();

does

    $foo->baz(); $bar->baz();

Not a bad idea, although various people pulled out the old "would break old code" mainstay.

He developed his idea, suggesting that

    ([1,2], [3,4])->[0]

should return 1,3, and also that

    ($arrayref)->[0..15]

should return

    map { $arrayref->[$_] } 0..15

I thought this was beautiful, and tossed it over to the perl6-language list, together with the natural extension of

    @a = ($foo, $bar) . $baz     # @a = map { $_.$baz } ($foo, $bar)

which reminded MJD of APL.
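For reference, each of the proposed forms can already be spelled out with map or a slice. The sketch below is only an illustration: the Thing class and its baz method are hypothetical stand-ins, and plain strings are used for the concatenation case.

```perl
use strict;
use warnings;

# Hypothetical class, present only so there is something to call baz() on.
package Thing;
sub new { my ($class, $name) = @_; return bless { name => $name }, $class }
sub baz { my ($self) = @_; return "baz($self->{name})" }

package main;

my ($foo, $bar) = (Thing->new('foo'), Thing->new('bar'));

# ($foo, $bar)->baz() would be shorthand for:
my @called = map { $_->baz() } ($foo, $bar);

# ([1,2], [3,4])->[0] would be shorthand for:
my @firsts = map { $_->[0] } ([1, 2], [3, 4]);

# ($arrayref)->[0..15] is what an array slice already gives us:
my $arrayref = [ 'a' .. 'p' ];
my @slice    = @{$arrayref}[0 .. 15];

# ($x, $y) . $baz would be shorthand for:
my $baz    = '!';
my @joined = map { $_ . $baz } ('x', 'y');

print "@called\n@firsts\n", scalar(@slice), "\n@joined\n";
```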

Various

I made a misattribution last week, saying that Philip Newton provided a documentation patch for use integer. Although Philip does some storming work, that wasn't one of his - it belonged to Robert Spier. Sorry, Robert. John Allen produced another one this week.

Ilya made perldoc work on OS/2. John Allen suggested that we should use the subroutine attribute syntax to extend builtins: time :milli() for hi-res time, for instance. Nothing happened.

Charles Lane (he of the weird email address) produced a fix to tidy up the test suite a little; tests should use $^X instead of ./perl to keep VMS and other dinosaurs happy.
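As a rough illustration of the idea (not taken from Charles's patch), a test that needs a child interpreter can ask $^X for the path of the perl that is running it. The command being run here is an arbitrary example.

```perl
use strict;
use warnings;

# $^X holds the path of the perl binary running this script, so a test can
# start a child interpreter without assuming that ./perl exists.
my $output = qx{"$^X" -e "print 6*7"};

print "child perl said: $output\n";
```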

Kurt Starsinic has been patching up some of the standard utilities to be warnings and strict clean; so far we've seen his patch to h2ph.

Merijn found loads and loads of bugs in the 5.6.1 trials on AIX and HP/UX, which Sarathy ran around cleaning up. Good work, both.

Brian Ingerson has, sadly, decided that his excellent Inline module is not going to be mature enough for 5.8.0. Still, I heartily hope something like it ships with Perl 6. Have you played with Inline yet? No? Go and play with Inline immediately, and until next week I remain, your humble and obedient servant,


Simon Cozens

Creating Modular Web Pages With EmbPerl


Table of Contents

Getting Started

Hello World

Web Site Global Variables

Modular Files

Modular File Inheritance

Subroutines in EmbperlObject

Conclusions

This tutorial is intended as a complement to the Embperl documentation, not a replacement. We assume a basic familiarity with Apache, mod_perl and Perl, and the Embperl documentation. No prior experience with EmbperlObject is assumed. The real purpose is to give a clearer idea of how EmbperlObject can help you build large Web sites. We give example code that can serve as a starting template and hints about the best practices that have come out of real experience using the toolkit. As always, there is more than one way to do it!

Since EmbperlObject is an evolving tool, it is likely that these design patterns will evolve over time, and it is recommended that the reader check back on the Embperl Web site for new versions.

Motivation: Constructing Modular Web Sites

Embperl is a tool that allows you to embed Perl code in your HTML documents. As such, it could handle just about everything you need to do with your Web site. So what is the point of EmbperlObject? What does it give us that we don't already get with basic Embperl?

As often seems to be the case with Perl, the answer has to do with laziness. We would all like the task of building Web sites to be as simple as possible. Anyone who has had to build a non-trivial site using pure HTML will have quickly experienced the irritation of having to copy-and-paste common code between documents - stuff like navigation bars and table formats. We have probably all wished for an ``include'' HTML tag. EmbperlObject goes a long way toward solving this problem, without requiring the developer to resort to a lot of customized Perl code.

In a nutshell, EmbperlObject extends Embperl by enabling the construction of Web sites in a modular, or object-oriented, fashion. I am using the term ``object-oriented'' (OO) loosely here in the context of inheritance and overloading, but you don't really need to know anything about the OO paradigm to benefit from EmbperlObject. As you will see from this short tutorial, it is possible to benefit from using EmbperlObject with even a minimal knowledge of Perl. With just a little instruction, in fact, pure HTML coders can use it to improve their Web site architecture. Having said that, however, EmbperlObject also provides for more advanced OO functionality, as we'll see later.

Getting Started

We'll assume that you've successfully installed the latest Apache, mod_perl and Embperl on your system. That should all be relatively painless - problems normally occur when mixing older versions of one tool with later versions of another. If you can, try to download the latest versions of everything.

Having done all that, you might want to get going with configuring a Web site. The first thing you need to do is set up the Apache config file, usually called httpd.conf.

Configuring httpd.conf

The following is an example configuration for a single virtual host to use EmbperlObject. There are, as usual, different ways to do this; but if you are starting from scratch, then it may be useful as a template. It works with the later versions of Apache (1.3.6 and up). Obviously, substitute your own IP address and domain name.

        NameVirtualHost 10.1.1.3:80
		
        <VirtualHost 10.1.1.3:80>
                ServerName www.mydomain.com
                ServerAdmin webmaster@mydomain.com
                DocumentRoot /www/mydomain/com/htdocs
                DirectoryIndex index.html
                ErrorLog /www/mydomain/com/logs/error_log
                TransferLog /www/mydomain/com/logs/access_log
                PerlSetEnv EMBPERL_ESCMODE 0
                PerlSetEnv EMBPERL_OPTIONS 16
                PerlSetEnv EMBPERL_MAILHOST mail.mydomain.com
                PerlSetEnv EMBPERL_OBJECT_BASE base.epl
                PerlSetEnv EMBPERL_OBJECT_FALLBACK notfound.html
                PerlSetEnv EMBPERL_DEBUG 0
        </VirtualHost>
		
        # Set EmbPerl handler for main directory
        <Directory "/www/mydomain/com/htdocs/">
                <FilesMatch ".*\.html$">
                        SetHandler  perl-script
                        PerlHandler HTML::EmbperlObject
                        Options     ExecCGI
                </FilesMatch>
                <FilesMatch ".*\.epl$">
                        Order allow,deny
                        Deny From all
                </FilesMatch>
        </Directory>

Note that you could change the .html file extension in the FilesMatch directive; this is a personal preference issue. Personally, I use .html for the main document files because I can edit files using my favorite editor (emacs) and it will automatically load html mode. Plus, this may be a minor thing - but using .html rather than a special extension such as .epl adds a small amount of security to your site since it provides no clue that the Web site is using Embperl. If you're careful about the handling of error messages, then there never will be any indication of this. These days, the less the script kiddies can deduce about you, the better ...

Also, note that we have added a second FilesMatch directive, which denies direct access to files with .epl extensions (again, you could change this extension to another if you like, for example, .obj). This can be helpful for cases where you have Embperl files that contain fragments of code or HTML; you want those files to be in the Apache document tree, but you don't want people to be able to request them directly - these files should only be included directly into other documents from within Embperl, using Execute(). This is really a security issue. In the following examples, we name files that are not intended to be requested directly with the .epl extension. Files that are intended to be directly requested are named with the standard .html extension. This can also be helpful when scanning a directory to see which are the main document files and which are the modules. Finally, note that using the Apache FilesMatch directive to restrict access does not prevent us from accessing these files (via Execute) in Embperl.

So how does all this translate into a real Web site? Let's look at the classic example, Hello World.

Hello World

The file specified by the EMBPERL_OBJECT_BASE Apache directive (usually called base.epl) is the lynchpin of how EmbperlObject operates. Whenever a request comes for any page on this Web site, Embperl will look for base.epl - first in the same directory as the request, and if it's not found there, then working up the directory tree to the root directory of the Web site. For example, if a request comes for http://www.yoursite.com/foo/bar/file.html, then Embperl first looks for /foo/bar/base.epl. If it doesn't find base.epl there, then it looks in /foo/base.epl. If there's still no luck, then it finally looks in /base.epl. (These paths are all relative to the document root for the Web site). What is the point of all this?

In a nutshell, base.epl is a template for giving a common look-and-feel to your Web pages. This file is what is used to build the response to any request, regardless of the actual filename that was requested. So even if file.html was requested, base.epl is what is actually executed. base.epl is a normal file containing valid HTML mixed with Perl code, but with a few small differences. Here's a simple 'Hello World' example of this approach:

/base.epl

        <HTML>
        <HEAD>
                <TITLE>Some title</TITLE>
        </HEAD>
        <BODY>
        Joe's Website
        <P>
        [- Execute ('*') -]
        </BODY>
        </HTML>

/hello.html

        Hello world!

Now, if the file http://www.yoursite.com/hello.html is requested, then base.epl is what will get executed initially. So where does the file hello.html come into the picture? Well, the key is the '*' parameter in the call to Execute(). '*' is a special filename, only used in base.epl. It means, literally, ``the filename that was actually requested.''

What you will see if you try this example is something like this:

        Joe's Website
        Hello world!

As you can see here, the text ``Joe's Website'' is from base.epl and the ``Hello world!'' is from hello.html.

This architecture also means that only base.epl has to have the boilerplate code that each HTML file normally needs to contain - namely the <HTML>, <BODY>, </HTML> tags and so on. Since the '*' file is simply inserted into the code, all it needs to contain is the actual content that is specific to that file. Nothing else is necessary, because base.epl has all the standard HTML trappings. Of course, you'll probably have more interesting content, but you get the point.
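The search order for base.epl described at the start of this section can be sketched in plain Perl. This is not Embperl's own code, just an illustration of the walk up the directory tree; the helper name base_candidates is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: list the places base.epl would be looked for, nearest
# directory first, given a request path relative to the document root.
sub base_candidates {
    my ($request_path, $base_name) = @_;

    my @dirs = grep { length } split m{/}, $request_path;
    pop @dirs;                          # drop the requested file name

    my @candidates;
    while (1) {
        push @candidates, join('/', '', @dirs, $base_name);
        last unless @dirs;              # we have reached the document root
        pop @dirs;                      # otherwise move up one directory
    }
    return @candidates;
}

print "$_\n" for base_candidates('/foo/bar/file.html', 'base.epl');
```

For the request used in the text, this lists /foo/bar/base.epl, then /foo/base.epl, then /base.epl; the first one that exists would be used.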

Web Site Global Variables

Let's look at a more interesting example. When you create Perl variables in Embperl usually their scope is the current file; so they are effectively ``local'' to that file. When you split your Web site into modules, however, it quickly becomes apparent that it is useful to have variables that are global to the Web site, i.e., shared between multiple files.

To achieve this, EmbperlObject has a special object that is automatically passed to each page as it is executed. This object is usually referred to as the ``Request'' object, because we get one of these objects created for each document request that the Web server receives. This object is passed in on the stack, so you can retrieve it using the Perl ``shift'' statement. This object is also automatically destroyed after the request, so the Request object cannot be used to store data between requests. The idea is that you can store variables that are local to the current request, and shared between all documents on the current Web site; plus, as we'll see later, we can also use it to call object methods. For example, let's say you set up some variables in base.epl, and then use them in file.html:

/base.epl

        <HTML>
        <HEAD>
                <TITLE>Some title</TITLE>
        </HEAD>
        [- 
                $req = shift;
                $req->{webmaster} = 'John Smith'
        -]
        <BODY>
        [- Execute ('*') -]
        </BODY>
        </HTML>

/file.html

        [- $req = shift -]
        Please send all suggestions to [+ $req->{webmaster} +].

You can see that EmbperlObject is allowing us to set up global variables in one place and share them throughout the Web site. If you place base.epl in the root document directory, you can have any number of other files in this and subdirectories, and they will all get these variables whenever they are executed. No matter which file is requested, /base.epl is executed first and then the requested file.

You don't even need to include the requested '*' file, but typically you will want to - it would be a bit odd to ignore the requested file!
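Stripped of the Embperl syntax, the Request object pattern is just one hash reference handed to every piece of code that takes part in a request. The following plain-Perl sketch mimics the two files above with ordinary subroutines; the names base_epl and file_html are stand-ins invented for the example.

```perl
use strict;
use warnings;

# Stand-in for /file.html: receives the request hash as its first argument.
sub file_html {
    my $req = shift;
    return "Please send all suggestions to $req->{webmaster}.";
}

# Stand-in for /base.epl: fills in shared values, then runs the requested page.
sub base_epl {
    my $req = shift;
    $req->{webmaster} = 'John Smith';
    return file_html($req);
}

# One fresh hash per request, so nothing carries over between requests.
my $page = base_epl({});
print "$page\n";
```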

Modular Files

The previous example is nice; it demonstrates the basic ability to have Web site-wide variables set up in base.epl and then automatically have them shared by all other files. Leading on from this, we probably want to split up our files, for both maintainability and readability. For example, a non-trivial Web site will probably define some Web site-wide constants, perhaps some global variables, and maybe also have some kind of initialization code that has to be executed for each page (e.g. setting up a database connection). We could put all of this in base.epl, but this file would quickly begin to look messy. It would be nice to split this stuff out into other files. For example:

/base.epl

        <HTML>
        [- Execute ('constants.epl')-]
        [- Execute ('init.epl')-]
        <HEAD>
                <TITLE>Some title</TITLE>
        </HEAD>
        <BODY>
        [- Execute ('*') -]
        </BODY>
        [- Execute ('cleanup.epl') -]
        </HTML>

/constants.epl

        [-
                $req = shift;
                $req->{bgcolor} = "white";
                $req->{webmaster} = "John Smith";
                $req->{website_database} = "mydatabase";
        -]

/init.epl

        [-
                $req = shift;
                # Set up database connection
                use DBI;
                use CGI qw(:standard);
                $dsn = "DBI:mysql:$req->{website_database}";
                $req->{dbh} = DBI->connect ($dsn);
        -]

/cleanup.epl

        [-
                $req = shift;
                # Close down database connection
                $req->{dbh}->disconnect();
        -]

You can see how this would be useful, since each page on your site now has a database connection available in $req->{dbh}. Also notice that we have a cleanup.epl file that is always executed at the end - this is useful for cleaning up, shutting down connections and so on.

Modular File Inheritance

To recap, we have seen how we can break our site into modules that are common across multiple files, because they are automatically included by base.epl. Inheritance is a way in which we can make our Web sites more modular.

Although the concept of inheritance is one that stems from the object-oriented paradigm, you really don't need to be an OO guru to understand it. We will demonstrate the concept through a simple example.

Say you wanted different parts of your Web site to have different <TITLE> tags. You could set the title in each page manually, but if you had a number of different pages in each section, then this would quickly get tiresome. We could split off the <HEAD> section into its own file, just like constants.epl and init.epl, right? But so far, it looks like we are stuck with a single head.epl file for the entire Web site, which doesn't really help much.

The answer lies in subdirectories. This is the key to unlocking inheritance and one of the most powerful features of EmbperlObject. You may use subdirectories currently in your Web site design, maybe for purposes of organization and maintenance. But here, subdirectories actually enable you to override files from upper directories. This is best demonstrated by example (simplified to make this specific point clearer - assume constants.epl, init.epl and cleanup.epl are the same as in the previous example):

/base.epl

        <HTML>
        [- Execute ('constants.epl')-]
        [- Execute ('init.epl')-]
        <HEAD>
        [- Execute ('head.epl')-]
        </HEAD>
        <BODY>
        [- Execute ('*') -]
        </BODY>
        [- Execute ('cleanup.epl') -]
        </HTML>

/head.epl

        <TITLE>Joe's Website</TITLE>

/contact/head.epl

        <TITLE>Contacting Joe</TITLE>

Assume that we have an index.html file in each directory that does something useful. The main thing to focus on is head.epl. You can see that we have one instance of this file in the root directory, and one in a subdirectory, namely /contact/head.epl. Here's the neat part: When a page is requested from your Web site, EmbperlObject will search automatically for base.epl first in the same directory as the requested page. If it doesn't find it there, then it tracks back up the directory tree until it finds the file. But then, when executing base.epl, any files that are Executed (such as head.epl) are first looked for in the original directory of the requested file. Again, if the file is not found there, then EmbperlObject tracks back up the directory tree.

So what does this mean? Well, if we have a subdirectory, then we can choose to have just the usual index.html file there and nothing else. In that case, all the files included by base.epl will be found in the root document directory. But if we redefine head.epl in the subdirectory, then EmbperlObject will pick up that version of the file whenever we are in the /contact/ subdirectory.

That is inheritance in action. In a nutshell, subdirectories inherit files such as head.epl, constants.epl and so on from ``parent'' directories. But if we want, we can redefine any of these files in our subdirectories, thus specializing that functionality for that part of our Web site. If we had 20 .html files in /contact/, then loading any of them would automatically get /contact/head.epl.
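The override rule can be modelled in a few lines of plain Perl. This is an illustration of the lookup order only, not Embperl's implementation; the %exists hash is a hypothetical site layout matching the example above, and resolve is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical site layout: the files that exist under the document root.
my %exists = map { $_ => 1 } qw(
    /base.epl /head.epl /init.epl /index.html
    /contact/head.epl /contact/index.html
);

# Resolve a file name, starting in the requested page's directory and
# walking up towards the document root until a match is found.
sub resolve {
    my ($request_dir, $name) = @_;
    my @dirs = grep { length } split m{/}, $request_dir;

    while (1) {
        my $path = join('/', '', @dirs, $name);
        return $path if $exists{$path};
        return unless @dirs;            # nothing found, even at the root
        pop @dirs;
    }
}

print resolve('/contact/', 'head.epl'), "\n";   # overridden in /contact/
print resolve('/contact/', 'init.epl'), "\n";   # inherited from the root
print resolve('/',         'head.epl'), "\n";   # root version
```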

This is all very cool, but there is one more wrinkle. Let's say we want to redefine init.epl, because there is some initialization that is specific to the /contact/ subdirectory. That's fine since we can create /contact/init.epl and that file would be loaded instead of /init.epl whenever a file is requested from the /contact/ subdir. But this also means that the initialization code that is in /init.epl would never get executed, right? That's bad, because the base version of the file does a lot of useful set up. The answer is simple: For cases such as this, we need to make sure to call the parent version of the file at the start. For example:

/contact/init.epl

        [- Execute ('../init.epl') -]
        [-
                # Do some setup specific to this subdirectory
        -]

You can see that the first thing we do here is Execute the parent version of the file (i.e., the one in the immediate parent directory). Thus we can ensure the integrity of the basic initialization that each page should receive.

EmbperlObject is smart about this process. For example, suppose we have several levels of subdirectory and only redefine init.epl in one of the deeper levels, say /sub/sub/sub/init.epl. Now, if this file tries to Execute ../init.epl, there may not be any such file in the immediate parent directory - so EmbperlObject automatically tracks back up the directories until it finds the base version, /init.epl. So, for any subdirectory level in your Web site, you only have to redefine those files that are specific to this particular area. This results in a much cleaner Web site.

You can break up your files into whatever level of granularity you want, depending on your needs. For instance, instead of just head.epl you might break it down into title.epl, metatags.epl and so on. It's up to you. The more you split it up, the more you can specialize in each of the subdirectories. There is a balance, however, because splitting things up too much results in an overly fragmented site that can be harder to maintain. Moderation is the key - only split out files if they contain a substantial chunk of code, or if you know that you need to redefine them in subdirectories, generally speaking.

Subroutines in EmbperlObject

There are two types of inheritance in EmbperlObject. The first is the one that we described in the previous section, i.e., inheritance of modular files via the directory hierarchy. The other type, which is closely related, is the inheritance of subroutines (both pure Perl and Embperl). In this context, subroutines are really object methods, as we'll see below. As you are probably already aware, there are two types of subroutines in Embperl, for example:

        [!
                sub perl_sub
                {
                        # Some perl code
                }
        !]
		
        [$ sub embperl_sub $]
                Some HTML
        [$ endsub $]

In EmbperlObject, subroutines become object methods; the difference is that you always call an object method through an object reference. For example, instead of a straight subroutine call like this:

        foo();

We have instead a call through some object:

        $obj->foo();

EmbperlObject allows you to inherit object methods in much the same way as files. Because of the way that Perl implements objects and methods, there is just a little extra consideration needed. (Note: This is not really a good place to introduce Perl's object functionality. If you're not comfortable with inheritance, @ISA and object methods, then I suggest you take a look at the book ``Programming Perl'' (O'Reilly) or ``Object Oriented Perl'' by Damian Conway (Manning).)

A simple use of methods can be demonstrated using the following example:

/base.epl

        [! sub title {"Joe's Website"} !]
        [- $req = shift -]
        <HTML>
        <HEAD>
        <TITLE>[+ $req->title() +]</TITLE>
        </HEAD>
        </HTML>

/contact/index.html

        [! sub title {'Contacting Joe'} !]
        [- $req = shift -]
        <HTML>
                A contact form goes here
        </HTML>

This is an alternative way of implementing the previous ``contact'' example, which still uses inheritance - but instead of placing the <TITLE> tag in a separate file (head.epl), we use a method (title()). You can see that we define this method in /base.epl, so any page that is requested from the root directory will get the title ``Joe's Website.'' This is a good default title. Then, in /contact/index.html we redefine the title() method to return ``Contacting Joe.'' Inheritance ensures that when the call to title() occurs in /base.epl, the correct version of the method will be executed. Since /contact/index.html has its own version of that method, it will automatically be called instead of the base version. This allows each file to potentially redefine methods that were defined in /base.epl, and it works well. But, as your Web sites get bigger, you will probably want to split off some routines into their own files.
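The dispatch involved is ordinary Perl method resolution. Here is a plain-Perl sketch of the same idea, with two hypothetical packages (Site::Base and Site::Contact) standing in for /base.epl and /contact/index.html; how EmbperlObject maps files to packages internally is not shown.

```perl
use strict;
use warnings;

# Stand-in for /base.epl: defines the default title and the code that uses it.
package Site::Base;
sub new   { my ($class) = @_; return bless {}, $class }
sub title { return "Joe's Website" }

# Like base.epl, this calls title() without knowing which version will run.
sub render_head {
    my ($self) = @_;
    return '<TITLE>' . $self->title() . '</TITLE>';
}

# Stand-in for /contact/index.html: inherits everything, overrides title().
package Site::Contact;
our @ISA = ('Site::Base');
sub title { return 'Contacting Joe' }

package main;

print Site::Base->new->render_head, "\n";
print Site::Contact->new->render_head, "\n";
```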

EmbperlObject also allows us to create special files that contain only inheritable object methods. EmbperlObject can set up @ISA for us, so that the Perl object methods will work as expected. To do this, we need to access our methods through a specially created object rather than directly through the Request object (usually called $r or $req). This is best illustrated by the following example, which demonstrates the code that needs to be added to base.epl and shows how we implement inheritance via a subdirectory. Once again, assume that missing files such as constants.epl are the same as the previous example (Note that the 'object' parameter to Execute only works in 1.3.1 and above).

/base.epl

        <HTML>
        [- $subs = Execute ({object => 'subs.epl'}); -]
        [- Execute ('constants.epl') -]
        [- Execute ('init.epl') -]
        <HEAD>
        [- Execute ('head.epl') -]
        </HEAD>
        <BODY>
        [- Execute ('*', $subs) -]
        </BODY>
        [- Execute ('cleanup.epl') -]
        </HTML>

/subs.epl

        [!
                sub hello
                {
                        my ($self, $name) = @_;
                        print OUT "Hello, $name";
                }
        !]

/insult/index.html

        [-
                $subs = $param[0];
                $subs->hello ("Joe");
        -]

/insult/subs.epl

        [! Execute ({isa => '../subs.epl'}) !]

        [!
                sub hello
                {
                        my ($self, $name) = @_;
                        $self->SUPER::hello ($name);
                        print OUT ", you schmuck";
                }
        !]

If we requested the file /insult/index.html, then we would see something like:

        Hello, Joe, you schmuck

So what is happening? First, note that we create a $subs object in base.epl, using a special call to Execute(). We then pass this object to files that will need it, via an Execute() parameter. This can be seen with the '*' file.

Next, we have two versions of subs.epl. The first, /subs.epl, is pretty straightforward. All we need to do is remember that all of these subroutines are now object methods, and so take the extra parameter ($self). The basic hello() method simply says Hello to the name of the person passed in.

Then we have a subdirectory, called /insult/. Here we have another instance of subs.epl, and we redefine hello(). We call the parent version of the function, and then add the insult (``you schmuck''). You don't have to call the parent version of methods you define, of course, but it's a useful demonstration of the possibilities.

The file /insult/subs.epl has to have a call to Execute() that sets up @ISA. This is the first line. You might ask why EmbperlObject doesn't do this automatically; it is mainly for reasons of efficiency. Not every file is going to contain methods that need to inherit from the parent file, and so simply requiring this one line seemed to be a good compromise. It also allows for more flexibility, as you can include other arbitrary files into the @ISA tree if you want.
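In plain Perl terms, the isa call plays the role of an @ISA assignment. The sketch below shows the equivalent structure with two hypothetical packages; it returns strings rather than printing to Embperl's OUT handle, so that it runs outside Embperl.

```perl
use strict;
use warnings;

# Stand-in for /subs.epl.
package Subs::Base;
sub new { my ($class) = @_; return bless {}, $class }

sub hello {
    my ($self, $name) = @_;
    return "Hello, $name";
}

# Stand-in for /insult/subs.epl: the @ISA line is what the isa call sets up.
package Subs::Insult;
our @ISA = ('Subs::Base');

sub hello {
    my ($self, $name) = @_;
    # Call the parent version first, then add to its result.
    return $self->SUPER::hello($name) . ', you schmuck';
}

package main;

my $subs = Subs::Insult->new;
print $subs->hello('Joe'), "\n";
```

Without the @ISA line, the SUPER::hello call would fail at run time because the parent method could not be found, which is why the first line of /insult/subs.epl matters.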

Conclusions

So there you have it: an introduction to the use of EmbperlObject for constructing large, modular Web sites. You will probably use it to enable such things as Web site-wide navigation bars, table layouts and whatever else needs to be modularized.

This document is just an introduction, to give a broad flavor of the tool. You should refer to the actual documentation for details.

EmbperlObject will inevitably evolve as developers discover what is useful and what isn't. We will try to keep this document up-to-date with these changes, but make sure to check the Embperl Web site regularly for the latest changes.

This Week on p5p 2001/03/12



Notes

You can subscribe to an email version of this summary by sending an empty message to perl5-porters-digest-subscribe@netthink.co.uk.

Please send corrections and additions to perl-thisweek-YYYYMM@simon-cozens.org where YYYYMM is the current year and month. Changes and additions to the perl5-porters biographies are particularly welcome.

There were 424 messages this week.

Pod Questions

As reported last week, Michael Stevens has been working away on attempting to make the core Perl documentation podchecker-clean, and has succeeded in stopping it from emitting any errors. However, he came up with quite a few weirdnesses. The most contentious was the correct way to write:

     L<New C<qr//> operator>

since L<> was seeing the slash and thinking it was a section/manpage separator. Russ Allbery said that the best way was

    L<"New C<qr//> operator">

but the problem with that is that the resulting reference gets quoted. And, in fact, podchecker was still unhappy with that. Russ said:

podchecker complains about all sorts of things that I consider to be perfectly valid POD, such as the use of < and > in free text to mean themselves when not preceded by a capital letter. I think making podchecker smarter is the right solution.

But as Michael said, "the problem is finding a clear definition of what "smarter" actually is."

I also complained that

    =head2 New C<qr//> operator

was getting mangled by some parsers which didn't correctly restore boldface after the code section. The example I gave, pod2man, seemed to be due to a buggy set of roff macros.

Rob Napier came up with some truly excellent suggestions about the future of POD and how to make it more intuitive, and Russ tried to shoo people onto the pod-people mailing list for further discussion of what changes should be made.

Patching perly.y

Jeff Pinyan asked how one should go about patching the Perl grammar in perly.y; the answer, coming in four parts from myself, Peter Prymmer and Dan Sugalski, is:

1) Don't. You hardly ever need to.

2) Run make run_byacc which runs the byacc parser generator, and then applies a small patch to the resulting C file which allows dynamic memory allocation.

3) Run vms/vms_yfix.pl to patch up the VMS version of the parser.

4) CC perl-mvs@perl.org so that the EBCDIC people can prepare EBCDIC-aware versions of the parser.

CvOUTSIDE

Alan asked what CvOUTSIDE was for; it's another undocumented flag on a CV. Sarathy knows the answer, and it's scary:

Every CV that references lexicals from its outer lexical scopes needs to be able to access that outer scope's scratchpad at run time (via pp_anonsub(), cv_clone2() and pad_findlex()) to capture the lexicals that are visible at the time the cloning happens. In fact, all CVs need to have this whether they have outer lexicals referenced in them or not, given that eval"" requires visibility of the outer lexical scopes.

Hence, (I think) CvOUTSIDE is a pointer to the scratchpad of the outer lexical scope. Why is this important? Well, Alan's Great Subroutine Memory Leak (the problem with sub x { sub {} }) has come about because there's a reference count loop. As Sarathy explains:

The problem really is that there is a reference loop. The prototype anonymous sub holds a reference count on the outer sub via CvOUTSIDE(). The outer sub holds a reference count on the anonymous sub prototype via the pad entry allocated by OP_ANONCODE. The pad entry will be properly freed by op_clear() if it ever gets there, which it doesn't because of the loop.

Sarathy had a couple of attempts at fixing this, but hasn't managed to resolve it yet.
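The same kind of loop can be reproduced at the Perl level, which may make the problem easier to picture. This is only an analogy for the CV/pad loop described above, not the internals themselves and not Sarathy's fix; the Node class is invented for illustration, and weaken() comes from the Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

package Node;
our $destroyed = 0;                     # counts how many objects were freed
sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

{   # Two objects that point at each other: a reference loop.
    my $outer = Node->new;
    my $inner = Node->new;
    $outer->{inner} = $inner;
    $inner->{outer} = $outer;
}
print "after the loop:      $Node::destroyed freed\n";

{   # Same shape, but one link is weak, so the counts can reach zero.
    my $outer = Node->new;
    my $inner = Node->new;
    $outer->{inner} = $inner;
    $inner->{outer} = $outer;
    weaken($inner->{outer});
}
print "after the weak link: $Node::destroyed freed\n";
```

In the first block neither object's reference count can drop to zero when the scope ends, so nothing is freed until the program exits; in the second, both objects are freed as soon as the block is left.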

perlxstut Documentation

Vinh Lam reminded us that perlxstut is incomplete. Examples 6, 7, 8, and 9 are still not written. Does anyone out there want to write them?

EBCDIC and Unicode

With the assistance of Merijn Broeren and Morgan Stanley Dean Witter, I gained access to an EBCDIC mainframe and spent a happy day sanitizing the Unicode support on EBCDIC machines. As usual, there was some small argument over semantics, but the major change was that EBCDIC should be converted to ASCII before being upgraded to UTF8, and converted back to EBCDIC on degradation. Peter Prymmer seemed happy enough with what we'd been doing, and the patch went in. The patch, and its discussion, can be found here.

If you don't want to read the whole business, this is the important bit: much of the Unicode discussion this week centered on the vexed question of "What are v-strings for?". Here is the definitive answer from Larry.

PERL_DL_NONLAZY

Michael Schwern asked what the mysterious PERL_DL_NONLAZY environment variable was for - it's set on make test but never documented. He noted that as well as being used to alter the dynamic linking behaviour, it's used by some test suites to determine whether or not to produce additional information - almost certainly a misuse.

Paul Johnson explained that it passes a flag to dlopen which attempts to ensure that all functions are relocated as soon as the shared object is loaded. Sounds complicated? In the normal, "lazy" operation of the dynamic loader, the loader doesn't actually load all the functions from the library file into memory at one go - instead, it merely notices that it has a bunch more functions available; when a function is called, it loads up the appropriate part of the object into memory, and jumps to it. (Not entirely unlike the behaviour of use autouse or AutoSplit.)

Setting [PERL_DL_NONLAZY] forces the loader to load up all functions at once, so that it can ensure that it really does have code for all the functions it claims to have code for; this is usually what you want to do when testing.

Various

Sarathy fixed the "weird reset bug" of last week with a clever but untested patch; Chris Nandor dropped a bunch of good MacPerl portability patches. Ilya finally produced his rival UTF8 regular expressions patch, which Jarkko has been vigorously testing.

David Madison raised the my $var if $x bugbear again. Schwern's been cleaning up Test::Harness; good work as always, there. Robin Houston fixed a strange bug regarding my variables being cleared after a goto during a three-fingered for loop. Radu Greab fixed something strange with chop and arrays.

There was a small but pointless discussion of C coding styles, which concluded that you ought to leave off braces around single-statement blocks to if and the like if you can.

Tony Finch complained that use integer doesn't make rand return integers; Philip Newton provided a patch.

Congratulations to Raphael Manfredi, who spawned his first child process this week.

Until next week I remain, your humble and obedient servant,


Simon Cozens

Writing GUI Applications in Perl/Tk


"This article originally appeared in Visual Developer Magazine"

Perl is officially known as the "Practical Extraction and Report Language," in part because of its extremely robust text handling abilities. Perl's author, Larry Wall, has a much more colorful name for the language: the "Pathologically Eclectic Rubbish Lister." Many people are aware of Perl's role in the Web, specifically as an easy-to-use text processing language for writing CGI scripts. What many people don't realize, however, is that Perl is a powerful general-purpose programming language that can also be used for cross-platform GUI development with the Tk toolkit, originally developed for the Tcl programming language under Unix. An important advantage of using the pTk (Perl/Tk) combination is that you can write truly portable cross-platform GUI applications-- applications that will work similarly across Win32, Macintosh, Linux, and even the AS/400!

In this article, I will introduce the basics of installing the Perl interpreter for Win32 and writing a visual application using the Tk (toolkit) modules. This system is geared toward the Win32 and Linux developers; however, most of the information presented pertains to other operating systems as well.

A Point-of-Sale Terminal in pTk

My expertise lies in electronic commerce. So when I decided to write this article, I naturally looked around to see what might be useful to others using the pTk system. One project I've worked on recently is integrating the PaymentNet credit-card processing client software into various applications. I decided that it would be great to have a system that could do all my credit-card processing from my local PC, for testing as well as for real e-commerce. So I created a pTk Point-of-Sale (POS) terminal program to do so. For many merchants, it is less expensive (and faster) to use the Internet to process their credit card (or check) transactions than to purchase additional software or terminals.

Figure 1: The terminal under Windows 95.

The example system uses a simulated credit-card authorization module that does nothing except verify that the transaction could be a valid one. Real transactions can be implemented quite easily; I'll show you how at the end of the article. Of course, the code works with any pTk system. You can see it running under both Windows 95 (Figure 1) and Linux (Figure 2).

Figure 2: The terminal under Linux.

We will cover a lot of ground very quickly in this article, and of necessity, will gloss over quite a few very important points. Perl comes with some of the best documentation available in the form of POD (plain old documentation) files, and a search/viewing program called perldoc. Whenever you get stuck, chances are the answer to your questions will be right on your local computer. For now, you can find more information by typing perldoc perldoc from the command line-- once Perl is installed, of course! I'll remind you of this a couple of times.

First, we'll need to install both Perl and the Tk libraries. If you are using Linux, this will mean compiling the libraries as well, though that is not at all difficult. Then we'll look at what it takes to make a pTk program: program flow, geometry management, event and variable bindings, and the event loop. I'll then do a brief walk through the code, showing examples of various widget (object) use within the system. You should then be well on your way to being an expert pTk programmer!

Installing ActiveState Perl for Win32

To install the latest version of Perl for Win32, go to the ActiveState download page (www.activestate.com/Active-Perl/download.htm). For Windows 98 and Windows NT, there are no special instructions; download the latest version (APi517e.exe as of this writing), and run the self-installer by double clicking on the file. For Windows 95, however, you need to make sure that you have DCOM installed (it's already installed on Windows 98 and NT machines). The DCOM files and installation instructions are available from Microsoft's Web site at www.microsoft.com/com/dcom/dcom95/dcom1_3.asp.

Installing the Perl/Tk Modules with Perl Package Manager

ActiveState extended the standard CPAN modules (CPAN being the Comprehensive Perl Archive Network, which we will discuss shortly) with its own Perl Package Manager (PPM). The package manager makes installing and configuring the modules extremely easy. Note that the primary difference (other than user interface) between Perl's standard CPAN module and the PPM is that CPAN deals exclusively with source code, requiring that you compile and install both Perl and the relevant modules from a source code distribution. ActiveState's package manager, on the other hand, deals with pre-compiled modules that install on the host platform without compiling.

Assuming your Perl has been installed properly, you should now be able to go to the DOS command prompt and run the Perl Package Manager by typing PPM. You can get summary help information by typing "PPM -h". To use the PPM, you must be connected to the Internet. More PPM documentation can be found at www.activestate.com/ppm. You can also (as always) type "perldoc PPM" to get the documentation that comes with the module distribution.

To download and install a single package, just type "install tk" from the PPM prompt. Be aware that the Tk module, a complete GUI development environment for Perl, is pretty big. The zip file on which the package is based is over 2MB in size, so the download can take a while, especially over a standard modem. Go ahead and install Tk, as it is required for the rest of this project. Also, if you have an older ActiveState Perl distribution (Perl5.004 or earlier), I suggest upgrading to the latest install now. If you don't, the example program may not run. Specifically, you will need Data::Dumper installed, which you can load by typing "install Data-Dumper" from the PPM prompt, if you don't already have it.

Installing Perl and Tk Under Linux

One of Perl's great strengths is that there are hundreds of individual modules available for the language. All current builds of Perl come with a module-manager to be used with the Comprehensive Perl Archive Network (CPAN). Currently, CPAN lists over 260MB of source code! Windows users can also use CPAN if they've taken the time to compile their own version of Perl, rather than using ActiveState's binary distribution.

For my Linux desktop, I use the K Desktop Environment (KDE) that comes with the Caldera Open Linux 2.2 distribution. To run the system, make sure that you have both Perl and the gcc development environment installed. The following minimal packages are needed to install and run the system:

  • Perl
  • egcs
  • glibc
  • XFree86-devel (For X-Windows header files)

To verify that the packages are installed (or to install them if they aren't):

  • Make sure that the Open Linux CD is in the drive.
  • Log in as "root".
  • Click on the icon that looks like a house, labeled "Caldera Open Administration System" (COAS).
  • Select "Software".
  • When the window comes up, select "Workstation | Administration | Software".
  • Make sure that "kpackage" is checked.

This will install the K package manager if it isn't already installed. Now click "K | Utilities", and then select "kpackage". The kpackage tree structure is a little easier to use and provides more information than the COAS, though they both manage distribution packages. If you are missing one of the packages above, simply select new from within the kpackage tree, select the package you want to install, and then click "examine" in the bottom right panel. A new dialog will pop up, allowing you to install the selected package.

Installing and Building Tk Using CPAN

To install and build the Tk libraries and Perl interface, it is necessary to execute the CPAN module. Open a command window (while still logged in as root), and execute the following command:

perl -MCPAN -e shell;

Perl will load the module CPAN (-MCPAN), and execute (-e) the shell subroutine contained in the module. If this is your first time executing the script, it will ask you configuration questions. If you are unsure of the answers, just hit Enter. In most cases, the defaults are appropriate for your system. Eventually, you will be dropped into a "cpan>" prompt. At this point, you can type "h" for help, or go directly to the next step. To finish configuration, there are a few optional modules you should install just to make things a little easier. (If you are on a slow connection, you might want to skip installing the optional modules steps.)

Install Internet communications modules (optional):

cpan> install Bundle::libnet

Upgrade CPAN (optional):

install Bundle::CPAN
reload CPAN

Now, build and install Tk:

install Tk

This should download, build, and install the Tk packages from the Internet. It's not really any more difficult than using the PPM; however, CPAN does require that a C compiler be installed, which isn't the norm for Windows users.

Now that both Perl and Tk have been installed in your development environment, spend some time exploring the system. The toolkit comes with a few extremely useful example programs. A simple text editor called ptked is available, as well as ptksh (a GUI shell where you can experiment creating forms and other controls directly within the pTk environment, which is reminiscent of the BASIC command line interface). The widget program is a comprehensive demonstration and testing shell for-- you guessed it-- widget exploration. It's a really fun tool to play with, one that can get you into programming with pTk very quickly. These utilities are available on all systems with pTk installed.

pTk Core Concepts

To understand a pTk program, we need to explore a few core concepts regarding the system design. In particular, you need to be familiar with the structure of the program, the way objects (widgets) are laid out on the screen (geometry management), the way communication is handled (variable and event binding), and the pTk event loop.

All pTk programs are assembled in pretty much the same way:

1) Create a main window, which is also known as the top-level window.

2) Build a group of widgets (Unix-speak for controls), and arrange them inside the main window. In Perl, a widget is simply an object that contains data and methods to create some visible element of the user interface.

3) Start the event loop. Events are then fired, and handled by the widgets and associated code.

Geometry Management

In a fashion similar to Java, Tk has a notion of "geometry management," which is a fancy way of saying that the software wants to decide where to put your controls for you. Basically, you define the widget, and then tell the system about where you'd like it to go (top, bottom, etc.), and it takes care of sizing and placing the widget for you. We will use the "packer" and "grid" exclusively; however, others are available for different layouts. Since each frame can have its own geometry manager, extremely sophisticated placement schemes can be created using this system.

I've used the grid() manager to divide the example application into four frames: $filemenu (at the top), $left, $right, and $bottom. The grid manager works like HTML tables: you can specify the row and column for each widget, as well as a columnspan and rowspan, if necessary. The "sticky" option tells the manager which sides of its cell to "attach" the widget to, the options being north, south, east, and west in any combination. Specifying all four stretches the widget to fill its cell; leaving the option out centers the widget.

my $left  = 
 $mainwindow->Frame->grid(-row => 0, -column => 0, -sticky => 'nw');

Keep in mind that a widget won't be displayed until its geometry manager is called. This can be useful to keep controls hidden until you're ready to use them; however, it can also be a source of errors. If you have created a widget but can't see it on the screen, chances are you've forgotten to set its manager.

Binding Variables

Variable binding is one of the most frequently used data input/output mechanisms available to pTk programs. The concept is relatively simple: a variable is bound to a control, and when the state of that control changes, the variable is updated to reflect the new state. The reverse is true as well: changing the state of the variables within the program automagically updates the associated controls. You can see examples of this throughout the source code. Note how saveConfig just dumps the state of the $config hash, and loadConfig does the opposite; the NoteBook control is updated to reflect the state changes without additional work. This is usually accomplished by passing the -variable or -textvariable option to the widget, as well as a reference to the scalar variable you want bound:

$left->Optionmenu (-options => \@trxtype, 
   -variable => \$trans->{TRXTYPE} )->pack();

This statement binds the variable $trans->{TRXTYPE} to a select box. Whenever you update the transaction type on the program, the variable changes to reflect the change. Note that the geometry manager (pack() in this case), is called to place the widget on its associated frame and to make it visible. This is a pattern that you will use quite a bit throughout pTk programming, and is central to the event model.

Binding Events

Most action widgets have an optional parameter, "-command," that allows you to specify the function that should be called whenever the widget is acted upon in some way. This is a reference to a callback function that will be called when the widget receives an action event: a button click, a scrollbar release, etc.

$bottom->Button(-text => 'Process Transaction', 
        -command => \&processTransaction )
        ->grid(qw/-row 2 -column 0 -sticky nesw/);

In this case, the processTransaction function is called whenever the button is pressed. For most standard programs, this is the extent of the event management required, when combined with the bound variables described in the previous section.

It is also possible to bind additional events to subroutines by using the bind() call. The general format for doing so is:

$widget->bind(event, subroutine);

When an event related to your widget fires, the subroutine that has been bound to that event is called. Possible bound events include key presses and releases, mouse motion, and window resizing. More information can be found using perldoc Tk::bind.

The Event Loop

Once everything is set up, the only thing left to do is call MainLoop() from within the program. This enters a loop that dispatches events from the underlying operating system to your application and updates the appropriate bound variables; it keeps running until the application exits or its last window is destroyed.

MainLoop();  # returns only when the last window is destroyed

Just as in the pre-Win32 days when we used the cooperative multitasking of Windows 3.1 and text-based frameworks like Turbo Vision and other non-multitasking systems, you must break up any CPU-intensive tasks into manageable chunks by using the after() or repeat() method of most widgets, which are timers used to schedule events. Other methods may work equally well.
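The chunking approach can be sketched as follows. This is a minimal example, not code from the article's listings: the job (summing a list), the chunk size, and the names work_chunk, @queue and $total are all invented for illustration. Each call does a small slice of work and then uses after() to schedule the next slice, so the event loop gets a chance to redraw the window in between.

```perl
use strict;
use warnings;
use Tk;

my $mw     = MainWindow->new;
my $status = 'Starting...';
$mw->Label(-textvariable => \$status)->pack;

my @queue = (1 .. 50_000);    # stand-in for a long-running job
my $total = 0;

sub work_chunk {
    # Process at most 1000 items, then hand control back to the event loop.
    for (1 .. 1000) {
        last unless @queue;
        $total += shift @queue;
    }
    if (@queue) {
        $status = scalar(@queue) . ' items left';
        $mw->after(10, \&work_chunk);              # schedule the next slice
    }
    else {
        $status = "Done: total $total";
        $mw->after(1000, sub { $mw->destroy });    # close the demo window
    }
}

$mw->after(10, \&work_chunk);
MainLoop;
```

The demo closes its own window a second after finishing, which lets MainLoop return; a real application would normally leave the window open. This technique only helps with work you can split up yourself; a single blocking network call, like the one described next, still freezes the interface.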

This time-slicing problem crops up when doing a real transaction over the Internet. The system seems to "freeze" for a few seconds while processing the data, as the call to do so blocks until the data has been received. Unfortunately, this is normal behavior as the current release version of Perl is not yet threaded. However, the newest development version of Perl does incorporate a threaded architecture, so you can expect this limitation to be removed shortly. I noticed that the Perl shipped with the Caldera Open Linux 2.2 distribution has been compiled multi-threaded, though I haven't tried using the new threading features yet.

Creating the Application

Dozens of widgets are available for use in your programs. For our application, I only used a few: Menubutton (for the File menu), Optionmenu (a drop-down select box), LabEntry (a text box with related label), and NoteBook, for the configuration control.

If you look through the source code, you will see the general outline of the way things are done in most pTk applications. First, I declare all the global structures that I will use. This is important, as they will need to be within the scope of the callback routines used. (See Listing 1.) I then declare a MAIN block for the program. (See Listing 2.) Within the main program, we first create the main window ($mainwindow). Every widget will attach (at some level) to this window:

$mainwindow = MainWindow->new();

Nothing fancy here. It's best to keep your handle in a global variable, as most of the program will need access to it.

Creating the MenuBar

We then create a frame for the menu bar. A frame keeps parts of the program together logically, and for geometry management, lays out your controls on the screen. If you've used Java, the concept will be familiar. Before a widget can be seen on the screen, you must call a manager to lay it out for you. Each frame can only have one type of manager for its widgets; however, each frame can have multiple frames, making possible quite complex schemes.

The $menubar consists of a single menu button, $filemenu. Of course, more are possible in larger applications. Attached to $filemenu are the commands you will see in the drop-downs, each specifying the label and a command (that is, a bound subroutine). Additional options within the menu bar widget allow you to attach colors, checkboxes, and hotkeys to the menu items as well.

Menus under pTk are lists of actions with associated commands, usually contained within a frame. First, you need to create the menu bar frame:

$menubar = 
 $mainwindow->Frame()->pack(-side => 'top', -fill => 'x');

After creating the frame, we'll add one button to it:

my $filemenu = $menubar->Menubutton(-text => 'File')->pack(-side => 'left');

Multiple top-level items would be added the same way. Now we need to add a few commands to the menu:

$filemenu->command( -label  => 'Open Config',
                 -command => \&loadConfig );
$filemenu->command( -label  => 'Save Config',
                 -command => \&saveConfig );
$filemenu->separator();
$filemenu->command(-label  => 'Configuration...',
                 -command => \&doConfig );
$filemenu->separator();
$filemenu->command(-label => 'Exit', -command => sub {exit;} );

Note the way commands are bound to the widgets. Whenever a user selects an item, the bound command is executed.

Building the Main Window

Within the main window, I've added three frames in addition to the menu bar: one on the left, for data input; one on the right, for data output; and a third on the bottom for the status lines and the "Process Transaction" button.

my $left  = $mainwindow->Frame->grid(-row => 1,
                    -column => 0,
                    -sticky => 'nw');
my $right = $mainwindow->Frame->grid(-row => 1,
                    -column => 1,
                    -sticky => 'nw');
my $bottom = $mainwindow->Frame->grid(-row => 2,
                    -column => 0,
                    -columnspan => 3,
                    -sticky => 'nw');

The left frame is first populated with a drop-down list (called an Optionmenu), with the various transaction types available:

$left->Optionmenu(
        -options => \@trxtype,
        -variable => \$trans->{TRXTYPE},
        )->pack(-side => "top", -anchor => "nw");

The -options parameter is a reference to an array (defined globally; refer to Listing 1) of accepted transaction types. We've seen this before when talking about bound variables. Remember that the variable $trans->{TRXTYPE} will always reflect the state of the Optionmenu, and we can change the current selected item programmatically by simply reassigning a value to the bound variable. We also create an Optionmenu for the card-type; however, it isn't used in the current implementation. The card type can be determined from the actual card number, so isn't really required except that users are used to having it available.
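As a rough illustration of how the card type can be worked out from the number, here is a small sketch. It is not part of the article's program: the card_type function is invented here, and the prefixes are the classic published ranges in deliberately simplified form, so treat the table as an assumption rather than a complete rule set.

```perl
use strict;
use warnings;

# Guess the card brand from the leading digits of the card number.
sub card_type {
    my ($number) = @_;
    (my $digits = $number) =~ s/\D//g;    # drop spaces, dashes, etc.
    return 'Visa'             if $digits =~ /^4/;
    return 'MasterCard'       if $digits =~ /^5[1-5]/;
    return 'American Express' if $digits =~ /^3[47]/;
    return 'Discover'         if $digits =~ /^6011/;
    return 'Unknown';
}

print card_type('4111 1111 1111 1111'), "\n";
print card_type('3400-000000-00009'), "\n";
```

A result from a function like this could be assigned to the variable bound to the card-type Optionmenu, which would update the control without any further work.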

We then populate the rest of the frame with LabEntry widgets. These little gizmos are fantastic, housing commonly used labels, entry fields, and bound variables in one convenient widget.

$left->LabEntry(-label => "Name",
   -labelPack => [-side => "right", -anchor => "w"],
   -width => 20,
   -textvariable => \$trans->{_NAME})->pack(-side => "top", 
   -anchor => 'nw');

As with every other widget, you must call the geometry manager to make it visible. It is also pretty obvious that the LabEntry widget is calling its own geometry manager behind the scenes for us.

The right-hand frame is populated with a series of Label widgets for displaying the output of the transaction. Note the grid manager used to place the labels in a column. I've also made sure to bind the second Label in each row to a variable, to be used for updating the state of the transaction later.

$right->Label(-text => 'PNRef')->
        grid(-row => 0, -column => 0,-sticky => 'nw');
$right->Label(-textvariable => \$results->{PNREF})->
        grid(-row => 0, -column => 1,-sticky => 'nw');

The bottom frame is no more complicated, and consists of only three widgets: two LabEntry widgets for showing the state of the transaction (both sent and received) and a button to actually perform the processing.

$bottom->Button(-text => 'Process Transaction',
         -command => \&processTransaction )->
         grid(qw/-row 2 -column 0 -sticky nesw/);

The button is bound to the processTransaction routine, which is straightforward. (See Listing 3.) The only thing left to do is to start the event processing, which should look familiar by this time:

MainLoop();  # Start the event processing

Saving and Loading the Configuration

The system binds "File | Open Config" and "File | Save Config" to the loadConfig() and saveConfig() subroutines, respectively. To configure the file dialog boxes, each routine needs a list of the file types to accept:

my @types = (["Config Files", '.pcg', 'TEXT'],
       ["All Files", "*"] );

This creates the standard "Save As" dialog box. (See Figure 3.) You can add as many file types as you'd like. To actually execute the dialog, simply call getSaveFile:

my $file = $mainwindow->getSaveFile(-filetypes => \@types,
      -initialfile=> 'Default',
      -defaultextension => '.pcg');

Figure 3: The Save As dialog.

You can specify the initial file name for the dialog, as shown above. The name of the file the user selected is returned, or an undef value if the "cancel" button was pressed. If the file exists, a second confirmation dialog ("Are you sure you want to overwrite?") is executed. Only if the user answers affirmatively will the filename be returned. Using the Open dialog is similar; just call getOpenFile instead of getSaveFile.

Once we have the filename, we can then save and load the configuration. We are using an extremely simple file format, basically a mini-Perl program, thanks to the Data::Dumper module. This is where the interpreted nature of Perl shines. To write out the file, we open it, print the text provided by Dumper, and then close it. Reading it back in is simple as well. Slurp up the file, and then use the eval function to interpret the file. Be aware that this format stores all configuration information in plain text on the user's drive, so it may not be suitable for use in an open environment.
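The save and load steps might look something like the sketch below. It is not the article's Listing code: the names save_config and load_config, the file name example.pcg, and the HOST key are assumptions made for illustration (PORT appears in the article's own code). Note that eval runs whatever Perl is in the file, so only load files you trust.

```perl
use strict;
use warnings;
use Data::Dumper;

# Write the configuration hash out as Perl source text.
sub save_config {
    my ($file, $config) = @_;
    local $Data::Dumper::Indent = 1;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} Data::Dumper->Dump([$config], ['config']);
    close $fh or die "Cannot close $file: $!";
    return;
}

# Slurp the file and eval it to rebuild the hash.
sub load_config {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $text = do { local $/; <$fh> };    # read the whole file at once
    close $fh;
    my $config;                           # the dumped text assigns to $config
    eval $text;
    die "Bad config file $file: $@" if $@;
    return $config;
}

my $file = 'example.pcg';
save_config($file, { HOST => 'test.paymentnet.com', PORT => 443 });
my $loaded = load_config($file);
print "$loaded->{HOST}:$loaded->{PORT}\n";
unlink $file;
```

Passing the name 'config' to Dump makes the file read "$config = { ... };", which is why the loader declares a variable of that name before calling eval.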

The Configuration Dialog

Figure 4: The Configuration dialog

One really cool widget in the pTk system is the NoteBook (see Figure 4). I've used it to implement the simple configuration dialog bound to the "File | Configuration" menu item. The doConfig() callback subroutine actually implements the system. The strategy is to create a structure ($config) to hold all the relevant information. Within doConfig, we first copy this structure to a local structure, so that changes won't be saved if the user selects "Cancel" when editing the fields. To create the actual dialog, we ask $mainwindow to do it for us:

$f = $mainwindow->DialogBox(-title => "Configuration", 
              -buttons => ["OK", "Cancel"]);

To add the NoteBook widget to the DialogBox (which is really just a fancy Frame), use:

$n = $f->add('NoteBook', -ipadx => 6, -ipady => 6);

The -ipad options tell the system how much internal padding to leave around the widgets contained within the box. To actually use the NoteBook, you must now add pages to it:

my $vendor_p = $n->add("vendor", 
         -label => "Vendor ID", -underline => 0);
my $host_p  = $n->add("host", 
         -label => "Host", -underline => 0); 
my $proxy_p = $n->add("proxy", 
         -label => "Proxy", -underline => 0);

This adds our three pages, named "vendor," "host," and "proxy." You can treat each page as a regular frame within your dialog. The rest of the method's code sets up the entry boxes we need:

$host_p->LabEntry(-label => "Port:       ", 
         -labelPack => [-side => "left", 
         -anchor => "w"], 
         -width => 20, 
         -textvariable => 
         \$localconfig->{PORT})->pack(-side => "top",
         -anchor => "nw");

Again, we use the LabEntry widget to make a labeled text-entry box. To execute the dialog, once it's set up, is a one liner:

$return = $f->show;

The return value is the text of the button that the user pressed to end the dialog. If it is "OK", we copy our temporary variables back to the $config hash, effectively updating our configuration.
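The copy-then-commit strategy is independent of Tk and can be sketched without any widgets. In this illustration the edit_settings function is a hypothetical stand-in for the dialog: it changes a value and reports which button was "pressed". The HOST key and the new port number are likewise made up.

```perl
use strict;
use warnings;

my $config = { HOST => 'test.paymentnet.com', PORT => 443 };

# Stand-in for the dialog: edits the copy it is given and returns
# the label of the button that ended it.
sub edit_settings {
    my ($settings, $button) = @_;
    $settings->{PORT} = 8443;
    return $button;
}

sub do_config {
    my ($button) = @_;
    my $localconfig = { %$config };                  # shallow working copy
    my $return = edit_settings($localconfig, $button);
    %$config = %$localconfig if $return eq 'OK';     # commit only on OK
    return $return;
}

do_config('Cancel');
print "after Cancel: $config->{PORT}\n";
do_config('OK');
print "after OK:     $config->{PORT}\n";
```

A shallow copy is enough here because the configuration is a flat hash of strings; nested structures would need a deeper copy.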

Using the System

That just about covers the example program, which is available electronically in the listings archive for this issue. To run the program under Windows, double-click on the file name (ptkpos.pl) from Windows Explorer. The ActiveState installation program for Windows associates the .pl extension to Perl by default and runs the Perl interpreter when you double click on a .pl file. Under Linux, you first need to make the program executable.

chmod +x ptkpos.pl

You can then use the K file manager and click on the program, or execute it from the command line directly.

Live Online Credit Card Authorizations

The virtual terminal presented here uses a "stub" simulation module to do credit-card authorizations. This module simulates the transaction process by providing data that you could expect to receive if you were to send the data to a real validation system. The simulator is very basic: it does not check the vendor ID and password or the card expiration date, and it does not even ensure that all required information is present. It does, however, check whether the credit card number could belong to a valid card: if so, it returns an authorization code; otherwise, the transaction is declined. The simulator also declines any transaction type other than an "authorization" or a "sale."
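The article does not say which test the simulator applies to decide whether a number "could be a valid card." A common choice for this kind of plausibility check is the Luhn checksum, so here is a minimal sketch of it, offered as an assumption about what such a check might look like rather than as the simulator's actual code. The function name luhn_ok is hypothetical.

```perl
use strict;
use warnings;

# Return 1 if $number passes the Luhn checksum, 0 otherwise.
sub luhn_ok {
    my ($number) = @_;
    (my $digits = $number) =~ s/[\s-]//g;    # tolerate spaces and dashes
    return 0 unless $digits =~ /^\d{2,}$/;

    my ($sum, $double) = (0, 0);
    # Walk from the rightmost digit, doubling every second one.
    for my $digit (reverse split //, $digits) {
        my $value = $double ? $digit * 2 : $digit;
        $value -= 9 if $value > 9;           # same as adding the two digits
        $sum += $value;
        $double = !$double;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

print luhn_ok('4111 1111 1111 1111') ? "plausible\n" : "declined\n";
```

A passing checksum only means the number is well formed; it says nothing about whether the account exists, which is why a real gateway is still needed for live transactions.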

To use the application for real transactions, you must replace the stub with an active link to a real online gateway system. The program will automatically detect a file named call_pfpro.pl, re-evaluate it to replace the simulator, and transactions will then go to a real gateway.
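The detect-and-replace step might look something like the following sketch. It assumes the stub is an ordinary subroutine named call_pfpro in package main and that the optional file redefines that same subroutine; the helper name load_live_gateway and the stub's return string are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Stand-in for the article's simulator (the return value is made up).
sub call_pfpro { return 'simulated response' }

# If $file exists, evaluate it so that its call_pfpro() replaces the stub.
sub load_live_gateway {
    my ($file) = @_;
    return 0 unless -e $file;
    # "do" searches @INC for bare relative names, so use an absolute path.
    my $path = File::Spec->rel2abs($file);
    my $ok   = do $path;
    die "Could not load $file: " . ($@ || $!) . "\n" unless $ok;
    return 1;
}

my $mode = load_live_gateway('call_pfpro.pl') ? 'live' : 'simulated';
print "Gateway mode: $mode\n";
```

Because Perl looks up a subroutine by name each time it is called, code written against the stub picks up the replacement without any other changes.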

To configure the system, select File | Configuration from the menu bar. Fill in the User and Password fields as provided by PaymentNet, or leave them blank to use the simulator. For testing, leave test.paymentnet.com in the Host entry. Port 443 is the default and should never need to be changed. If you are processing transactions from behind a firewall (always a good idea when doing electronic commerce), you need to set the Proxy settings. Consult the PaymentNet documentation for full details on how to do this.

The application supports only a limited subset of the PaymentNet system. In particular, PaymentNet supports various forms of check processing, which the application does not handle. It should be possible to support other payment gateways relatively easily, though doing so is not currently planned.

You can find more information on how to do so, as well as the payment-transaction module, at the Commerce-Store.com Web site, www.commercestore.com/developers/VDM.

Further Exploration

We've looked at the pTk event model, a couple of the more common widgets, and a slice of Internet-enabled electronic commerce. As Internet access becomes more common, tools that access the Net from a desktop can inexpensively replace more traditional systems, like leased-line or dial-up credit-card terminals, as we've done here.

There are quite a few exploration tools available for both Perl and pTk. I encourage you to take a look at some of the Web sites presented here, as well as the example programs that come with the distribution.

Nick Temple is an entrepreneur who has recently relocated to Silicon Valley to pursue the startup dream. The founder of The Temple Group, Ltd. and CommerceStore.com, he welcomes open discussion directed to ntemple@commercestore.com.

Additional information not part of original article:

Since the article was written, a number of things have changed in the online payments world. PaymentNet changed its name to Signio and was then purchased by Verisign; you may know it now as Verisign Payment Services. To "upgrade" the pTkPOS system presented in the article to handle live Internet transactions, follow these steps:

  1. Download and install the source code to the article
  2. Click here to register with Signio for a test account. It is free, and can be converted to a live account at a later date, if you so choose. This may take up to an hour to activate. <plug> Note that CommerceStore.com (my company) is a reseller for Verisign Payment Services, so if you wish to purchase the service I'd appreciate the business. </plug>
  3. Download the latest version 2 of the VPS client. You must log into your VPS manager at https://manager.signio.com. You will need to install pfpro.exe (for NT) or pfpro (for Linux) into the same directory as the article source code.
  4. Copy & paste the call_pfpro.pl code below into a file named call_pfpro.pl, in the same directory where you've installed the article's source code. Modify line 7 by changing it to the full path name of your pfpro binary.
  5. Follow the instructions in the article to install & configure Perl and the application. If you don't have a copy of the magazine, it can be found at your favorite magazine retailer. The complete text will be made available online late January, 2000. If you cannot find the magazine at your local retailer, please e-mail support@commercestore.com so that we can set you up with one.
  6. call_pfpro.pl 
    
    ################
    # call_pfpro #
    ################
    # Verisign Module
    # NOTE: Modify the following line by using the full path to your PFPRO
    #       executable file = pfpro on Linux or pfpro.exe on NT.
    
    my $pfpro = "pfpro";
    my $id ="0";
    sub call_pfpro {
      my $host         = shift;
      my $port         = shift;
      my $data         = shift;
      my $timeout      = shift;
      my $proxyaddress = shift;
      my $proxyport    = shift;
      my $proxylogin   = shift;
      my $proxypass    = shift;
      
      my $cmd = "$pfpro $host $port $data $timeout $proxyaddress $proxyport $proxylogin $proxypass";
      return `$cmd`;
    }
    
    1;
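The listing above builds one command string and runs it with backticks, so every argument passes through the shell. As a hedged alternative, not part of the original module, the same positional arguments can be handed to the program as a list so that no shell is involved. The names run_without_shell and call_pfpro_list are hypothetical, and the sketch assumes pfpro accepts exactly the arguments shown in the listing.

```perl
use strict;
use warnings;

my $pfpro = 'pfpro';    # full path to your pfpro binary, as in the listing above

# Run a command without a shell and return everything it prints.
# (List-form pipe open works on Unix; on Windows it needs a recent Perl.)
sub run_without_shell {
    my @cmd = @_;
    open(my $out, '-|', @cmd) or die "Cannot run $cmd[0]: $!\n";
    local $/;                      # slurp the whole response at once
    my $response = <$out>;
    close $out;
    return $response;
}

# Same arguments as call_pfpro; empty proxy settings are dropped,
# which is what the shell did implicitly in the backtick version.
sub call_pfpro_list {
    my ($host, $port, $data, $timeout, @proxy) = @_;
    my @args = grep { defined $_ && length $_ }
               ($host, $port, $data, $timeout, @proxy);
    return run_without_shell($pfpro, @args);
}
```

With the list form, characters such as & or ; inside the transaction data are passed to pfpro literally instead of being interpreted by the shell.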
    

    What Next?

    You should now be able to play around with live transaction processing, albeit in test mode. Look around the rest of the CommerceStore.com website - we offer assistance obtaining Internet-enabled merchant accounts, E-Commerce products and services, as well as consulting. Be sure to mention that you saw the article on Perl.com.

This Week on p5p 2001/03/05



Notes

You can subscribe to an email version of this summary by sending an empty message to perl5-porters-digest-subscribe@netthink.co.uk.

Please send corrections and additions to perl-thisweek-YYYYMM@simon-cozens.org where YYYYMM is the current year and month.

We're trying something new this week, and providing brief biographies of some of the more prolific porters. If you have anything to add to this, or object to something I've said about you, email me as above.

Locale Support

Andrew Pimlott filed a bug report related to POSIX::tolower; basically, it is not as locale-aware as Perl's own lc. He also found that lc failed to be locale-aware while in the debugger. Nick Ing-Simmons pointed out that use locale is lexically scoped, and the debugger is in a different scope, meaning that it won't pick up on the pragma. Andrew thought this was probably a bug (the debugger not assuming the debuggee's scope) but it is unclear as to how that could be fixed. Nick wondered if the first was due to not calling setlocale, but Andrew reported that this didn't help anything.
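To make the scoping point concrete, here is a small sketch. The two function names are hypothetical, and the actual results for non-ASCII characters depend on your system's locale and your Perl version, so no output is shown.

```perl
use strict;
use warnings;

# Compiled WITHOUT the locale pragma in effect.
sub lower_plain { return lc $_[0] }

{
    use locale;    # in effect only until the closing brace
    # Compiled WITH the locale pragma in effect.
    sub lower_with_locale { return lc $_[0] }
}

{
    use locale;
    # The pragma here affects lc() calls written in this block, but it does
    # not reach into the body of lower_plain(), which was compiled elsewhere.
    print lower_plain("ABC"), " ", lower_with_locale("ABC"), "\n";
}
```

The debugger is in the same position as lower_plain(): its code is compiled in its own scope, so it never sees the pragma of the program being debugged.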

Andrew then went digging around in POSIX.pm and found that isalpha is perfectly locale aware, but tolower is not - this is because isalpha is written in XS, but tolower simply calls lc; because it's in a different lexical context, it doesn't pick up on the use locale. It transpires that XS code does execute in the same lexical context as the caller, which is quite strange. Andrew pointed out that there's no way to make a pragma dynamically scoped. He said:

I think this raises some fundamental issues, but I'm not sure exactly which. It seems clear that one would like to be able to write a correct tolower (ie, exactly equivalent to lc, as per the POSIX documentation) in pure Perl. One possibility is a TCL-like "uplevel", but I desperately hope that doesn't turn out to be the best option.

I asked why POSIX functions were being implemented in Perl instead of C, and Andrew replied that in some cases it already works merely by magic: Perl doesn't correctly turn on and off locale support lexically, so some functions inherit the support for free. Jarkko grumbled about locales in his customary manner, and said he'd take a look at the areas which needed setlocale calling, but then Andrew had a revelation:

Hmm, looks like I missed an essential point: locale support is not dependent on 'use locale' at all! This seems to be intentional. perl unconditionally calls setlocale() on startup, and never calls it again (unless you use POSIX::setlocale() explicitly). So POSIX::isalpha() respects $LANG by default, even if you never mention the locale pragma.

It is only where (core) perl has a choice between calling a locale-sensitive libc function, and doing things its own way (eg, hard-coding character semantics), that the locale pragma currently matters. Since the string value of $! requires calling a locale-sensitive function (strerror()), $! always respects locale.

And you wonder why Jarkko throws his hands in the air at the mention of anything locale-related...

Coderef @INC

Nicholas Clark provided a patch which extended the little-known coderef-in-@INC feature to allow passing an object; if you pass an object instead of a coderef, the INC method will be called on it. This has allowed him to create an experimental pragma, ex::lib::zip, which lets you put a module tree inside a ZIP archive and have Perl extract the modules it needs from it.
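A minimal sketch of the object form, as the feature is documented in later Perl releases, might look like this. It serves module source from memory rather than from a ZIP archive, and the package names Hypothetical::Loader and Hypothetical::Greeter are invented for the example.

```perl
use strict;
use warnings;

package Hypothetical::Loader;

sub new {
    my ($class, %sources) = @_;
    return bless { sources => {%sources} }, $class;
}

# The method name must be fully qualified: an unqualified "sub INC"
# is always forced into package main.
sub Hypothetical::Loader::INC {
    my ($self, $file) = @_;
    my $source = $self->{sources}{$file};
    return unless defined $source;    # not ours: Perl keeps searching @INC
    open my $fh, '<', \$source or die "Cannot open in-memory source: $!";
    return $fh;
}

package main;

push @INC, Hypothetical::Loader->new(
    'Hypothetical/Greeter.pm' =>
        "package Hypothetical::Greeter;\nsub hello { return 'hello' }\n1;\n",
);

require Hypothetical::Greeter;
my $greeting = Hypothetical::Greeter::hello();
print "$greeting\n";
```

An object has one advantage over a bare coderef: it can carry its own state, such as an open archive handle, without relying on closures or globals.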

He then also explained what it was all about, in the hope that someone would write some proper documentation. Nobody did so (my fault, I promised to but didn't get around to it), but his extremely helpful explanation of the coderef-in-@INC API, and the cheap source filter API it allows, can be found here.

Briefly, you can do

    BEGIN {
        push @INC, \&somesub;
        sub somesub {
            my ($owncoderef, $wanted_file) = @_;
            # Produce a filehandle somehow
            if ($got_a_handle) {
                return *FH;
            } else {
                return undef;
            }
        }
    }

and have your subroutine intercept calls to use. The ByteCache module on CPAN makes use of this to cache just-in-time compiled versions of modules.
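To fill in the "produce a filehandle somehow" step, here is one hedged possibility: keep the module source in a hash and hand Perl an in-memory filehandle. The hash %virtual_files and the module name Hypothetical::InMemory are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical in-memory "module tree": file name => source text.
my %virtual_files = (
    'Hypothetical/InMemory.pm' => <<'END_SOURCE',
package Hypothetical::InMemory;
sub greeting { return 'hello from memory' }
1;
END_SOURCE
);

push @INC, sub {
    my ($own_coderef, $wanted_file) = @_;
    # Returning nothing declines the request; Perl tries the next @INC entry.
    return unless exists $virtual_files{$wanted_file};
    open my $fh, '<', \$virtual_files{$wanted_file}
        or die "Cannot open in-memory source: $!";
    return $fh;
};

require Hypothetical::InMemory;
my $greeting = Hypothetical::InMemory::greeting();
print "$greeting\n";
```

The hook is installed at run time here because require also runs at run time; to intercept a compile-time use statement, the push has to happen inside a BEGIN block, as in the snippet above.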

More Memory Leak Hunting

Some people started complaining about the lateness of 5.6.1, and Alan mentioned that he probably wouldn't be able to ship 5.6.1 in Solaris yet because of what he saw as "the large number of leaks". Sarathy disagreed:

I strongly suspect you'll end up shipping no version of Perl with Solaris, then. Every single version of Perl out there has more unfixed "leaks" than 5.6.1-to-be, and some "real" ones to boot.

I say "leaks" because these are still totally hypothetical, given your vantage point of a -DPURIFY enabled build with the arena cleanup repressed (which is the right vantage point for someone who has set out to clean up all leaks, I should add). However, this is not the real world. In the real world, the arena cleanup is enabled and appears to do its thing (however ugly you or I say it is).

There was then some similarly ugly debate about what actually constitutes a leak: Alan considered a leak anything which allocated memory and lost the pointer to it; Sarathy was only considering monotonic process growth. Nick Clark suggested that he could trigger a leak in the Sarathy sense by repeatedly useing modules and clearing out %INC, but this wasn't the case; the leak is due to some ugly fakery that goes on when the compiler sees use Module;. Alan could, however, trigger a leak with

    sub X { sub {} }

as the inner subroutine wasn't being properly reference counted. Alan and I scrambled around looking for the use leak, and Alan found that the two were related. I'm not aware as to whether or not he's fixed it.

In other news, Nicholas Clark is a wicked, wicked man and managed to compile the Boehm Garbage Collector into Perl (as Randal pointed out off-list, no more Boehm garbage!) and found a way to use it as a memory leak detector.

Weird Memory Corruption

Weird bug of the week came from Jarkko, who found that

    $ENV{BAR} = 0;
    reset;
    if (0) {
      if ("" =~ //) {
      }
    }

caused all kinds of merry hell - on some platforms, it ran fine, on some it segfaulted; the problems were not consistent across platforms, meaning that some machines with identical setups produced differing results. This is obviously maddening. Nick Clark made it even weirder:

./perl will pass. /usr/local/bin/perl will SEGV. They are byte for byte identical.

Jarkko thought this was a recent problem, but Nick managed to reproduce it in 5.005_02. Alan produced an impressive explanation of what was going on, which I greatly encourage you to read if you want to learn how to track this sort of thing down, but stopped short of an actual fix.

There was, of course, the usual discussion of how useless reset was anyway, including one suggestion of rewriting the op in pure Perl.

Yet More Unicode Wars

87 messages this week were spent attempting to formulate a sensible and acceptable Unicode policy. The attempt failed. If you really want to jump in and have a look, this is as good a place to start as any.

Switch is broken

Jarkko reported that for some reason, Switch 2.01 from CPAN has suddenly started failing tests on bleadperl. It would be really, really, really great if someone out there could look into why this is happening and try to come up with an isolated bug report. Or even better, fix it.

Various

Olaf Flebbe chimed in with a bunch of EPOC fixes, for those of you running Perl on your Psions; Sarathy fixed a long-standing parser bug. Michael Stevens did some sterling work clearing up the POD markup of the documentation. Craig Berry turned in some updates to VMS's configure.com. Daniel Stutz and Ed Peschko both rewrote perlcc.

David Mitchell deserves an honourable mention for a really useful first patch, which lets perl -Dt tell you which variables are being accessed, as well as another debug option, -DR which tells you the reference counts of SVs in the stack. Very cool stuff, David, thanks.

Someone reported that ExtUtils::Install is naughty and doesn't check the return values of File::Copy::copy; this would be easy enough to fix up if anyone out there is interested. (That's bug ID 20010227.005, by the way)
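For anyone tempted to pick this up, the fix amounts to checking the return value, since File::Copy::copy returns false on failure and leaves the reason in $!. A minimal sketch, with a hypothetical wrapper name and made-up file names:

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Copy $from to $to and die with the system error if the copy fails.
sub copy_or_die {
    my ($from, $to) = @_;
    copy($from, $to) or die "Cannot copy $from to $to: $!\n";
    return 1;
}

# The file names are invented; the source file is assumed not to exist.
my $ok = eval { copy_or_die('no-such-file.txt', 'copy.txt'); 1 };
print "Copy failed: $@" unless $ok;
```

An installer that ignores this return value can report success while leaving files missing, which is what makes the bug worth fixing.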

Until next week I remain, your humble and obedient servant,


Simon Cozens