Recently in Web Development Category

Using Amazon S3 from Perl

Data management is a critical and challenging aspect for any online resource. With exponentially growing data sizes and popularity of rich media, even small online resources must effectively manage and distribute a significant amount of data. Moreover, the peace of mind associated with an additional offsite data storage resource is invaluable to everyone involved.

At SundayMorningRides.com, we manage a growing inventory of GPS and general GIS (Geography Information Systems) data and web content (text, images, videos, etc.) for the end users. In addition, we must also effectively manage daily snapshots, backups, as well as multiple development versions of our web site and supporting software. For any small organization, this can add up to significant costs -- not only as an initial monetary investment but also in terms of ongoing labor costs for maintenance and administration.

Amazon Simple Storage Service (S3) was released specifically to address the problem of data management for online resources -- with the aim to provide "reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites." Amazon S3 provides a web service interface that allows developers to store and retrieve any amount of data. S3 is attractive to companies like SundayMorningRides.com as it frees us from upfront costs and the ongoing costs of purchasing, administration, maintenance, and scaling our own storage servers.

This article covers the Perl, REST, and the Amazon S3 REST module, walking through the development of a collection of Perl-based tools for UNIX command-line based interaction to Amazon S3. I'll also show how to set access permissions so that you can serve images or other data directly to your site from Amazon S3.

A Bit on Web Services

Web services have become the de-facto method of exposing information and, well, services via the Web. Intrinsically, web services provide a means of interaction between two networked resources. Amazon S3 is accessible via both Simple Object Access Protocol (SOAP) or representational state transfer (REST).

The SOAP interface organizes features into custom-built operations, similar to remote objects when using Java Remote Method Invocation (RMI) or Common Object Resource Broker Architecture (CORBA). Unlike RMI or CORBA, SOAP uses XML embedded in the body of HTTP requests as the application protocol.

Like SOAP, REST also uses HTTP for transport. Unlike SOAP, REST operations are the standard HTTP operations -- GET, POST, PUT, and DELETE. I think of REST operations in terms of the CRUD semantics associated with relational databases: POST is Create, GET is Retrieve, PUT is Update, and DELETE is Delete.

"Storage for the Internet"

Amazon S3 represents the data space in three core concepts: objects, buckets, and keys.

  • Objects are the base level entities within Amazon S3. They consist of both object data and metadata. This metadata is a set of name-attribute pairs defined in the HTTP header.
  • Buckets are collections of objects. There is no limit to the number of objects in a bucket, but each developer is limited to 100 buckets.
  • Keys are unique identifiers for objects.

Without wading through the details, I tend think of buckets as folders, objects as files, and keys as filenames. The purpose of this abstraction is to create a unique HTTP namespace for every object.

I'll assume that you have already signed up for Amazon S3 and received your Access Key ID and Secret Access Key. If not, please do so.

Please note that the S3::* modules aren't the only Perl modules available for connecting to Amazon S3. In particular, Net::Amazon::S3 hides a lot of the details of the S3 service for you. For now, I'm going to use a simpler module to explain how the service works internally.

Connecting, Creating, and Listing Buckets

Connecting to Amazon S3 is as simple as supplying your Access Key ID and your Secret Access Key to create a connection, called here $conn. Here's how to create and list the contents of a bucket as well as list all buckets.

#!/usr/bin/perl

use S3::AWSAuthConnection;
use S3::QueryStringAuthGenerator;

use Data::Dumper;

my $AWS_ACCESS_KEY_ID     = 'YOUR ACCESS KEY';
my $AWS_SECRET_ACCESS_KEY = 'YOUR SECRET KEY';

my $conn = S3::AWSAuthConnection->new($AWS_ACCESS_KEY_ID,
                                      $AWS_SECRET_ACCESS_KEY);

my $BUCKET = "foo";

print "creating bucket $BUCKET \n";
print $conn->create_bucket($BUCKET)->message, "\n";

print "listing bucket $BUCKET \n";
print Dumper @{$conn->list_bucket($BUCKET)->entries}, "\n";

print "listing all my buckets \n";
print Dumper @{$conn->list_all_my_buckets()->entries}, "\n";

Because every S3 action takes place over HTTP, it is good practice to check for a 200 response.

my $response = $conn->create_bucket($BUCKET);
if ($response->http_response->code == 200) {
    # Good
} else {
    # Not Good
}

As you can see from the output, the results come back in a hash. I've used Data::Dumper as a convenient way to view the contents. If you are running this for the first time, you will obviously not see anything listed in the bucket.

listing bucket foo
$VAR1 = {
          'Owner' => {
                     'ID' => 'xxxxx',
                     'DisplayName' => 'xxxxx'
                   },
          'Size' => '66810',
          'ETag' => '"xxxxx"',
          'StorageClass' => 'STANDARD',
          'Key' => 'key',
          'LastModified' => '2007-12-18T22:08:09.000Z'
        };
$VAR4 = '
';
listing all my buckets
$VAR1 = {
          'CreationDate' => '2007-11-28T17:31:48.000Z',
          'Name' => 'foo'
        };
';

Writing an Object

Writing an object is simply a matter of using the HTTP PUT method. Be aware that there is nothing to prevent you from overwriting an existing object; Amazon S3 will automatically update the object with the more recent write request. Also, it's currently not possible to append to or otherwise modify an object in place without replacing it.

my %headers = (
    'Content-Type' => 'text/plain'
);
$response = $conn->put( $BUCKET, $KEY, S3Object->new("this is a test"),
                        \%headers);

Likewise, you can read a file from STDIN:

my %headers;

FILE: while(1) {
    my $n = sysread(STDIN, $data, 1024 * 1024, length($data));
    if ($n < 0) {
        print STDERR "Error reading input: $!\n";
        exit 1;
    }
    last FILE if $n == 0;
}
$response = $conn->put("$BUCKET", "$KEY", $data, \%headers);

To add custom metadata, simply add to the S3Object:

S3Object->new("this is a test", { name => "attribute" })

By default, every object has private access control when written. This allows only the user that stored the object to read it back. You can change these settings. Also, note that each object can hold a maximum of 5 GB of data.

You are probably wondering if it is also possible to upload via a standard HTTP POST. The folks at Amazon are working on it as we speak -- see HTTP POST beta discussion for more information. Until that's finished, you'll have to perform web-based uploads via an intermediate server.

Reading an Object

Like writing objects, there are several ways to read data from Amazon S3. One way is to generate a temporary URL to use with your favorite client (for example, wget or Curl) or even a browser to view or retrieve the object. All you have to do is generate the URL used to make the REST call.

my $generator = S3::QueryStringAuthGenerator->new($AWS_ACCESS_KEY_ID,
    $AWS_SECRET_ACCESS_KEY);

...and then perform a simple HTTP GET request. This is a great trick if all you want to do is temporarily view or verify the data.

$generator->expires_in(60);
my $url = $generator->get($BUCKET, "$KEY");
print "$url \n";

You can also programmatically read the data directly from the initial connection. This is handy if you have to perform additional processing of the data.

my $response = $conn->get("$BUCKET", "$KEY");
my $data     = $response->object->data;

Another cool feature is the ability to use BitTorrent to download files from Amazon S3 . You can access any object that has anonymous access privileges via BitTorrent.

Delete an Object

By now you probably have the hang of the process. If you're going to create objects, you're probably going to have to delete them at some point.

$conn->delete("$BUCKET", "$KEY");

Set Access Permissions and Publish to a Website

As you may have noticed from the previous examples, all Amazon S3 objects access goes through HTTP. This makes Amazon S3 particularly useful as a online repository. In particular, it's useful to manage and serve website media. You could almost imagine Amazon S3 serving as mini Content Delivery Network for media on your website. This example will demonstrate how to build a very simple online page where the images are served dynamically via Amazon S3.

The first thing to do us to upload some images and set the ACL permissions to public. I've modified the previous example with one difference. To make objects publicly readable, include the header x-amz-acl: public-read with the HTTP PUT request.

my %headers = (
    'x-amz-acl' => 'public-read',
);

Additional ACL permissions include:

  • private (default setting if left blank)
  • public-read
  • public-read-write
  • authenticated-read

Now you know enough to put together a small script that will automatically display all images in the bucket to a web page (you'll probably want to spruce up the formatting).

...
my $BUCKET   = "foobar";
my $response = $conn->list_bucket("$BUCKET");

for my $entry (@{$response->entries}) {
    my $public_url   = $generator->get($BUCKET, $entry->{Key});
    my ($url, undef) = split (/\?/, $public_url);
    $images         .= "<img src=\"$url\"><br />";
}
($webpage =  <<"WEBPAGE");
<html><body>$images</body></html>
WEBPAGE
print $q->header();
print $webpage;

To add images to this web page, upload more files into the bucket and they will automatically appear the next time you load the page.

It's also simple to link to media one at a time for a webpage. If you examine the HTML generated by this example, you'll see that all Amazon S3 URLs have the basic form http://bucketname.s3.amazon.com/objectname. Also note that the namespace for buckets is shared with all Amazon S3 users. You may have already picked up on this.

Conclusion

Amazon S3 is a great tool that can help with the data management needs of all sized organizations by offering cheap and unlimited storage. For personal use, it's a great tool for backups (also good for organizations) and general file storage. It's also a great tool for collaboration. Instead of emailing files around, just upload a file and set the proper access controls -- no more dealing with 10 MB attachment restrictions!

At SundayMorningRides.com we use S3 as part of our web serving infrastructure to reduce the load on our hardware when serving media content.

When combined with other Amazon Web Services such as SimpleDB (for structured data queries) and Elastic Compute Cloud (for data processing) it's easy to envision a low cost solution for web-scale computing and data management.

More Resources and References

Reverse Callback Templating

Programmers have long recognized that separating code logic from presentation is good. The Perl community has produced many fine systems for doing just this. While there are many systems, they largely fall within two execution models, pipeline and callback (as noted by Perrin Harkins in Choosing a Templating System). HTML::Template and Template Toolkit are in the pipeline category. Their templates consist of simple presentation logic in the form of loops and conditionals and template variables. The Perl program does its work, then loads and renders the appropriate template, as if data were flowing through a pipeline. Mason and Embperl fall into the callback category. They mix code in with the template markup, and the template "calls back" to Perl when it encounters program logic.

A third execution model exists: the reverse callback model. Template and code files are separate, just like in the pipeline approach. Instead of using a mini-language to handle display logic, however, the template consists of named sections. Perl executes and calls a specific section of the template at the appropriate time, rendering it. Effectively, this is the opposite of the callback method, which wraps Perl logic around portions (or sections) of a template in a single file. Reverse callback uses Perl statements to load, or call, specific portions of the the template. This approach has a few distinct advantages.

A Reverse Callback Example

Suppose that you have a simple data structure you are dying to output as pretty HTML.

my @goods = (
    "oxfords,Brown leather,\$85,0",
    "hiking,All sizes,\$55,7",
    "tennis shoes,Women's sizes,\$35,15",
    "flip flops,Colors of the rainbow,\$7,90"
    );

First, you need an HTML template with the appropriate sections defined. Sections are of vital importance; they enable Template::Recall to keep the logic squarely in the code. Template::Recall uses the default pattern /[\s*=+\s*\w+\s*=+\s*]/ (to match, for example, [==== section_name ====]) to determine sections in a single file. The start of one section denotes the end of another. This is because Template::Recall uses a split() operation based on the above regex, saving the \w+ as the section key in an internal data structure.

[ =================== header ===================]

<html>
<head>
    <title>my site - [' title ']</title>
</head>
<body>

<h4>The date is [' date ']</h4>



<table border="1">

    <tr>
        <th>Shoe</th>
        <th>Details</th>
        <th>Price</th>
        <th>Quantity</th>
    </tr>

[ =================== product_row =================== ]
    <tr>
        <td>[' shoe ']</td>
        <td>[' details ']</td>
        <td>[' price ']</td>
        <td>[' quantity ']</td>
    </tr>


[= footer =]
</table>

</body>
</html>

This template is quite simple. It has three sections, a "header," "product_row," and "footer." The sections essentially give away how the program logic is going to work. A driver program would call header and footer only once during program execution (start and end, respectively). product_row will be called multiple times during iteration over an array.

Names contained within the delimeters [' and '] are template variables for replacement during rendering. For example, [' date '] will be replaced by the current date when the program executes.

The driver code must first instantiate a new Template::Recall object, $tr, and pass it the path of the template, which I've saved as the file template1.html.

use Template::Recall;

my $tr = Template::Recall->new( template_path => 'template1.html');

With $tr created, the template sections are loaded and ready for use. The obvious first step is to render the header section with the render() method. render() takes the name of the section to process, and optionally, a hash of names and values to replace in that section. There are two template variables in the header section, [' title '] and [' date '], so the call looks like:

print $tr->render( 'header', { title => 'MyStore', date => scalar(localtime) } );

The names used in the hash must match the names of the template variables in the section you intend to render. For example, date => scalar(localtime) means that [' date '] in the header section will be dynamically replaced by the value produced by scalar(localtime).

You probably noticed from the template that the header section created the start of an HTML table. This is a fine time to render @goods as the table's rows.

for my $good (@goods)
{
    my @attr     = split(/,/, $good);
    my $quantity = $attr[3] eq '0' ? 'Out of stock' : $attr[3];

    my %row      = (
        shoe     => $attr[0],
        details  => $attr[1],
        price    => $attr[2],
        quantity => $quantity,
    );

    print $tr->render('product_row', \%row);
}

In actual code, this array would likely come from a database. For each row, the driver makes necessary logical decisions (such as displaying "Out of stock" if the quantity equals "0"), then calls $tr->render() to replace the placeholders in the template section with the values from %row.

Finally, the driver renders the footer of the HTML output. There are no template variables to replace, so there's no need to pass in a hash.

print $tr->render('footer');

The result is this nice little output of footwear inventory:

The date is Fri Aug 10 14:22:30 2007

Shoe Details Price Quantity
oxfords Brown leather $85 Out of stock
hiking All sizes $55 7
tennis shoes Women's sizes $35 15
flip flops Colors of the rainbow $7 90

The Logic Is in the Code

What happens if you extend your shoe data slightly, to add categories? For instance, what if @goods looks like:

my @goods = (
    "dress,oxfords,Brown leather,\$85,0",
    "sports,hiking,All sizes,\$55,7",
    "sports,tennis shoes,Women's sizes,\$35,15",
    "recreation,flip flops,Colors of the rainbow,\$7,90"
    );

The output now needs grouping, which implies the use of nested loops. One loop can output the category header -- sports, dress, or recreation shoes -- and another will output the details of each shoe in that category.

To handle this in HTML::Template, you would generally build a nested data structure of anonymous arrays and hashes, and then process it against nested <TMPL_LOOP> directives in the template. Template::Recall logic remains in the code, you would build a nested loop structure in Perl that calls the appropriate sections. You can also use a hash to render the category sections as keys and detail sections as values in a single pass, and output them together using join.

The template needs some modification:

[====== table_start ====]
<table border="1">
[====== category =======]
<tr><td colspan="4"><b>['category']</b></td></tr>
[====== detail ======]
<tr><td>['shoe']</td><td>['detail']</td><td>['price']</td><td>['quantity']</td></tr>
[======= table_end ====]
</table>

This template now has a section called "category," a single table row that spans all columns. The "detail" section is pretty much the same as in the previous.

my %inventory;

for my $good (@goods) {
    my @attr = split(/,/, $good);
    my $q    = $attr[4] == 0 ? 'Out of stock' : $attr[4];

    $inventory{ $tr->render('category', { category => $attr[0] } ) } .=
        $tr->render('detail',
            {
                shoe     => $attr[1],
                detail   => $attr[2],
                price    => $attr[3],
                quantity => $q,
            } );
}

print $tr->render('table_start') .
    join('', %inventory) .
    $tr->render('table_end');

This loop looks surprisingly similar to the first example, doesn't it? That's because it is. Instead of printing each row, however, this code renders the first column in @goods against the category template section, and then storing the output as a key in %inventory. In the same iteration, it renders the remaining columns against the detail section and appends to the value of that key.

After storing the rendered sections in this way to %inventory, the code prints everything with a single statement, using join to print all the values in %inventory, including keys. The output is:

recreation
flip flops Colors of the rainbow $7 90
sports
hiking All sizes $55 7
tennis shoes Women's sizes $35 15
dress
oxfords Brown leather $85 Out of stock

The code also handles conditional output. Suppose that at your growing online shoe emporium you provide special deals to customers who have bought over a certain dollar amount. As they browse your shoe inventory, these deals appear.

if ( $customer->is_elite ) {
    print $tr->render('special_deals', get_deals('elite') );
}
else {
    print $tr->render('standard_deals', get_deals() );
}

What about producing XML output? This usually requires a separate template? You can conditionally load a .xml or .html template:

my $tr;
if ( $q->param('fmt') eq 'xml' ) {
    $tr = Template::Recall->new( template_path => 'inventory.xml' );
}
else {
    $tr = Template::Recall->new( template_path => 'inventory.html' );
}

Perl provides everything you need to handle model, controller, and view logic. Template::Recall capitalizes on this and helps to make projects code driven.

Template Model Comparison

It's important to note a few things that occurred in these examples -- or failed to occur, rather. First, there's no mixture of code and template markup. All template access occurs through the method call $tr->render(). This is strong separation of concerns (SOC), just like the pipeline model, and unlike the callback model, which mixes template markup and code in the same file. Not only does strong SOC provide good code organization, it also keeps designers from having to sift through code to change markup. Consider using Mason to output the rows of @goods.

% for my $good (@goods) {
%  my @attr     = split(/,/, $good);
%  my $quantity = $attr[3] eq '0' ? 'Out of stock' : $attr[3];
<tr>
<td><% $attr[0] %></td>
<td><% $attr[1] %></td>
<td><% $attr[2] %></td>
<td><% $quantity %></td>
</tr>
% }

This is an efficient approach, and easy enough for a programmer to walk through. It becomes difficult to maintain though, when designers are involved, if for no other reason than because a designer and a programmer need to access the same file to do their respective work. Design changes and code changes will not always share the same schedule because they belong to different domains. It also means that in order to switch templates, say to output XML or text (or both), you have to add more and more conditionals and templates to the code, making it increasingly difficult to read.

The other thing that did not occur in this example is the leaking of any kind of logic (presentation or otherwise) into the template. Consider that HTML::Template would have to insert the <TMPL_LOOP> statement in the template in order to output the rows of @goods.

    <TMPL_LOOP NAME="PRODUCT">
    <tr>
    <td><TMPL_VAR NAME=SHOE></td>
    <td><TMPL_VAR NAME=DETAILS></td>
    <td><TMPL_VAR NAME=PRICE></td>
    <td><TMPL_VAR NAME=QUANTITY></td>
    </tr>
    </TMPL_LOOP>

That's not a big deal, really. If you care about line count, this only requires one extra line over the Template::Recall version, and that's the the closing tag </TMPL_LOOP>. Nonetheless, the template now states some of the logic for the application. Sure, it's only presentation logic, but it's logic nonetheless. HTML::Template also provides <TMPL_IF> for displaying items conditionally, and <TMPL_INCLUDE> for including other templates. Again, this is logic contained in the template files.

Template::Recall keeps as much logic as possible in the code. If you need to display something conditionally, use Perl's if statement. If you need to include other templates, load them using a Template::Recall object. Whereas the pipeline models likely work better for projects with a fairly sophisticated design team, Template::Recall tries to be the programmer's friend and let him or her steer from the most comfortable place, the code.

There is also a subtle cost to using the pipeline model for a simple loop like that above. Consider this HTML::Template footwear data code:

my $template = HTML::Template->new(filename => template1.tmpl');

my @output;

for my $good (@goods)
{
    my @attr = split(/,/, $_);
    my %row  = (
        SHOE     => $attr[0],
        DETAILS  => $attr[1],
        PRICE    => $attr[2],
        QUANTITY => $attr[3],
    );
    push( @output, \%row );
}

$template->param(PRODUCT => \@output);

print $template->output();

The code iterates over @goods and builds a second array, @output, with the rows as hash references. Then the template iterates over @output within <TMPL_LOOP>. That's walking over the same data twice. Template sections do not suffer this cost, because you can output the data immediately, as you get it:

print $tr->render('product_row', \%row);

This is essentially what happens with Mason (or JSP/PHP/ASP for that matter). The main difference is that Template::Recall renders the section through a method call rather than mixing code and template.

Template::Recall, by using sectioned templates, combines the efficiency of the callback model with the strong, clean separation of concerns inherent in the pipeline model, and perhaps gets the best of both worlds.

Advanced HTML::Template: Widgets

My previous article, looked at extending HTML::Template through custom tags and filters. This article looks at ways to manage large, more complex pages, by bundling HTML::Template into something like GUI "widgets" (or "controls").

Imagine you have a basic page layout following the standard setup, with a header, a lefthand navbar, and the main body in the bottom right. The header and navbar are the same for all pages of the site, but of course the main body differs from page to page:

Header
Navbar

Body

Naturally, you don't want to repeat the information for the header and the navbar explicitly on each page. Furthermore, if the HTML for the navbar changes, you don't want to have to modify each and every page. The <TMPL_INCLUDE> tag can help in this situation.

Create separate files for header and navbar, then include them in the template for each page (by convention, I use the filename extension .tpf for page fragments, to distinguish them from full-page templates: .tpl):

<html>
  <head></head>
  <body>
    <table>
    <tr colspan="2">
      <td><TMPL_INCLUDE NAME=header.tpf></td>
    </tr>
    <tr>
      <td><TMPL_INCLUDE NAME=navbar.tpf></td>
      <td> 
        <!-- Body goes here! -->
        ...
      </td>
    </tr>
    </table>
  </body>
</html>

Now HTML::Template will include the page fragments for the header and navbar in the page when it evaluates the template. Changes to either of the fragments will affect the entire site immediately.

(For simplicity of presentation, I am going to use the old familiar <table> method to fix the layout--this article is about HTML::Template, not CSS positioning!)

Note that both the header and the navbar may include other HTML::Template tags, such as <TMPL_VAR> or <TMPL_LOOP>: file fragment inclusion occurs before tag substitution. If you need dynamic content in either header or navbar, all you need to do is set the value of the corresponding parameter using the param() function before evaluating the template.

Better Encapsulation through Widgets

If there are only a few dynamic parameters in header and navbar, you can simply assign values to them together with the parameters required by the main body of the page. However, if the header and navbar themselves become sufficiently complicated, you probably don't want to repeat their parameter-setting logic with the actual Perl code managing the main business logic for each page of our site. Instead, you can control them through an API.

To establish a Perl API for a page fragment, hide the entire template handling, including parameter setting and template substitution, in a subroutine. The subroutine takes several parameters and returns a string containing the fully expanded HTML code corresponding to the page fragment. You can then include this string in the current page through a simple <TMPL_VAR> tag.

As an example, consider a navbar that contains the username of the currently logged-in user.

Here is the page-fragment template for the navbar:

Current User: <TMPL_VAR NAME=login>
<br />
<ul>
  <li><a href="page1.html">Page 1</a>
  <li><a href="page2.html">Page 2</a>
  <li><a href="page3.html">Page 3</a>
</ul>

For this example, the corresponding subroutine is very simple. It's easy to imagine a situation where the navbar requires some complex logic that you are glad to hide behind a function call--for instance, when the selection of links depends on the permissions (found through a DB call) of the logged-in user.

Demonstrating the principle is straightforward; find the template fragment, set the required parameter, and render the template:

sub navbar { 
  my ( $login ) = shift;

  my $tpl = HTML::Template->new( filename => 'navbar.tpf' );
  $tpl->param( login => $login );
  return $tpl->output();
}

The master-page template then includes the navbar string using a <TMPL_VAR> tag. (Note the header inclusion through a <TMPL_INCLUDE> tag.)

<html>
  <head></head>
  <body>
    <table>
    <tr colspan="2">
      <td><TMPL_INCLUDE NAME=header.tpf></td>
    </tr>
    <tr>
      <td><TMPL_VAR NAME=navbar></td>
      <td> 
        <!-- Body goes here! -->
        ...
      </td>
    </tr>
    </table>
  </body>
</html>

This approach provides pretty good encapsulation: the code calling the navbar() routine does not need to know anything about its implementation--in fact, the subroutine can be in a separate module entirely. It is not far-fetched to imagine a shared module for all reusable page fragments used on the site.

Building Pages Inside Out

This development model still uses a separate, top-level template file for each page. All shared parts of the page are then included in this master template.

The widget approach can go a step further to do away entirely with the notion of having a separate master template for each page, by turning even the main body of the page into a widget or a collection of widgets. At this point, there may be only a single top-level template:

<html>
  <head></head>
  <body>
    <table>
    <tr colspan="2">
      <td><TMPL_INCLUDE NAME=header.tpf></td>
    </tr>
    <tr>
      <td><TMPL_VAR NAME=navbar></td>
      <td><TMPL_VAR NAME=mainbody></td>
    </tr>
    </table>
  </body>
</html>

A central Perl "controller" component dispatches page requests to the appropriate main-body widget (assuming you specify destination pages through a request parameter called action):

use CGI;
use HTML::Template;

my $q = CGI->new;
my $action = $q->param( 'action' );

# Dispatch to the desired main function
my $body_string = '';
if(    $action eq 'act1' ) { $body_string = act1( $q ); }
elsif( $action eq 'act2' ) { $body_string = act2( $q ); }
elsif( $action eq 'act3' ) { $body_string = act3( $q ); }

# Pull the current user from the query object and pass to the navbar
my $navbar_string = navbar( $q->param( 'login' ) );

# Set the rendered navbar and mainbody in the master template
my $tpl = HTML::Template->new( filename => 'tmpl3.tpl' );
$tpl->param( mainbody => $body_string );
$tpl->param( navbar   => $navbar_string );

print $q->header(), $tpl->output;

sub navbar { ... }

sub act1{ ... }
sub act2{ ... }
sub act3{ ... }

A Drop-down Widget

At this point, you may ask why you still need a template for the page fragment at all. Well, you don't--unless you find it convenient, of course.

There are two reasons to use a template: as a more suitable method of generating HTML than having to program a whole bunch of print statements, and to ensure separation of presentation from behavior. By encapsulating the nitty-gritty of HTML generation behind an API, you achieve the latter. How you go about the former depends entirely on the context. If you need to generate a lot of straight up HTML, with lots of <table>, <img>, and <form> tags, a template fragment makes perfect sense. But if it is actually easier to program the print statements yourself, there is nothing wrong with that--by encapsulating the HTML generation in a subroutine, you still achieve separation of presentation and main control flow and business logic.

As a classic example for something that is hard to express as a template, consider a drop-down menu with a default that has to be set programmatically. Attempting to do this using a template leads to a mess of template loops and conditionals. However, doing it in a widget subroutine is clean and easy, in particular if you use the appropriate functions from the standard CGI module:

sub color_select {
  my ( $default_color ) = @_;

  my @colors = qw( red green blue yellow cyan magenta );

  return popup_menu( 'color_select', \@colors, $default_color );
}

Include the string returned by this subroutine in a page template using <TMPL_VAR> tags as discussed previously.

Conclusion

This concludes a brief overview of some useful techniques for using HTML::Template that go beyond straight up variable replacement. I hope you enjoyed the trip.

There remains the question of when all of this is useful and suitable. To me, the beauty of HTML::Template is its utter simplicity. There isn't much in the way of creature comforts or "framework" features (such as forms processing or automated request dispatch). On the other hand, there is virtually no overhead: it's possible to understand the basic ideas of HTML::Template in five minutes or fewer, and it's easy to add to any simple CGI script. You don't need to design the project around the framework (as is often the case with more powerful but inevitably more complex toolsets). In fact, HTML::Template is so trivial to use that I use it in any CGI script that produces more than, say, 10 to 15 lines of HTML output. It's just convenient.

Filters and "widgets" as described in this series are easy ways to add some convenience features that are missing from HTML::Template. By bundling some repetitive code segments into a custom tag or a widget, you can keep both the code and template cleaner and simpler while at the same time continuing to enjoy the low overhead of HTML::Template.

However, there is only so much lipstick you can put on a pig. When you find ourselves building extensive libraries of custom tags or specific widgets, maybe you want to look more deeply into one of the existing frameworks for Perl web development, such as the Template Toolkit.

Of course, as long as you are happy with HTML::Template and it works for you, there is no reason to change. It works for me--and very well indeed.

Sidebar: Three Hidden Gems

HTML::Template has several useful and often overlooked minor features and options, (despite being clearly documented in POD). I want to point out three of the ones that are most commonly useful--all of which, by default, are (unfortunately, I think) turned off.

Permit Unused Parameters
my $tpl = HTML::Template->new( filename => '...', die_on_bad_params => 0 );

HTML::Template will die when it encounters an unused parameter in the template. In other words, if you set a parameter with $tpl->param(), but there is no corresponding <TMPL_VAR> in the template, template processing will fail by default. Setting the option die_on_bad_params to 0 disables this behavior.

Make Global Variables Visible In Loops
my $tpl = HTML::Template->new( filename => '...', global_vars => 1 );

By default, template loops open up a new scope, which makes all template parameters from outside the loop invisible within the loop. In particular, code like this will (by default) not work as expected:

Verbosity level: <TMPL_VAR NAME=isVerbose>
<ul>
<TMPL_LOOP NAME=rows>
  <li><TMPL_VAR NAME=rowitem>
    <TMPL_IF NAME=isVerbose>
      ... <!-- print additional, 'verbose' info -->
    </TMPL_IF>
</TMPL_LOOP>
</ul>

The verbose parameter, defined outside the loop scope will by default not be visible within the loop. Change this by setting the option global_vars to 1.

Special Loop Variables
my $tpl = HTML::Template->new( filename => '...', loop_context_vars => 1 );

Finally, enable a very useful little feature by setting loop_context_vars to 1. This defines several Boolean variables within each loop; they take on the appropriate value for each row:

  • __first__
  • __inner__
  • __last__
  • __odd__
  • __even__

There is also an integer variable __counter__, which is incremented for each row. Note that __counter__ starts at 1, in contrast to Perl arrays!

These variables are extremely useful in a variety of ways. For example, they make it easy to give every other row in a table a different background color to improve legibility. Together with filters (as described in my previous article), this allows for rather elegant template code:

<html>
<head>
  <style type="text/css">
    .odd  { background-color: yellow }
    .even { background-color: cyan }
  </style>
</head>

<body>
  <table>
  <TMPL_LOOP NAME=rows>
    <CSTM_ROW EVEN=even ODD=odd>
      <td> <TMPL_VAR NAME=__counter__> Cell contents... </td>
    </tr>
  </TMPL_LOOP>
  </table>
</body>
</html>

The appropriate filter for the new custom tag is:

sub cstmrow_filter {
  my $text_ref = shift;
  $$text_ref =~ s/<CSTM_ROW\s+EVEN=(.+)\s+ODD=(.*)\s*>
                 /<TMPL_IF NAME=__odd__>
                    <tr class="$1">
                  <TMPL_ELSE>
                    <tr class="$2">
                  <\/TMPL_IF>
                 /gx;
}

Note that, as implemented, the EVEN attribute must precede the ODD attribute in the <CSTM_ROW> tag.

Visit the home of the Perl programming language: Perl.org

Sponsored by

Powered by Movable Type 5.02