Programming with Mason

Dec 11, 2002 by Dave Rolsky

Dave Rolsky and Ken Williams are the authors of Embedding Perl in HTML with Mason.

Mason is a powerful framework for generating dynamic text, and is especially useful for building complex, feature-rich Web sites. For those (hopefully few) folks who haven’t yet heard of Mason, it is a Perl-based templating framework comparable to Apache::ASP, Embperl, and Template Toolkit. Like the first two, and unlike the last, Mason operates by embedding Perl in text.

Mason is based around the idea of a component. A component is roughly equivalent to a Perl subroutine, and can contain text and/or code. Here is a very simple, but complete component that has both text and code:

 % my $planet = "World";
 Hello, <% $planet %>!

When Mason runs this code, the output is:

 Hello, World!
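
To make the subroutine analogy concrete, here is a rough plain-Perl counterpart of the component above. It is only an illustration: the name hello_component is made up, and Mason's compiled form of a component is more elaborate than this.

```perl
use strict;
use warnings;

# Hypothetical plain-Perl counterpart of the component above.
# The "%" line becomes ordinary code; the text line becomes output.
sub hello_component {
    my $planet = "World";
    return "Hello, $planet!\n";
}

print hello_component();
```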

The rest of this article assumes at least a minimal familiarity with Mason, though if you’re at all familiar with other templating systems, you’ll probably be able to grok the code we show. For more details, I would of course recommend Embedding Perl in HTML with Mason, written by Ken Williams and myself. Mason also comes with its own documentation, which can be seen online at www.masonhq.com.

As with any powerful and flexible system, Mason is applicable to a lot of problems, and there is always more than one way to do it. It is a Perl-based system, after all!

Below you’ll find some cookbook recipes for solving a few typical Web application problems. All the recipes assume that you are using the latest version of Mason, which at the time of this writing is 1.15, though most of them will work untouched with older versions.

Putting a Session ID in All URLs

If you’ve ever written a dynamic Web application, then it’s likely that you’ve used sessions to store data as the user moves through the application. Typically, sessions are identified by session IDs that are stored in a cookie.

If you cannot use cookies, then you can store the session ID in the URL. There are security and application problems with this approach (as well as with the use of cookies), but those are outside the scope of this article. The mod_perl user list archives at marc.theaimsgroup.com/?l=apache-modperl contain a number of discussions related to this topic.

Putting the session ID in the URL can be a hassle, because it means that you have to somehow process all the URLs you generate. Using Mason, this isn’t as difficult as it would be otherwise. There are at least two ways to do this.

The first would be to put a filter in your top level autohandler component:

  <%filter>
   s/href="([^"]+)"/'href="' . add_session_id($1) . '"'/eg;
   s/action="([^"]+)"/'action="' . add_session_id($1) . '"'/eg;
  </%filter>

The add_session_id() subroutine, which should be defined in a module, might look something like this:

  sub add_session_id {
      my $url = shift;

      return $url if $url =~ m{^\w+://}; # Don't alter external URLs

      if ($url =~ /\?/) {
          $url =~ s/\?/?session_id=$MasonBook::Session{_session_id}&/;
      } else {
          $url .= "?session_id=$MasonBook::Session{_session_id}";
      }

      return $url;
  }

This routine accounts for external links as well as links with or without an existing query string.
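
To see the routine and the link substitution working together outside Mason, the following sketch applies them to ordinary strings. The session ID value and the filter_links() wrapper are made up for the demonstration; inside a real filter block Mason supplies the component output in $_ instead.

```perl
use strict;
use warnings;

package MasonBook;
our %Session = ( _session_id => 'abc123' );    # made-up session ID

package main;

sub add_session_id {
    my $url = shift;

    return $url if $url =~ m{^\w+://};    # Don't alter external URLs

    if ( $url =~ /\?/ ) {
        $url =~ s/\?/?session_id=$MasonBook::Session{_session_id}&/;
    }
    else {
        $url .= "?session_id=$MasonBook::Session{_session_id}";
    }

    return $url;
}

# Hypothetical wrapper: applies the href substitution to a string,
# where the Mason filter would operate on $_.
sub filter_links {
    my $html = shift;
    $html =~ s/href="([^"]+)"/'href="' . add_session_id($1) . '"'/eg;
    return $html;
}

print add_session_id('books.html'), "\n";
print add_session_id('search.html?q=perl'), "\n";
print add_session_id('http://www.perl.com/'), "\n";
print filter_links('<a href="books.html">Books</a>'), "\n";
```

The three calls cover a URL without a query string, one with a query string, and an external URL that is left alone.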

The drawback to putting this in a <%filter> section is that it only filters URLs in the content generated by components, and misses any URLs that might be in headers, such as in a redirect. Therefore, you’d need to handle those cases separately with this solution.

Another solution would be to create all URLs (including those intended for redirects) via a dedicated component or subroutine that adds the session ID. This is probably the better idea, as it handles redirects properly. The drawback is that every link becomes a Mason component call instead of plain HTML.

Here is just such a component:

  <%args>
   $scheme   => 'http'
   $username => undef
   $password => ''
   $host     => undef
   $port     => undef
   $path
   %query    => ()
   $fragment => undef
  </%args>
  <%init>
   my $uri = URI->new;

   if ($host) {
       $uri->scheme($scheme);

       if (defined $username) {
           $uri->userinfo( "$username:$password" );
       }

       $uri->host($host);
       $uri->port($port) if $port;
   }

   # Sometimes we may want to include a query string as part of the
   # path, but the URI module will escape the question mark.
   my $q;

   if ( $path =~ s/\?(.*)$// ) {
       $q = $1;
   }

   $uri->path($path);

   # If there was a query string, we integrate it into the query
   # parameter.
   if ($q) {
       %query = ( %query, split /[&=]/, $q );
   }

   $query{session_id} = $UserSession{session_id};

   # $uri->query_form doesn't handle hash ref values properly
   while ( my ( $key, $value ) = each %query ) {
       $query{$key} = ref $value eq 'HASH' ? [ %$value ] : $value;
   }

   $uri->query_form(%query) if %query;

   $uri->fragment($fragment) if $fragment;
  </%init>
  <% $uri->canonical | n %>\

If you don’t want to put the session ID in the query string, then you might instead make it part of the URL path. The application could retrieve the session ID from incoming requests by using a mod_perl handler during the URL translation stage of request handling.
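
As a rough sketch of the path-based approach, the subroutine below separates a session ID from the rest of the path. The /s/<id> prefix and the name split_session_path are assumptions made for this example; a real setup would call something like it from a translation-phase handler (PerlTransHandler) and then adjust the request URI.

```perl
use strict;
use warnings;

# Split a path such as "/s/abc123/books.html" into the session ID and
# the path the application should actually serve. The "/s/<id>" prefix
# is an assumed convention, not something Mason or mod_perl defines.
sub split_session_path {
    my $path = shift;

    if ( $path =~ m{^/s/([A-Za-z0-9]+)(/.*)?$} ) {
        my ( $session_id, $rest ) = ( $1, $2 );
        return ( $session_id, defined $rest ? $rest : '/' );
    }

    return ( undef, $path );    # no session ID present
}

my ( $id, $rest ) = split_session_path('/s/abc123/books.html');
print "$id $rest\n";
```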

The URL-building component shown above provides a programmatic interface to URL generation. Here is an example of how to use it, assuming that you’ve saved it as a component called /url:

   ... some HTML ...
   Look at <a href="<& /url, path => "books.html" &>">our books</a>
   or <a href="<& /url, host => "www.oreilly.com",
                        path => "/catalog" &>">O'Reilly's</a>.
   ... some HTML ...
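
The way the component folds a query string from the path into its query argument can be tried on its own. This sketch wraps the same split expression in a helper named merge_query, which is made up for the example.

```perl
use strict;
use warnings;

# Merge a query string found in the path into an existing set of
# query parameters, using the same expression as the component.
sub merge_query {
    my ( $q, %query ) = @_;
    %query = ( %query, split /[&=]/, $q );
    return %query;
}

my %merged = merge_query( 'page=2&sort=title', sort => 'date', lang => 'en' );
print join( ', ', map { "$_=$merged{$_}" } sort keys %merged ), "\n";
```

Because the pairs from the path come last in the list, they override any parameter of the same name that was passed explicitly. The simple split does not decode escaped characters, and a trailing parameter with an empty value (such as "a=1&b=") leaves the list with an odd number of elements, so a production version would probably want a more careful parser.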

Making Use of Autoflush

Every once in a while, you may have to output a very large component or a file to the client. If you simply let this accumulate in the output buffer, you could use up a lot of memory. Furthermore, the slow response time may make the user think that the site has stalled.

Here is an example that sends out the contents of a potentially large file without sucking up lots of memory.

  <%args>
   $filename
  </%args>
  <%init>
   local *FILE;
   open FILE, "< $filename" or die "Cannot open $filename: $!";
   $m->autoflush(1);
   while (<FILE>) {
       $m->print($_);
   }
   $m->autoflush(0);
  </%init>

If the individual lines aren’t too large, then you might instead flush the buffer only every so often:

  <%args>
   $filename
  </%args>
  <%init>
   local *FILE;
   open FILE, "< $filename" or die "Cannot open $filename: $!";
   while (<FILE>) {
       $m->print($_);
       $m->flush_buffer unless $. % 10;
   }
   $m->flush_buffer;
  </%init>

The unless $. % 10 bit makes use of the special Perl variable $., which is the current line number of the file being read. If this number modulo 10 is equal to zero, then we flush the buffer. This means that we flush the buffer every 10 lines. (Replace the number 10 with any desired value.)
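
The flush-every-N-lines pattern can be checked outside Mason with an in-memory filehandle. In this sketch buffer_print() and flush_buffer() are stand-ins invented for the example; they only imitate $m->print and $m->flush_buffer so the flushes can be counted.

```perl
use strict;
use warnings;

# Stand-ins for $m->print / $m->flush_buffer so the loop can run
# outside Mason; the counter records how often we would flush.
my $flushes = 0;
my $sent    = '';
my $buffer  = '';

sub buffer_print { $buffer .= $_[0] }
sub flush_buffer { $sent .= $buffer; $buffer = ''; $flushes++ }

# 25 lines of sample data read from an in-memory filehandle.
my $data = join '', map { "line $_\n" } 1 .. 25;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

while (<$fh>) {
    buffer_print($_);
    flush_buffer() unless $. % 10;    # true on lines 10 and 20
}
flush_buffer();                       # send the remaining 5 lines
close $fh;

print "flushed $flushes times\n";
```

With 25 lines the loop flushes on lines 10 and 20, and the final call sends whatever is left.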

User Authentication and Authorization

One problem that Web sites have to solve over and over again is user authentication and authorization. These two topics are related but not the same, as some might think. Authentication is the process of figuring out if someone is who they say they are, and usually involves checking passwords or keys. Authorization comes after this, when we want to determine whether a particular person is allowed to perform a certain action.

There are a number of modules on CPAN that are intended to help do these things under mod_perl. In fact, Apache has separate request-handling phases for both authentication and authorization that mod_perl can handle. It is certainly possible to use these modules with Mason.

You can also do authentication and authorization using Mason components. Authentication will usually involve some sort of request for a login and a password, after which you give the user some sort of token (either in a cookie or a session) that indicates that they have been authenticated. You can then check the validity of this token for each request.

If you have such a token, then authorization simply consists of checking that the user to whom the token belongs is allowed to perform a given action.

Using Apache::AuthCookie

The Apache::AuthCookie module, available from CPAN, is a module that handles both authentication and authorization via mod_perl and can be easily hooked into Mason. Rather than go through all the details of configuring Apache::AuthCookie, which requires various settings in your server config file, let’s just skip all that and show you how you’d make the interface to Mason.

Apache::AuthCookie requires that you create a “login script” that will be executed the first time a browser tries to access a protected area. Calling this a script is actually somewhat misleading since it is really a page rather than a script (though it could be a script that generates a page). Regardless, using a Mason component for your “login script” merely requires that you specify the path to your Mason component for the login script parameter.

We’ll call this “script” AuthCookieLoginForm.comp:

  <html>
  <head>
  <title>Mason Book AuthCookie Login Form</title>
  </head>
  <body>
  <p>
  Your attempt to access this document was denied
  (<% $r->prev->subprocess_env("AuthCookieReason") %>).  Please enter
  your username and password.
  </p>

  <form action="/AuthCookieLoginSubmit">
  <input type="hidden" name="destination" value="<% $r->prev->uri %>">
  <table align="left">
   <tr>
    <td align="right"><b>Username:</b></td>
    <td><input type="text" name="credential_0" size="10" maxlength="10"></td>
   </tr>
   <tr>
    <td align="right"><b>Password:</b></td>
    <td><input type="password" name="credential_1" size="8" maxlength="8"></td>
   </tr>
   <tr>
    <td colspan="2" align="center"><input type="submit" value="Continue"></td>
   </tr>
  </table>
  </form>

  </body>
  </html>

This component is a modified version of the example login script included with the Apache::AuthCookie distribution.

The action used for this form, /AuthCookieLoginSubmit, is configured as part of your AuthCookie configuration in your httpd.conf file.

That’s about all it takes to glue Apache::AuthCookie and Mason together. The rest of authentication and authorization is handled by configuring mod_perl to use Apache::AuthCookie to protect anything on your site that needs authorization. A very simple configuration might include the following directives:

  PerlSetVar MasonBookLoginScript /AuthCookieLoginForm.comp

  <Location /AuthCookieLoginSubmit>
    AuthType MasonBook::AuthCookieHandler
    AuthName MasonBook
    SetHandler  perl-script
    PerlHandler MasonBook::AuthCookieHandler->login
  </Location>

  <Location /protected>
    AuthType MasonBook::AuthCookieHandler
    AuthName MasonBook
    PerlAuthenHandler MasonBook::AuthCookieHandler->authenticate
    PerlAuthzHandler  MasonBook::AuthCookieHandler->authorize
    require valid-user
  </Location>

The MasonBook::AuthCookieHandler module would look like this:

  package MasonBook::AuthCookieHandler;

  use strict;

  use base qw(Apache::AuthCookie);

  use Digest::SHA1;

  my $secret = "You think I'd tell you?  Hah!";

  sub authen_cred {
      my $self = shift;
      my $r = shift;
      my ($username, $password) = @_;

      # implementing _is_valid_user() is out of the scope of this article
      if ( _is_valid_user($username, $password) ) {
          my $session_key =
            $username . '::' . Digest::SHA1::sha1_hex( $username, $secret );
          return $session_key;
      }
  }

  sub authen_ses_key {
      my $self = shift;
      my $r = shift;
      my $session_key = shift;

      my ($username, $mac) = split /::/, $session_key;

      if ( Digest::SHA1::sha1_hex( $username, $secret ) eq $mac ) {
          return $session_key;
      }
  }

This provides the minimal interface an Apache::AuthCookie subclass needs to provide to get authentication working.
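
The session key scheme used by these two methods can be exercised on its own. The sketch below swaps in the core Digest::SHA module, whose sha1_hex() computes the same digest as Digest::SHA1; the function names and the secret are placeholders invented for the example.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);    # core module; same digest as Digest::SHA1

my $secret = 'example secret';   # placeholder value

sub make_session_key {
    my $username = shift;
    return $username . '::' . sha1_hex( $username, $secret );
}

# Returns the username if the key is intact, or nothing otherwise.
sub check_session_key {
    my $session_key = shift;
    my ( $username, $mac ) = split /::/, $session_key, 2;
    return unless defined $mac;
    return $username if sha1_hex( $username, $secret ) eq $mac;
    return;
}

my $key = make_session_key('dave');
print check_session_key($key) ? "valid\n" : "invalid\n";

# Changing the username without recomputing the digest is detected.
( my $forged = $key ) =~ s/^dave/ken/;
print check_session_key($forged) ? "valid\n" : "invalid\n";
```

The key only proves that whoever issued it knew the secret, so everything depends on keeping that secret out of reach of users.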

Doing It My Way (Thanks Frank)

But what if you don’t want to use Apache::AuthCookie? For example, your site may need to work without using cookies. No doubt this was exactly what Frank Sinatra was thinking about when he sang “My Way,” so let’s do it our way.

First, we will show an example authentication system that only uses Mason and passes the authentication token around via the URL (actually, via a session).

This example assumes that we already have some sort of session system that passes the session id around as part of the URL, as discussed previously.

We start with a quick login form. We will call this component login_form.html:

  <%args>
   $username => ''
   $password => ''
   $redirect_to => ''
   @errors => ()
  </%args>
  <html>
  <head>
  <title>Mason Book Login</title>
  </head>

  <body>

  % if (@errors) {
  <h2>Errors</h2>
  %   foreach (@errors) {
  <b><% $_ | h %></b><br>
  %   }
  % }

  <form action="login_submit.html">
  <input type="hidden" name="redirect_to" value="<% $redirect_to %>">
  <table align="left">
   <tr>
    <td align="right"><b>Login:</b></td>
    <td><input type="text" name="username" value="<% $username %>"></td>
   </tr>
   <tr>
    <td align="right"><b>Password:</b></td>
    <td><input type="password" name="password" value="<% $password %>"></td>
   </tr>
   <tr>
    <td colspan="2" align="center"><input type="submit" value="Login"></td>
   </tr>
  </table>
  </form>

  </body>
  </html>

This form uses some of the same techniques we show in Chapter 8 (“Building a Mason Site”) of the book to pre-populate the form and to handle errors.

Now let’s make the component that handles the form submission. This component, called login_submit.html, will check the username and password and, if they are valid, place an authentication token into the user’s session:

  <%args>
   $username
   $password
   $redirect_to
  </%args>
  <%init>
   if (my @errors = check_login($username, $password)) {
       $m->comp( 'redirect.mas',
                  path => 'login_form.html',
                  query => { errors => \@errors,
                             username => $username,
                             password => $password,
                             redirect_to => $redirect_to } );
   }
 
   $MasonBook::Session{username} = $username;
   $MasonBook::Session{token} =
       Digest::SHA1::sha1_hex( 'My secret phrase', $username );
 
   $m->comp( 'redirect.mas',
             path => $redirect_to );
  </%init>

This component simply checks (via magic hand waving) that the username and password are valid and if they are, it generates an authentication token, which is added to the user’s session. To generate this token, we take the username, which is also in the session, and combine it with a secret phrase. We then generate a MAC from those two things.
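
The test at the top of the component relies on a list assignment being true only when the list is non-empty, so check_login() just has to return a list of error messages. Here is one hypothetical shape for that hand-waved routine; the account table and the messages are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hand-waved check_login(): returns a
# list of error messages, which is empty when the login is acceptable.
my %password_for = ( dave => 'mason' );    # made-up test account

sub check_login {
    my ( $username, $password ) = @_;
    my @errors;

    push @errors, 'Username is required' unless length $username;
    push @errors, 'Password is required' unless length $password;

    if ( !@errors ) {
        my $expected = $password_for{$username};
        push @errors, 'Invalid username or password'
            unless defined $expected && $expected eq $password;
    }

    return @errors;
}

if ( my @errors = check_login( 'dave', 'wrong' ) ) {
    print "Login failed: @errors\n";
}
else {
    print "Login ok\n";
}
```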

The authentication and authorization check looks like this:

  if ( $MasonBook::Session{token} ) {
      if ( $MasonBook::Session{token} eq
           Digest::SHA1::sha1_hex( 'My secret phrase',
                                   $MasonBook::Session{username} ) ) {

          # ... valid login, do something here
      } else {
          # ... someone is trying to be sneaky!
      }
  } else { # no token
       my $wanted_page = $r->uri;
       
       # Append query string if we have one.
       $wanted_page .= '?' . $r->args if $r->args;

       $m->comp( 'redirect.mas',
                  path => '/login/login_form.html',
                  query => { redirect_to => $wanted_page } );
  }

We could put all the pages that require authorization in a single directory tree and have a top-level autohandler in that tree do the check. If there is no token to check, then we redirect the browser to the login page; after a successful login, the browser is sent back to the page originally requested.

Access Controls With Attributes

The components we saw previously assumed that there are only two access levels, unauthenticated and authenticated. A more complicated version of this code might involve checking that the user has a certain access level or role.

In that case, we’d first check that we had a valid authentication token and then go on to check that the user actually had the appropriate access rights. This is simply an extra step in the authorization process.

Using attributes, we can easily define access controls for different portions of our site. Let’s assume that we have four access levels, “Guest,” “User,” “Editor” and “Admin.” Most of the site is public, and viewable by anyone. Some parts of the site require a valid login, while some require a higher level of privilege.

We implement our access check in our top-level autohandler, from which all other components must inherit in order for the access control code to be effective.

  <%init>
   my $user = get_user();  # again, hand waving
 
   my $required_access = $m->base_comp->attr('required_access');
 
   unless ( $user->has_access_level($required_access) ) {
      # ... do something like send them to another page
   }
 
   $m->call_next;
  </%init>
  <%attr>
   required_access => 'Guest'
  </%attr>

It is crucial that we set a default access level in this autohandler. By doing this, we are saying that by default, all components are accessible by all people, since every visitor will have at least “Guest” access.

We can override this default elsewhere. For example, in a component called /admin/autohandler, we might have:

  <%attr>
   required_access => 'Admin'
  </%attr>

As long as all the components in the /admin directory inherit from this autohandler and don’t override the required_access attribute, we have effectively limited that directory (and its subdirectories) to administration users only. If we, for some reason, had an individual component in the /admin directory that we wanted editors to be able to see, we could simply set the required_access attribute for that component to “Editor.”
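
The has_access_level() method is left to the imagination above. One possible sketch, assuming the four levels form a strict ordering from Guest up to Admin, looks like this; the MasonBook::User class and its constructor are hypothetical.

```perl
use strict;
use warnings;

package MasonBook::User;    # hypothetical class name

# Access levels in increasing order of privilege.
my %rank = ( Guest => 0, User => 1, Editor => 2, Admin => 3 );

sub new {
    my ( $class, %args ) = @_;
    return bless { level => $args{level} || 'Guest' }, $class;
}

# True if this user's level is at least the required one.
sub has_access_level {
    my ( $self, $required ) = @_;
    die "Unknown access level: $required" unless exists $rank{$required};
    return $rank{ $self->{level} } >= $rank{$required};
}

package main;

my $editor = MasonBook::User->new( level => 'Editor' );
print "Editor may see User pages\n"  if $editor->has_access_level('User');
print "Editor may see Admin pages\n" if $editor->has_access_level('Admin');
```

If the roles on a site do not nest this neatly, a per-user set of roles would be a better fit than a single rank.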

Managing DBI Connections

Not infrequently, we see people on the Mason users list asking questions about how to handle caching DBI connections.

Our recipe for this is really simple:

  use Apache::DBI;

Rather than reinventing the wheel, use Apache::DBI, which provides the following features:

It is completely transparent to use. Once you’ve loaded it, you simply call DBI->connect() as always and Apache::DBI gives you an existing handle if one is available.
It makes sure that the handle is live, so that if your RDBMS goes down and then back up, your connections still work just fine.
It does not cache handles made before Apache forks, as many DBI drivers do not support using a handle after a fork.
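
To illustrate the first two features, here is a toy version of connection caching with a liveness check. It is not Apache::DBI's code: My::FakeHandle and cached_connect() are invented so the idea can run without a database, and only the ping() method name mirrors the real DBI handle interface.

```perl
use strict;
use warnings;

package My::FakeHandle;    # hypothetical stand-in for a DBI handle
sub new  { my $class = shift; return bless { alive => 1 }, $class }
sub ping { return $_[0]{alive} }

package main;

my %cache;
my $connects = 0;

# Return a cached handle for these connection arguments if it still
# answers ping(); otherwise create a new one and cache it.
sub cached_connect {
    my @args = @_;
    my $key  = join "\0", @args;

    my $handle = $cache{$key};
    return $handle if $handle && $handle->ping;

    $connects++;
    return $cache{$key} = My::FakeHandle->new;
}

my $first  = cached_connect( 'dbi:Fake:test', 'user', 'pw' );
my $second = cached_connect( 'dbi:Fake:test', 'user', 'pw' );
$second->{alive} = 0;    # simulate the database going away
my $third  = cached_connect( 'dbi:Fake:test', 'user', 'pw' );

print "connections opened: $connects\n";
```

The second call reuses the first handle, while the third notices the dead handle and opens a replacement.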

Generating Config Files

Config files are a good candidate for generation by Mason. For example, your production and staging Web server config files might differ in only a few areas. Changes to one usually will need to be propagated to another. This is especially true with mod_perl, where Web server configuration can basically be part of a Web-based application.

On top of this, you may decide to set up a per-developer environment, either by having each developer run the necessary software on their own machine, or by starting Web servers on many different ports on a single development server. In this scenario, a template-driven config file generator becomes even more appealing.

Here’s a simple script to drive this generation. This script assumes that all the processes are running on one shared development machine.

  #!/usr/bin/perl -w

  use strict;

  use Cwd;
  use File::Spec;
  use HTML::Mason;
  use User::pwent;

  my $comp_root =
      File::Spec->rel2abs( File::Spec->catfile( cwd(), 'config' ) );

  my $output;
  my $interp =
      HTML::Mason::Interp->new( comp_root  => $comp_root,
                                out_method => \$output,
                              );

  my $user = getpwuid($<);

  $interp->exec( '/httpd.conf.mas', user => $user );

  my $file =  File::Spec->catfile( $user->dir, 'etc', 'httpd.conf' );
  open FILE, ">$file" or die "Cannot open $file: $!";
  print FILE $output;
  close FILE;

An httpd.conf.mas component might look like this:

  ServerRoot <% $user->dir %>

  PidFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.pid' ) %>

  LockFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.lock' ) %>

  Port <% $user->uid + 5000 %>

  # loads Apache modules, defines content type handling, etc.
  <& standard_apache_config.mas &>

  <Perl>
   use lib '<% File::Spec->catfile( $user->dir, 'project', 'lib' ) %>';
  </Perl>

  DocumentRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>

  PerlSetVar MasonCompRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>
  PerlSetVar MasonDataDir <% File::Spec->catfile( $user->dir, 'mason' ) %>

  PerlModule HTML::Mason::ApacheHandler

  <FilesMatch "\.html$">
   SetHandler perl-script
   PerlHandler HTML::Mason::ApacheHandler
  </FilesMatch>

  <%args>
  $user
  </%args>

This points the server’s document root to the developer’s working directory. Similarly, it adds the project/lib directory to Perl’s @INC via use lib so that the user’s working copy of the project’s modules is seen first. The server will listen on a port equal to the user’s user ID plus 5000.

Obviously, this is an incomplete example. It doesn’t specify where logs will go, or other necessary config items. It also doesn’t handle generating the config file for a server intended to be run by the root user on a standard port.
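
One small step toward the root case would be to choose the port in a helper rather than in the template. This is only a sketch: port_for_uid() is a made-up name, and it assumes that the standard port wanted for root is 80.

```perl
use strict;
use warnings;

# Pick the port for a given user ID: root gets the standard HTTP port
# (assumed to be 80), everyone else gets the per-developer port used
# in the template above.
sub port_for_uid {
    my $uid = shift;
    return $uid == 0 ? 80 : $uid + 5000;
}

print port_for_uid(0),   "\n";
print port_for_uid(501), "\n";
```

Note that user IDs above 60535 would yield a number beyond the valid port range, so a shared machine with large UIDs would need a different mapping.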

If You Want More …

These recipes were adapted from Chapter 11, “Recipes,” of Embedding Perl in HTML With Mason. And, of course, the book contains a lot more than just recipes. If you’re interested in learning more about Mason, the book is a great place to start.

Also, don’t forget to check out the Mason HQ site at www.masonhq.com/, which contains online documentation, user-contributed code and docs, and links to the Mason users mailing list, which is another great resource for developers using Mason.

O’Reilly & Associates recently released (October 2002) Embedding Perl in HTML with Mason.

Sample Chapter 5, Advanced Features, is available free online.