Introducing mod_parrot
by Jeff Horwitz, December 22, 2004
It's been almost nine years since the first release of mod_perl, and it remains a very powerful tool for writing web applications and extending the capabilities of the Apache web server. However, lurking around the corner is Perl 6, which gives us not only a new version of Perl to embed in Apache but an entirely new runtime engine called Parrot. If there is ever going to be a Perl 6 version of mod_perl, Apache must first be able to run Parrot bytecode. This article introduces mod_parrot, an Apache module that allows the execution of Parrot bytecode from within the web server. Like mod_perl, it also gives your code direct access to the Apache API so you can write your own handlers.
What is Parrot?
Parrot is a virtual machine (VM) optimized for dynamic languages like Perl, Python, PHP, and Ruby. Source code written in each of these languages eventually compiles down to bytecode (after some optimizations), which subsequently runs in a virtual machine. Currently, each language runs bytecode with its own VM, but one of Parrot's goals is to provide a single common VM for all dynamic languages. This makes implementing a new language much easier because there's no need to worry about writing a new VM, and this also makes it possible for code in one language to call code or access data structures from another language.
Parrot code comes in three distinct flavors:
- Bytecode: This is the file format natively interpreted by Parrot.
- PASM: Parrot assembler (PASM) is the low-level language that compiles down to bytecode. It has very simple operations to perform functions such as setting registers, adding numbers, and printing strings. PASM is very straightforward, but it operates at such a low level that it can be quite cumbersome.
- PIR: Parrot Intermediate Representation (PIR) solves many of the problems encountered when programming in PASM. It provides more user-friendly and compiler-friendly constructs and optimizations and feels more like a traditional high-level programming language. Parrot eventually breaks down PIR into PASM before compiling to bytecode (you can even include PASM blocks in PIR). All of the examples in this article use PIR.
For more information on Parrot, including PASM and PIR syntax, visit the Parrot website. It will provide a good background for understanding the code in this article.
Why mod_parrot?
Before discussing the details, you should know a little about mod_parrot's history. Ask Björn Hansen and Robert Spier originally wrote mod_parrot in 2002, later turning it over to Kevin Falcone. This version of mod_parrot targeted Apache 1.3 and had very limited functionality due to Parrot's immaturity at the time. In August 2004, with Parrot and its API much more mature, people suggested that the development on mod_parrot continue. This is where I picked up the project. However, instead of picking up where Ask, Robert, and Kevin left off, I started from scratch, coding for Apache 2 and focusing on access to the Apache API.
The new mod_parrot project has three primary goals:
- Provide access to the Apache API through Parrot objects
- Provide a common Apache layer for Parrot-based languages
- Support for new languages should require little or no C coding
Let's discuss each of these in more detail.
Provide Access to the Apache API Through Parrot
Much of mod_perl's power comes from direct access to the Apache API. Rather than restrict your code to content generation, mod_perl provides hooks for things such as authentication handlers and output filters and gives you access to Apache's internal structures, all in Perl. Once you have this functionality, it is easy to implement other useful features including script caching and persistent database connections.
mod_parrot shares this approach, providing access to the Apache API from
Parrot. It does this using Parrot objects, mimicking mod_perl's use of
$r. There will eventually be hooks for all phases of the Apache
lifecycle, though the current version supports only content handlers and
authentication handlers.
Provide a Common Apache Layer for Parrot-based Languages
Several different languages can run inside Apache today. The major players here are mod_perl and PHP, but Python, Ruby, and even LISP have modules embedding them into Apache. Each of these implementations comes with its own Apache module, which makes sense for languages with different runtime engines. This is where Parrot changes the landscape dramatically. All languages targeted to the Parrot VM now have a common runtime engine, so they need only one Apache module: mod_parrot.
Support for New Languages Should Require Little or No C Coding
mod_parrot will provide all of the infrastructure for accessing the Apache API. The actual Apache module will already be written. Hooks for calling Parrot code for each stage of the Apache lifecycle will exist. Parrot objects will provide access to the Apache API. With all of this already done, and assuming our language compiles down to Parrot bytecode, we should be able to write the "glue" between Apache and our language in the language itself. mod_perl could be written in Perl; mod_python could be written in Python, and so on. Very little C code, if any, would be necessary. Each language community could maintain its own language-specific code while sharing the mod_parrot module.
Architecture
mod_parrot is written for Apache 2, with no plans to back-port it to Apache 1.3. The reason behind this decision is to code for the future, not the past or present; after all, Perl 6 is still a few years down the road. It's also much easier to write a module for Apache 2 than it is for 1.3! In addition to the Apache 2 decision, there are several other interesting aspects of the mod_parrot architecture.
NCI
The most significant design decision is the use of NCI (native call
interface) to access the Apache API. mod_perl accesses most of the Apache API
functions through individual XS wrappers (basically a bunch of C macros),
which are compiled into mod_perl itself or its supporting modules. This is a
tried and true method, used for many Perl modules as well. Now, Parrot gives us
NCI, which eliminates the need for these wrappers, letting you call arbitrary C
functions without having to write any C code. Here's an example of a Parrot
program that calls the C function getpid(), which returns the
current process ID:
.sub _main
    # load libc.so, where getpid() is defined, and assign it to $P0
    $P0 = loadlib '/lib/libc.so.6'

    # find the function in the library and assign it to $P1
    # 'iv' means that getpid() returns an integer and takes no arguments
    $P1 = dlfunc $P0, 'getpid', 'iv'

    # call getpid() and place result in $I0
    $I0 = $P1()

    # print the PID
    print $I0
    print "\n"
.end
That's it: there is no C code to write, no recompilation, and no relinking.
However, the Apache API functions do not come from a loadable shared library;
they're in the Apache executable, httpd. Fortunately, NCI can run
C functions contained in the running process image, solving that problem. For
more information on NCI, see the Parrot NCI
Documentation.
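The idea of looking up a C function in the running process image, rather than in a named library file, can also be sketched outside Parrot. The following is not Parrot code. It is a rough analogy using Python's ctypes module, and it assumes a POSIX system where passing None to ctypes.CDLL yields a handle to the current process. Setting restype and argtypes plays roughly the role of the 'iv' signature string in the PIR example.

```python
import ctypes
import os

# Assumption: POSIX system. CDLL(None) asks the dynamic loader for a handle
# to the running process itself instead of a named shared library.
process = ctypes.CDLL(None)

# Look up getpid() by name in the process image.
getpid = process.getpid

# Declare the signature: returns an int, takes no arguments.
# (pid_t is assumed to be a C int here, which holds on common platforms.)
getpid.restype = ctypes.c_int
getpid.argtypes = []

# Call the C function and print the result.
pid = getpid()
print(pid)
```

The value printed should match what os.getpid() reports for the same process, since both end up calling the same C function.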
The Apache::RequestRec Object
All access to the Apache API goes through Parrot objects. Because
mod_parrot borrows heavily from mod_perl, it made sense to base the primary
object class in mod_parrot on Apache's request_rec structure. Just
as in mod_perl, the class is Apache::RequestRec. This name is
subject to change, however, as Parrot's namespace nomenclature becomes
clearer.
Every method is written in Parrot and makes NCI calls to its corresponding
Apache API function. For example, here is the Parrot method for the
ap_rputs function ($r->puts in mod_perl):
.sub puts method, prototyped
    .param string data
    .local pmc r
    .local pmc ap_rputs
    .local int offset

    classoffset offset, self, 'Apache::RequestRec'
    getattribute r, self, offset

    # find NCI object for ap_rputs
    find_global ap_rputs, 'Apache::NCI', 'ap_rputs'

    # use NCI to call out to Apache's ap_rputs
    ap_rputs( data, r )
.end
Currently, Apache::RequestRec is the only class implemented in
mod_parrot. Other classes to support the API will eventually appear, including
classes to support Apache's conn_rec and server_rec
structures.
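The pattern in the puts method (fetch the wrapped request pointer from the object, look up a callable by name, then call it with the data and the pointer) can be sketched in Python as well. Everything here is a hypothetical stand-in: FakeRequest, fake_ap_rputs, and NCI_FUNCS are illustrative names, not part of mod_parrot or Apache, and the real ap_rputs writes to the client rather than to a list.

```python
class FakeRequest:
    """Hypothetical stand-in for Apache's request_rec structure."""

    def __init__(self):
        self.output = []  # collects everything "sent" to the client


def fake_ap_rputs(data, r):
    """Hypothetical stand-in for ap_rputs: record the data, return its length."""
    r.output.append(data)
    return len(data)


# Hypothetical stand-in for the 'Apache::NCI' namespace: name -> callable.
NCI_FUNCS = {"ap_rputs": fake_ap_rputs}


class RequestRec:
    """Wraps a request handle and exposes API functions as methods."""

    def __init__(self, r):
        self._r = r  # the wrapped request handle, like the object attribute in PIR

    def puts(self, data):
        # Find the callable by name, then call it with the data and the handle.
        ap_rputs = NCI_FUNCS["ap_rputs"]
        return ap_rputs(data, self._r)


# Usage: wrap a request and write a string through the method.
request = FakeRequest()
r = RequestRec(request)
r.puts("Hello from a wrapped request\n")
print("".join(request.output), end="")
```

Keeping the lookup inside the method mirrors the find_global call in the PIR version. Classes for other structures could follow the same shape with a different wrapped handle.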

