Parrot: Some Assembly Required

by Simon Cozens
September 18, 2001

Last week, the first public version of Parrot was released. This week, we're going to take a close look at what Parrot is, how you can get hold of it and play with it, and what we intend for Parrot in the future.

What Is Parrot?

First, though, what is Parrot, and why are we making such a fuss about it? Well, if you haven't been living in a box for the past year, you'll know that the Perl community has embarked on the design and implementation of a new version of Perl, both the language and the interpreter.

Parrot is strongly related to Perl 6, but it is not Perl 6. To find out what it actually is, we need to know a little about how Perl works. When you feed your program into perl, it is first compiled into an internal representation, or bytecode; then this bytecode is fed to an almost separate subsystem inside perl to be interpreted. So there are two distinct phases of perl's operation: compilation to bytecode, and interpretation of bytecode. This is not unique to Perl; other languages following this design include Python, Ruby, Tcl and, believe it or not, even Java.
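
Python, one of the languages just mentioned, lets you drive the two phases by hand, so it makes a convenient illustration. The sketch below is only meant to show the compile-then-interpret split; the source string and the "<demo>" filename are invented for the example, and perl's own internals are organised differently.

```python
# Phase 1: compile source text into a bytecode object (nothing runs yet).
source = "total = 2 + 3"
code = compile(source, "<demo>", "exec")

# The compiled instructions are plain bytes waiting for an interpreter.
assert isinstance(code.co_code, bytes)

# Phase 2: hand the bytecode to the interpreter to be executed.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

Nothing is computed until the second phase; the compiler's only job is to produce instructions for the interpreter.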

In previous versions of Perl, this arrangement has been pretty ad hoc: There hasn't been any overarching design to the interpreter or the compiler, and the interpreter has ended up being pretty reliant on certain features of the compiler. Nevertheless, the interpreter (some languages call it a Virtual Machine) can be thought of as a software CPU - the compiler produces "machine code" instructions for the virtual machine, which it then executes, much like a C compiler produces machine code to be run on a real CPU.

Perl 6 plans to separate the design of the compiler and the interpreter. This is why we've come up with a subproject, which we've called Parrot, that has a certain, limited amount of independence from Perl 6. Parrot is destined to be the Perl 6 Virtual Machine, the software CPU on which we will run Perl 6 bytecode. We're working on Parrot before we work on the Perl 6 compiler because it's much easier to write a compiler once you have a target to compile to!

The name "Parrot" was chosen after this year's April Fool's Joke, which had Perl and Python collaborating on the next version of their interpreters. This is meant to reflect the idea that we'd eventually like other languages to use Parrot as their VM; in a sense, we'd like Parrot to become a "common language runtime" for dynamic languages.

Where We're At

After the release last Monday, we've seen a huge amount of activity on the development list, with more than 100 CVS commits in the past week. However, it should be stressed we're still in the early stages of development.

But don't let that put you off! Parrot is still very much usable; we've already seen one mini-language emerge that compiles down to Parrot bytecode (more on that later) and Leon Brocard has been working on automatically converting Java bytecode to Parrot.

At the moment, it's possible to write simple programs in Parrot assembly language, use an assembler to convert them to bytecode and then execute them on a test interpreter. We have support for a wide variety of ordinary and transcendental mathematical operations, some rudimentary string support and some conditional operators.

How to Get It

So let's get ourselves a copy of Parrot, so that we can start investigating how to program in the Parrot assembler.

We could get the initial release from CPAN, but an awful lot has changed since then. To really keep up to date with Parrot, we should get our copy from the CVS repository. Here's how we do that:

% cvs -d :pserver:anonymous@cvs.perl.org:/home/perlcvs login
(Logging in to anonymous@cvs.perl.org)
CVS password: [ and here we just press return ]
% cvs -d :pserver:anonymous@cvs.perl.org:/home/perlcvs co parrot
cvs server: Updating parrot
U parrot/.cvsignore
U parrot/Config_pm.in
....
    

For those of you who can't use CVS, there are CVS snapshots built every six hours that you can find here.

Now that we have downloaded Parrot, we need to build it:

% cd parrot
% perl Configure.pl
Parrot Configure
Copyright (C) 2001 Yet Another Society

Since you're running this script, you obviously have
Perl 5 -- I'll be pulling some defaults from its configuration.
...
    
You'll then be asked a series of questions about your local configuration; you can almost always hit return for each one. Finally, you'll be told to type make test_prog; with any luck, Parrot will successfully build the test interpreter. (If it doesn't, the address to complain to is at the end of the article ...)

The Test Suite

Now we should run some tests; so type make test and you should see a readout like the following:

perl t/harness
t/op/basic.....ok, 1/2 skipped:  label constants unimplemented in assembler
t/op/string....ok, 1/4 skipped:  I'm unable to write it!
All tests successful, 2 subtests skipped.
Files=2, Tests=6,  2 wallclock secs ( 1.19 cusr +  0.22 csys =  1.41 CPU)
    

(Of course, by the time you read this, there could be more tests, and some of those which skipped might not skip - but none of them should fail!)

Parrot Concepts

Before we dive into programming Parrot assembly, let's take a brief look at some of the concepts involved.

Types

The Parrot CPU has four basic data types:

IV
An integer type; guaranteed to be wide enough to hold a pointer.
NV
An architecture-independent floating-point type.
STRING
An abstracted, encoding-independent string type.
PMC
A scalar.

The first three types are pretty much self-explanatory; the final type, the Parrot Magic Cookie (PMC), is slightly more difficult to understand. But that's OK, because PMCs are not actually implemented yet! We'll talk more about them at the end of the article.

Registers

The current Perl 5 virtual machine is a stack machine - it communicates values between operations by keeping them on a stack. Operations load values onto the stack, do whatever they need to do and put the result back onto the stack. This is easy to work with, but it's slow: To add two numbers together, you need to perform three stack pushes and two stack pops. Worse, the stack has to grow at runtime, and that means allocating memory just when you don't want to be allocating it.
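
To make that count concrete, here is a small Python sketch of a stack that tallies its own traffic. The class and function names are invented for this illustration and have nothing to do with perl's real internals.

```python
class CountingStack:
    """A value stack that counts how often it is pushed and popped."""

    def __init__(self):
        self.items = []
        self.pushes = 0
        self.pops = 0

    def push(self, value):
        self.items.append(value)
        self.pushes += 1

    def pop(self):
        self.pops += 1
        return self.items.pop()


def add_on_stack(stack, a, b):
    stack.push(a)               # load the first operand
    stack.push(b)               # load the second operand
    right = stack.pop()         # the add op takes both operands off...
    left = stack.pop()
    stack.push(left + right)    # ...and puts the result back
    return stack.items[-1]


stack = CountingStack()
result = add_on_stack(stack, 2, 3)
print(result, stack.pushes, stack.pops)
```

One addition costs three pushes and two pops, which is the overhead a register design tries to avoid.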

So Parrot's going to break with the established tradition for virtual machines, and use a register architecture, more akin to the architecture of a real hardware CPU. This has another advantage: We can use all the existing literature on how to write compilers and optimizers for register-based CPUs for our software CPU!

Parrot has specialist registers for each type: 32 IV registers, 32 NV registers, 32 string registers and 32 PMC registers. In Parrot assembler, these are named I1...I32, N1...N32, S1...S32, P1...P32.

Now let's look at some assembler. We can set these registers with the set operator:

    set I1, 10
    set N1, 3.1415
    set S1, "Hello, Parrot"
    

All Parrot ops have the same format: the name of the operator, the destination register and then the operands.
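
As a rough illustration of that format, the following Python sketch splits an assembler line into operator, destination and operands, and then carries out set against a dictionary standing in for the register file. The helper names (parse_op, constant, run_set) are hypothetical, and the real assembler certainly does more checking than this.

```python
import re

def parse_op(line):
    """Split 'name dest, operand, ...' into (name, dest, operands)."""
    name, _, rest = line.strip().partition(" ")
    # A token is either a double-quoted string or a run of non-comma,
    # non-space characters, so commas inside quotes are left alone.
    tokens = re.findall(r'"[^"]*"|[^\s,]+', rest)
    return name, tokens[0], tokens[1:]

def constant(token):
    """Turn a constant token into a Python value."""
    if token.startswith('"'):
        return token[1:-1]
    return float(token) if "." in token else int(token)

registers = {}  # register name -> current value

def run_set(line):
    name, dest, operands = parse_op(line)
    if name != "set":
        raise ValueError(f"expected set, got {name}")
    registers[dest] = constant(operands[0])

for line in ['set I1, 10', 'set N1, 3.1415', 'set S1, "Hello, Parrot"']:
    run_set(line)

print(registers)
```

Keeping the destination first means that every op can be decoded the same way, whatever it goes on to do with its operands.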

Operations

There are a variety of operations you can perform: the file docs/parrot_assembler.pod documents them, along with a little more about the assembler syntax. For instance, we can print out the contents of a register or a constant:

    print "The contents of register I1 is: "
    print I1
    print "\n"
    

Or we can perform mathematical functions on registers:

    add I1, I1, I2  # Add the contents of I2 to the contents of I1
    mul I3, I2, I4  # Multiply I2 by I4 and store in I3
    inc I1          # Increment I1 by one
    dec N3, 1.5     # Decrement N3 by 1.5
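
If you'd like to check your reading of those four ops without building Parrot, they can be mimicked in a few lines of Python. The starting register values below are invented so that the ops have something to work on, and the functions are stand-ins rather than Parrot's implementation.

```python
# Invented starting values for the registers used in the listing.
reg = {"I1": 10, "I2": 3, "I4": 4, "N3": 5.0}

def add(dest, a, b):
    reg[dest] = reg[a] + reg[b]

def mul(dest, a, b):
    reg[dest] = reg[a] * reg[b]

def inc(dest, amount=1):
    reg[dest] += amount

def dec(dest, amount=1):
    reg[dest] -= amount

add("I1", "I1", "I2")   # I1 = 10 + 3
mul("I3", "I2", "I4")   # I3 = 3 * 4
inc("I1")               # I1 = 13 + 1
dec("N3", 1.5)          # N3 = 5.0 - 1.5

print(reg["I1"], reg["I3"], reg["N3"])
```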
    

We can even perform some simple string manipulation:

    set S1, "fish"
    set S2, "bone"
    concat S1, S2       # S1 is now "fishbone"
    set S3, "w"
    substr S4, S1, 1, 7
    concat S3, S4       # S3 is now "wishbone"
    length I1, S3       # I1 is now 8
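
The same sequence can be traced in Python. This sketch assumes that substr takes a zero-based offset followed by a length, which is what the comments in the listing imply; the substr function here is our own stand-in.

```python
def substr(text, offset, length):
    """Assumed semantics: `length` characters from a zero-based `offset`."""
    return text[offset:offset + length]

s1 = "fish"
s2 = "bone"
s1 = s1 + s2             # concat S1, S2
s3 = "w"
s4 = substr(s1, 1, 7)    # substr S4, S1, 1, 7
s3 = s3 + s4             # concat S3, S4
i1 = len(s3)             # length I1, S3

print(s1, s4, s3, i1)
```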
    

Branches

Code gets a little boring without flow control; for starters, Parrot knows about branching and labels. The branch op is equivalent to Perl's goto:

 
         branch TERRY
JOHN:    print "fjords\n"
         branch END
MICHAEL: print " pining"
         branch GRAHAM
TERRY:   print "It's"
         branch MICHAEL
GRAHAM:  print " for the "
         branch JOHN
END:     end
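
To follow the jumps without running Parrot, here is a toy Python interpreter for just the three ops used above. It is a sketch with an invented program representation (label, op, argument); the real interpreter executes bytecode, not tuples.

```python
import io

# The listing above as (label, op, argument) rows; None means "no label".
program = [
    (None,      "branch", "TERRY"),
    ("JOHN",    "print",  "fjords\n"),
    (None,      "branch", "END"),
    ("MICHAEL", "print",  " pining"),
    (None,      "branch", "GRAHAM"),
    ("TERRY",   "print",  "It's"),
    (None,      "branch", "MICHAEL"),
    ("GRAHAM",  "print",  " for the "),
    (None,      "branch", "JOHN"),
    ("END",     "end",    None),
]

def run(program):
    # Map each label to the index of the instruction it marks.
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    out = io.StringIO()
    pc = 0  # program counter: index of the next instruction
    while True:
        _, op, arg = program[pc]
        if op == "end":
            return out.getvalue()
        if op == "print":
            out.write(arg)
            pc += 1
        elif op == "branch":
            pc = labels[arg]
        else:
            raise ValueError(f"unknown op: {op}")

output = run(program)
print(output, end="")
```

The only thing branch does is overwrite the program counter, which is why the fragments come out in sentence order rather than listing order.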
    

It can also perform simple tests to see whether a register contains a true value:

         set I1, 12
         set I2, 5
         mod I3, I1, I2
         if I3, REMAIND, DIVISOR
REMAIND: print "5 divides 12 with remainder "
         print I3
         branch DONE
DIVISOR: print "5 is an integer divisor of 12"
DONE:    print "\n"
         end
    

Here's what that would look like in Perl, for comparison:

    $i1 = 12;
    $i2 = 5;
    $i3 = $i1 % $i2;
    if ($i3) {
      print "5 divides 12 with remainder ";
      print $i3;
    } else {
      print "5 is an integer divisor of 12";
    }
    print "\n";
    exit;
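
The three-operand form of if can also be pictured as a function that picks one of two labels. The Python sketch below assumes that any non-zero value counts as true and sticks to positive integers; if_op and describe are names made up for the illustration.

```python
def if_op(value, true_label, false_label):
    """Pick the label to jump to; non-zero is assumed to count as true."""
    return true_label if value != 0 else false_label

def describe(dividend, divisor):
    remainder = dividend % divisor
    target = if_op(remainder, "REMAIND", "DIVISOR")
    if target == "REMAIND":
        return f"{divisor} divides {dividend} with remainder {remainder}\n"
    return f"{divisor} is an integer divisor of {dividend}\n"

print(describe(12, 5), end="")
print(describe(10, 5), end="")
```

Trying a second pair such as 10 and 5 exercises the DIVISOR path, which the example in the text never reaches.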
    

And speaking of comparison, we have the full range of numeric comparators: eq, ne, lt, gt, le and ge. Note that you can't use these operators on arguments of disparate types; you may even need to add the suffix _i or _n to the op to tell it what type of argument you are using - although the assembler ought to divine this for you, by the time you read this.
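
As a way of picturing the rule about disparate types, here is a Python sketch that maps the six comparator names onto Python's operator module and refuses mixed operands. It makes no claim about the operand order or label syntax of the real ops; the compare function is hypothetical.

```python
import operator

COMPARATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "gt": operator.gt,
    "le": operator.le, "ge": operator.ge,
}

def compare(op, left, right):
    """Apply a named comparator, refusing operands of different types."""
    if type(left) is not type(right):
        raise TypeError(f"{op}: cannot compare {type(left).__name__} "
                        f"with {type(right).__name__}")
    return COMPARATORS[op](left, right)

print(compare("lt", 3, 5))
print(compare("ge", 2.5, 2.5))
```

Calling compare("eq", 1, 1.0) raises TypeError here, which is the toy counterpart of having to choose between an _i and an _n variant.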
