For Perl Programmers : only

Have you ever wished that you could have more than one version of a Perl module installed on your system, and that you could easily tell Perl exactly which one you wanted to load? Perhaps you have some legacy programs that only run with older versions of certain modules, while the rest of your system is coded to the more-modern versions of the same module. Or maybe you administer a multiuser system, where your various users request different module versions.

Perl’s built-in mechanism for module version control is very rudimentary. Basically, you can say:

    use XML::Parser 2.27;

The use command will load XML::Parser and then proceed to die unless $XML::Parser::VERSION is greater than or equal to 2.27. There is no way to ask for one specific version or a range of versions. Most of the time this is OK. It’s OK if you assume that successive versions of modules always improve, never drop functionality and are fully backward compatible. Unfortunately, in real life, this isn’t always true.

Take Lincoln Stein’s excellent GD module for example. GD.pm is a toolset for creating graphic images from Perl programs. Up to version 1.19, GD supported the GIF format. Due to GIF’s licensing restrictions, the author was forced to retract support. GIF support was replaced by the PNG format, and later JPEG support was added. If you are required to use GD for both GIF and PNG format types, then you need to do some monkeying around with library paths to get it to work.

With big projects in restrictive environments, dealing with module versioning issues can quickly become a tangled briar patch. When your needs for using specific modules become too thorny, you need a sharper chainsaw. This article describes such a saw.

The Only Way

Introducing only.pm! only is a full featured system for installing and loading multiple versions of Perl modules.

To make sure that your program loads version 1.19 of GD, simply say:

    use only GD => 1.19;

If you know that any of the GD versions from 2.01 up to 2.06 are OK, then say:

    use only GD => '2.01-2.06';

If you also want to import GD::foo and GD::bar, then:

    use only GD => '2.01-2.06', qw(foo bar);

only acts as an extension to Perl’s use command. It intercepts the parameters that you would normally pass to use along with an acceptable version range. It takes all of that information and does a “heap of bit-wrangling” to make sure that you get the version of the module you wanted. In every other respect, it tries to act like a regular use statement.

Easy Installations

How do you go about installing several versions of Perl modules? Perl is really only designed to support the installation of a single module version. Whenever you install a new version of a module, it simply overwrites the older version. This is known as upgrading. Usually that’s what you want, but in the context of only, you actually want to have multiple versions installed simultaneously.

Advanced Perl programmers know that is possible to install modules into different directories. You can supply a PREFIX= parameter to the make install process. But then you have to remember where you installed the module, and manually adjust @INC to have the right sequence of paths. That’s just no fun.

Fortunately, only makes it extremely simple to install multiple versions. To start with, just go about building your module in the usual way:

    tar xvzf Cat-Fancy-0.09.tar.gz
    cd Cat-Fancy-0.09
    perl Makefile.PL
    make test

Normally, the next step is make install. But if you want to store version 0.09 separate from any other version of Cat::Fancy, do this:

    perl -Monly=install

This actually does the same procedure as make install would except that it automatically determines a safe place to stick your module; one that only.pm can find when you say:

    use only Cat::Fancy => 0.09;

Note: If you need authorized access to install modules on your system, then you can do:

    sudo perl -Monly=install

But be careful. The perl used by sudo may not be the same perl as the one used by make. If this is the case, then you’ll need to specify the full path to perl.

Where Is My Only Module?

You may wonder, “Where exactly do these modules get installed and will they conflict with my existing modules?”. The short answer is, “They get installed exactly where you told them to!”.

When you install the only module, you are asked to select a base directory where only::install can install modules. The default value is a modification of your Perl’s sitelib directory. For instance, if your sitelib was:

    /usr/local/perl580/lib/sitelib/5.8.0

then only.pm would default to:

    /usr/local/perl580/lib/version/5.8.0

Even though this is the default, you are prompted to select any directory you want. Your choice is saved permanently in only::config. The constant only::config::versionlib, becomes the base directory where only will install and search for versioned modules.

If you really need to, then you can override this directory as well. For installation, you would say:

    perl -Monly=install - versionlib=/home/ingy/modules

The versionlib is just the base directory. only separates various module versions by sticking them into a subdirectory named after the version. So Cat::Fancy would be installed as:

    /usr/local/perl580/lib/version/5.8.0/0.09/Cat/Fancy.pm

Module::Build

Many new Perl modules are using Module::Build for their build process, instead of the age-old Makefile.PL and ExtUtils::MakeMaker. Module::Build is a wonderfully organized and extensible replacement for its stalwart yet terribly crufty predecessor.

One of the side benefits of modules distributions that have Module::Build is that you can do version-specific installations natively by saying:

    perl Build.PL
    ./Build versioninstall

Although this is just a pass-through call to only::install, it does not suffer from the aforementioned “sudo” problem.

only The::Facts

Back to only.pm. In this section, I’ll discuss all the gory details of the use only syntax. It really isn’t that bad. The basic form is:

    use only MODULE => CONDITION;

where MODULE is a module name, and CONDITION is either a simple version number or a more complex version specification. More on version conditions in a moment. If you have arguments to pass to the module, then simply tack them on to the end:

    use only MODULE => CONDITION, ARGUMENTS;

only even accounts for situations where different versions need different arguments. You match up the conditions and arguments as a set of anonymous arrays:

    use only MODULE =>
        [ CONDITION1, ARGUMENTS1 ],
        ...,
        [ CONDITION7, ARGUMENTS7 ];

Finally, there are some special options that you can pass to only to tell it how to behave. This is accomplished syntactically by passing an anonymous hash of option/value pairs. Put the options hash directly after use only:

    use only { OPTION => VALUE},
        MODULE => CONDITION;

If you want to set the options globally (for all subsequent only interaction), then just specify the options without any other arguments. For example, to override the versionlib option for all use only ... statements, say:

    use only { versionlib => '/home/ingy/modules' };

Hazardous Conditions

Even though a “version condition” can be as simple as a single version number like ‘0.42’, only offers flexible syntax for expressing exactly which versions you are (and aren’t) interested in.

If you want to specify a list of versions, then just use a space-separated enumeration:

    use only Bird::Talk => '0.42 0.44 0.47 0.50';

If your version requirements fall into a range, then you can specify those, too. Just use two versions, separated by a dash:

    use only Bird::Talk => '0.42-0.50';

Of course, you can list multiple ranges as well. And if you leave one of the versions off the range, then that means the range is open-ended.

    use only Bird::Talk => '0.42-0.50 0.55-0.62 0.67-';

Sometimes it’s easier to just specify the versions you don’t want. Using a ‘!’ in front of either a range or a single version, negates that meaning. To avoid all versions of Bird::Talk below 0.42, and also the extremely buggy version .53, say:

    use only Bird::Talk => '!-0.41 !0.53';

When more than one eligible version of Bird::Talk is installed on your system, only always chooses the highest version. If you don’t specify any version, then that is an indication to choose the highest-numbered version. This is different than saying:

    use Bird::Talk;

because only will search your versionlib first.

For Argument’s Sake

There’s not much to say about passing arguments. Just pass them in the same way you would on a normal use statement. This should even work for modules like Inline.pm where the arguments are not import lists:

    use only Inline => 0.44, 'Java';

There is one exception. In Perl, when you say something like:

    use Dog::Walk ();

That’s a cue to not call the module’s import method at all. In other words, it ensures that no functions will be exported into your namespace. But if you say:

    use only Dog::Walk => 1.00, ();

then you will not get the same effect. Unfortunately, there is no way for only.pm to detect that you called it that way. As a workaround, only lets you say:

    use only Dog::Walk => 1.00, [];

This has a similar visual appearance and is meant as a mnemonic. (Hopefully, there aren’t a whole lot of modules in the world where it is important to pass in a single empty array ref :)

The Only Options

The only (no pun intended) option currently implemented for only is versionlib. This option allows you to override the system versionlib stored in only::config.

    use only { versionlib => '/home/ingy/modules' },
        Corn::Stalk => 0.99, [];

Friends and Family

One important duty of only is to ensure that when you load a specific version of some module, all of that module’s related modules are also loaded from the same version level. This is tricky, because in Perl, you never know when a module is going to be loaded. It could be loaded by your original module or not. It might happen at compile time ( use ) or run time ( require ). It could be loaded hours later in a long-running process (or a very, very, very slow computer :) There might also be autoloaded functions involved.

Most importantly, some of the sub-modules might be loaded using use only, while others are loaded with standard use and require statements. To make all this happen the way you’d expect it to, only plays some tricks with @INC. More on that shortly, my preciouses.

only knows which modules are related because it saves the information as metadata for every module it installs. For example, if I install a module like so:

    $ cd YAML-0.35
    $ perl Makefile.PL
    ... lines deleted
    $ make test
    ... lines deleted
    $ perl -Monly=install
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML.pm
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML.pod
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Error.pm
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Family.pm
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Node.pm
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Transfer.pm
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Error.yaml
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Family.yaml
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Node.yaml
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML/Transfer.yaml
    Installing /usr/local/perl580/lib/version/5.8.0/0.35/YAML.yaml

then I get a YAML metadata file for each module. The metadata file YAML.yaml looks like this:

    # This meta file created by/for only.pm
    meta_version: 0.25
    install_version: 0.35
    distribution_name: YAML
    distribution_version: 0.35
    distribution_modules:
      - YAML.pm
      - YAML/Error.pm
      - YAML/Family.pm
      - YAML/Node.pm
      - YAML/Transfer.pm

This way, no matter which module is loaded first, only knows about every other module that was installed with that module.

Only on the Inside

The internals of only.pm are not incredibly complicated, but there is a little black magic going on. Most of it boils down to a relatively new and under-publicized feature of Perl5: putting objects onto the @INC array.

As you probably know, @INC is a special global array of file-system paths. When a program tries to load a Perl module with the use or require commands, it searches each of these paths in order until the module is found. The default paths in @INC are compiled into Perl. You can alter the array with the PERL5LIB environment variable, the lib.pm module, or even by simply changing it with regular Perl array commands. It’s just an array, after all.

As of Perl 5.6.1, you can actually put Perl objects onto @INC and have use and require interact with them. When require encounters an object in @INC it attempts to call that object’s INC method. The INC method can do anything it wants to load the module. It could actually go out on the internet and locate the module, download it, install it and load it!

The INC method should either return a filehandle or nothing. If a filehandle is returned, then require considers the operation a success. It reads the contents of that filehandle and eval-s the module into existence. If nothing is returned, then the operation is considered unsuccessful, and require continues its merry way down @INC.

The heart of the only module’s magic lies in the fact that it puts an object onto @INC that is responsible for loading an appropriate version of your module. Not only that,it is also responsible for loading the matching version of any related modules that were installed at the same time as your module.

O-O-o - Object-Oriented Only

Since only is an object-oriented module on the inside, it is no surprise that it offers an OO API to the those of you on the outside. (I assume that you don’t live inside a Perl module :)

Using the OO interface can give you more understanding and control of the version specific loading process, at the cost of a slightly more verbose syntax specification. As an example, if you would normally do this:

    use only Good::Stuff => '1.20-1.55';

then you can say the same thing by doing this:

    use only;
    my $only;
    BEGIN {
        $only = only->new;
        $only->module('Good::Stuff');
        $only->condition('1.20-1.55');
        $only->include;
    }
    use Good::Stuff;

Note that the use statement is absolutely normal. No only involved. But it still does what we want! That’s because the preceding code sticks one of those magic only objects onto @INC.

The methods should be fairly self-explanatory. The key method call is only-include>. It tells the object to attach itself to the front of @INC.

One nice thing is that you can actually do stuff with the object later on in the program:

    use only;
    my $only;
    BEGIN {    # methods can be stacked
        $only = only->new->module('Good::Stuff')->condition('1.20-1.55')->include;
    }
    use Good::Stuff;
    ...
    print "Using Good::Stuff version: " . $only->distribution_version . "\n";
    ...
    $only->remove;    # Remove object from @INC;

Conclusion

I always believed that if my old Triumph motorcycle ever broke down out on the open road, I could somehow figure out a way to repair it by walking down the road for a half mile in either direction, and finding some odds and ends lying around that I could use for tools. That’s because the roads are usually littered with all sorts of weird things, and I see everything as a tool to somehow suit my needs.

Perl is much like that roadside. There are all kinds of weird things lying around that can help it solve its own problems. Even when Perl is playing the part of a busted Triumph bike, its roadside qualities always seem to be able to kickstart it right back into action.

only.pm is a great example of this. Even though Perl was inadequate in regards to module versioning yesterday, today, it’s packing a brand new chainsaw. Rev it up!

About the Author

Brian Ingerson has been programming for more than 20 years, and hacking Perl for five of those. He is dedicated to improving the overall quality of scripting languages including Perl, Python, PHP and Ruby. He currently hails from Portland, Ore. – the location of this year’s O’Reilly Open Source Convention. How convenient!

Tags

Feedback

Something wrong with this article? Help us out by opening an issue or pull request on GitHub