Recently in C and Perl Category

Lightning Strikes Four Times

by Mike Friedman

Good software design principles tell us that we should work to separate unrelated concerns. For example, the popular Model-View-Controller (MVC) pattern is common in web application designs. In MVC, separate modular components form a model, which provides access to a data source, a view, which presents the data to the end user, and a controller, which implements the required features.

Ideally, it's possible to replace any one of these components without breaking the whole system. A templating engine that translates the application's data into HTML (the view) could be replaced with one that generates YAML or a PDF file. The model and controller shouldn't be affected by changing the way that the view presents data to the user.

Other concerns are difficult to separate. In the world of aspect-oriented programming, a crosscutting concern is a facet of a program which is difficult to modularize because it must interact with many disparate pieces of your system.

Consider an application that logs copious trace data when in debugging mode. In order to ensure that it is operating correctly, you may want to log when it enters and exits each subroutine. A typical way to accomplish this is by conditionally executing a logging function based on the value of a constant, which turns debugging on and off.

    use strict;
    use warnings;

    use constant DEBUG => 1;

    sub do_something { 
        log_message("I'm doing something") if DEBUG;

        # do something here

        log_message("I'm done doing something") if DEBUG;
    }

This solution is simple, but it presents a few problems. Perhaps most strikingly, it's simply a lot of code to write. For each subroutine that you want to log, you must write two nearly identical lines of code. In a large system with hundreds or thousands of subroutines, this gets tedious fast, and can lead to inconsistently formatted messages as every copy-paste-edit cycle tweaks them a little bit.

Further, it offends the separation of concerns at the heart of an MVC design, because every component must talk to the logging system directly.

One way to improve this technique is to automatically wrap every interesting subroutine in a special logging function. There are a few ways to go about this. One of the simplest is to use subroutine attributes to install a dynamically generated wrapper.
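To see the wrapping trick by hand first, here is a minimal sketch (the names @log and do_something are illustrative): it saves a sub's code ref and installs a logging wrapper in its symbol-table slot, which is exactly what an attribute handler can automate.

```perl
use strict;
use warnings;

our @log;    # collect messages so the wrapping is visible

sub do_something { return 42 }

# Save the original sub, then install a wrapper in its
# symbol-table slot.
{
    no warnings 'redefine';
    my $orig = \&do_something;
    *do_something = sub {
        push @log, 'Entering do_something';
        my @ret = $orig->(@_);
        push @log, 'Leaving do_something';
        return @ret;
    };
}

my ($val) = do_something();    # still returns 42, now with log entries
```

The caller sees the same return value; only the bookkeeping around the call changes.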

Attributes

Perl 5.6 introduced attributes, which allow you to attach arbitrary metadata to a variable. Attributes can be attached both to package variables (including subroutines) and to lexical variables. Since Perl 5.8, attributes on lexical variables apply at runtime, while attributes on package variables activate at compile time.

The interface to Perl attributes is via the attributes pragma. (The older attrs is deprecated.) The CPAN module Attribute::Handlers makes working with attributes a bit easier. Here's an example of how you might rewrite the logging system using an attribute handler.

    use strict;
    use warnings;

    use constant DEBUG => 1;

    use Attribute::Handlers;

    sub _log : ATTR(CODE) {
        my ($pkg, $sym, $code) = @_;

        if( DEBUG ) {
            my $name = *{ $sym }{NAME};

            no warnings 'redefine';

            *{ $sym } = sub {
                log_message("Entering sub ${pkg}::${name}");
                my @ret = $code->( @_ );
                log_message("Leaving sub ${pkg}::${name}");
                return @ret;
            };
        }
    }

    sub do_something : _log {
        print "I'm doing something.\n";
    }

Attributes are declared by placing a colon (:) and the attribute name after a variable or subroutine declaration. Optionally, the attribute can receive some data as a parameter; Attribute::Handlers goes to great lengths to massage the passed data for you if necessary.
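As a small illustration of parameterized attributes (Tagged and %tags are made-up names for this sketch, not part of Attribute::Handlers), a handler can record the data each attribute received via its fifth argument:

```perl
use strict;
use warnings;
use Attribute::Handlers;

our %tags;    # records what each attribute received

# The fifth handler argument carries the attribute's parameter,
# already massaged by Attribute::Handlers.
sub Tagged : ATTR(CODE) {
    my ($pkg, $sym, $code, $attr, $data) = @_;
    $data = $data->[0] if ref $data eq 'ARRAY';    # normalize lists
    $tags{ *{$sym}{NAME} } = $data;
}

sub fetch : Tagged('cache')   { }
sub store : Tagged('nocache') { }
```

By the time the program's main body runs, %tags maps each sub name to its parameter.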

To set up an attribute handler, the code declares a subroutine, _log, with the ATTR attribute, passing the string CODE as a parameter. Attribute::Handlers provides ATTR, and the CODE parameter tells it that the new handler only applies to subroutines.

During compile time, any subroutine declared with the _log attribute causes Perl to call the attribute handler with several parameters. The first three are the package in which the subroutine was compiled, a reference to the typeglob where its symbol lives, and a reference to the subroutine itself. These are sufficient for now.

If the DEBUG constant is true, the handler sets to work wrapping the newly compiled subroutine. First, it grabs its name from the typeglob, then it adds a new subroutine to its spot in the symbol table. Because the code redefines a package symbol, it's important to turn off warnings for symbol redefinitions within this block.

Because the new function is a lexical closure over $pkg, $name, and most importantly $code, it can use those values to construct the logging messages and call the original function.

All of this may seem like a lot of work, but once it's done, all you need to do to enable entry and exit logging for any function is to simply apply the _log attribute. The logging messages themselves get manufactured via closures when the program compiles, so we know they'll always be consistent. If you want to change them, you only have to do it in one place.

Best of all, because attribute handlers get inherited, if you define your handler in a base class, any subclass can use it.
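A minimal sketch of that inheritance (MyBase, MyApp, and the Traced attribute are hypothetical names): the handler is defined once in a base class, and a subclass applies it freely.

```perl
use strict;
use warnings;

package MyBase;
use Attribute::Handlers;

our @traced;    # which subs the handler saw

# The handler lives in the base class...
sub Traced : ATTR(CODE) {
    my ($pkg, $sym, $code) = @_;
    push @traced, "${pkg}::" . *{$sym}{NAME};
}

package MyApp;
BEGIN { our @ISA = ('MyBase') }    # inherit the handler

# ...but any subclass can apply it.
sub run : Traced { 'ran' }

package main;
```

Nothing in MyApp mentions Attribute::Handlers at all; the attribute resolves through @ISA.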

Caveats

Although this is a powerful technique, it isn't perfect. The code will not properly wrap anonymous subroutines, and it won't necessarily propagate calling context to the wrapped functions. Further, using this technique significantly increases the number of subroutine dispatches that your program must execute at runtime, and each wrapped call adds a frame to the call stack. If blinding speed is a major design goal, this strategy may not be for you.

Going Further

Other common cross-cutting concerns are authentication and authorization systems. Subroutine attributes can wrap functions in a security checker that refuses to run them for callers without the proper credentials.
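A hedged sketch of what such a checker might look like (Restricted, $ROLE, and purge_records are invented for this example; a real system would consult a proper credentials store):

```perl
use strict;
use warnings;
use Attribute::Handlers;

our $ROLE = 'guest';    # stand-in for real credentials

sub Restricted : ATTR(CODE) {
    my ($pkg, $sym, $code, $attr, $role) = @_;
    $role = $role->[0] if ref $role eq 'ARRAY';

    no warnings 'redefine';
    # Replace the sub with a wrapper that checks credentials first.
    *{ $sym } = sub {
        die "permission denied\n" unless $ROLE eq $role;
        return $code->(@_);
    };
}

sub purge_records : Restricted('admin') { 'purged' }
```

With $ROLE set to 'guest', calling purge_records() dies with "permission denied"; flip $ROLE to 'admin' and the original code runs.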

Perl Outperforms C with OpenGL

by Bob Free

Desktop developers often assume that compiled languages always perform better than interpreted languages such as Perl.

Conversely, most LAMP online service developers are familiar with mechanisms for preloading Perl interpreters and modules (such as Apache mod_perl and ActivePerl/ISAPI), and know that Perl performance often approaches that of C/C++.

However, few 3D developers think of Perl when it comes to performance. They should.

GPUs are increasingly taking the load off of CPUs for number-crunching. Modern GPGPU processing leverages C-like programs and loads large data arrays onto the GPU, where processing executes independent of the CPU. As a result, the overall contribution of CPU-bound code diminishes, and the differences between Perl and C become statistically insignificant relative to GPU performance.

The author has recently published an open source update to CPAN's OpenGL module, adding support for GPGPU features. With this release, he has also posted OpenGL Perl versus C benchmarks--demonstrating cases where Perl outperforms C for OpenGL operations.

What Is OpenGL?

OpenGL is an industry-standard, cross-platform API for rendering 3D images. Originally developed by Silicon Graphics Inc. (SGI), it is now in wide use for 3D CAD/GIS systems, game development, and computer graphics (CG) effects in film.

With the advent of Graphics Processing Units (GPUs), realistic, real-time 3D rendering has become common--even in game consoles. GPUs are designed to process large arrays of data, such as 3D vertices, textures, surface normals, and color spaces.

It quickly became clear that the GPU's ability to process large amounts of data could expand well beyond 3D rendering, and could be applied to General Purpose GPU (GPGPU) processing. GPGPUs can process complex physics problems, run particle simulations, provide database analytics, and more.

Over the years, OpenGL has expanded to support GPGPU processing, making it simple to load C-like programs into GPU memory for fast execution, to load large arrays of data in the form of textures, and to quickly move data between the GPU and CPU via Frame Buffer Objects (FBO).

While OpenGL is in itself a portable language, it provides no interfaces to operating system (OS) display systems. As a result, Unix systems generally rely on an X11-based library called GLX; Windows relies on a WGL interface. Several libraries, such as GLUT, help to abstract these differences. However, as OpenGL added new extensions, OS vendors (Microsoft in particular) provided different methods for accessing the new APIs, making it difficult to write cross-platform GPGPU code.

Perl OpenGL (POGL)

Bob Free of Graphcomp has just released a new, portable, open source Perl OpenGL module (POGL 0.55).

This module adds support for 52 new OpenGL extensions, including many GPGPU features such as Vertex Arrays, Frame Buffer Objects, Vertex Programs, and Fragment Programs.

In terms of 3D processing, these extensions allow developers to perform real-time dynamic vertex and texturemap generation and manipulation within the GPU. This module also simplifies GPGPU processing by moving data to and from the CPU through textures, and loading low-level, assembly-like instructions to the GPU.

POGL 0.55 is a binary Perl module (written in C via XS) that has been tested on Windows (NT/XP/Vista) and Linux (Fedora 6, Ubuntu/Dapper). Source and binaries are available via SVN, PPM, tarball, and ZIP at the POGL homepage.

POGL OS Performance

The POGL homepage includes initial benchmarks comparing POGL on Vista, Fedora, and Ubuntu. These tests show that static texture rendering on an animated object on Fedora was 10x faster than Vista; Ubuntu was 15x faster (using the same nVidia cards, drivers, and machine).

A subsequent, tighter benchmark eliminated UI and FPS counters, and focused more on dynamic texturemap generation. These results, posted on OpenGL C versus Perl benchmarks, show comparable numbers for Fedora and Ubuntu, with both outperforming Vista by about 60 percent.

Note: a further performance gain on these benchmarks may be possible through the use of GPU vertex arrays.

Perl versus C Performance

These benchmarks also compare Perl against C code. They found no statistically significant difference between overall Perl and C performance on Linux. Inexplicably, Perl frequently outperformed C on Vista.

In general, C performed better than Perl on Vertex/Fragment Shader operations, while Perl outperformed C on FBO operations. In this benchmark, overall performance was essentially equal between Perl and C.

The similarity in performance is explained by several factors:

  • GPU is performing the bulk of the number-crunching operations
  • POGL is a compiled C module
  • Non-GPU operations are minimal

In cases where code dynamically generates or otherwise modifies the GPU's vertex/fragment shader code, it is conceivable that Perl would perform even better than C, thanks to Perl's optimized string handling.

Perl Advantages

Given that GPGPU performance will be a wash in most cases, the primary remaining reason to use a compiled language is to obfuscate source code for intellectual property (IP) reasons.

For server-side development, there's really no reason to use a compiled language for GPGPU operations, and several reasons to go with Perl:

  • Perl OpenGL code is more portable than C, and requires fewer lines of code
  • Numerous imaging modules for loading GPGPU data arrays (textures)
  • Portable, open source modules for system and auxiliary functions
  • Perl (under mod_perl/ISAPI) is generally faster than Java
  • It is easier to port Perl to/from C than Python or Ruby
  • As of this writing, there is no FBO support in Java, Python, or Ruby

There is a side-by-side code comparison between C and Perl posted on the above benchmark page.

Desktop OpenGL/GPU developers may find it faster to prototype code in Perl (e.g., simpler string handling and garbage collection), and then port their code to C later (if necessary). Developers can code in one window and execute in another--with no IDE, no compiling--allowing innovators/researchers to do real-time experiments with new shader algorithms.

Physicists can quickly develop new models; researchers and media developers can create new experimental effects and reduce their time to market.

Summary

Performance is not a reason to use C over Perl for OpenGL and GPGPU operations, and there are many cases where Perl is preferable to C (or Java/Python/Ruby).

By writing your OpenGL/GPU code in Perl, you will likely:

  • Reduce your R&D costs and time to market
  • Expand your platform/deployment options
  • Accelerate your company's GPGPU ramp up

Using Test::Count

by Shlomi Fish

A typical Test::More test script contains several checks. It is preferable to keep track of the number of checks that the script runs (using use Test::More tests => $NUM_CHECKS or plan tests => $NUM_CHECKS), so that if some checks are not run (for whatever reason), the test script will still fail when run by the harness.

If you add more checks to a test file, you have to remember to update the plan. But how do you keep track of how many tests should run? I've already encountered a case where a DBI-related module had a different number of tests with an older version of DBI than with a more recent one.

Enter Test::Count. It originated from a Vim script I wrote to keep track of the number of tests by using meta-comments such as # TEST (for one test) or # TEST*3*5 (for 15 tests). However, there was a limit to what I could do with Vim's scripting language, and I wanted a richer syntax for specifying test counts, as well as variables.

Thus, I wrote the Test::Count module and placed it on CPAN. Test::Count::Filter acts as a filter: it counts the tests and updates the plan. Here's an example, taken from code I wrote for a Perl Quiz of the Week:

#!/usr/bin/perl -w

# This file implements various functions to remove
# all periods ("."'s) except the last from a string.

use strict;

use Test::More tests => 5;
use String::ShellQuote;

sub via_split
{
    my $s = shift;
    my @components = split(/\./, $s, -1);
    if (@components == 1)
    {
        return $s;
    }
    my $last = pop(@components);
    return join("", @components) . "." . $last;
}

# Other functions snipped. (The full script also declares
# my @funcs, holding the eight function names.)

# TEST:$num_tests=9
# TEST:$num_funcs=8
# TEST*$num_tests*$num_funcs
foreach my $f (@funcs)
{
    my $ref = eval ("\\&$f");
    is($ref->("hello.world.txt"), "helloworld.txt", "$f - simple"); # 1
    is($ref->("hello-there"), "hello-there", "$f - zero periods"); # 2
    is($ref->("hello..too.pl"), "hellotoo.pl", "$f - double"); # 3
    is($ref->("magna..carta"), "magna.carta", "$f - double at end"); # 4
    is($ref->("the-more-the-merrier.jpg"),
       "the-more-the-merrier.jpg", "$f - one period"); # 5
    is($ref->("hello."), "hello.", "$f - one period at end"); # 6
    is($ref->("perl.txt."), "perltxt.", "$f - period at end"); # 7
    is($ref->(".yes"), ".yes", "$f - one period at start"); # 8
    is($ref->(".yes.txt"), "yes.txt", "$f - period at start"); # 9
}

Filtering this script through Test::Count::Filter provides the correct number of tests. I then add this to my .vimrc:

function! Perl_Tests_Count()
    %!perl -MTest::Count::Filter -e 'Test::Count::Filter->new({})->process()'
endfunction

autocmd BufNewFile,BufRead *.t map <F3> :call Perl_Tests_Count()<CR>

Now I can press F3 to update the number of checks.

Test::Count supports +, -, *, and /, as well as parentheses, so it is expressive enough for most needs.
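For instance, a script might run three data pairs through two checks each; meta-comments like the ones below should let Test::Count compute the plan of 6 (the variable name $pairs and the script itself are invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Test::More tests => 6;

# The plan above (6) is what Test::Count::Filter would compute
# from these meta-comments: 3 pairs times (1+1) checks each.
# TEST:$pairs=3
# TEST*$pairs*(1+1)
foreach my $pair ([2, 4], [3, 9], [4, 16])
{
    is($pair->[0] ** 2, $pair->[1], "square of $pair->[0]");
    ok($pair->[1] > $pair->[0],     "growth for $pair->[0]");
}
```

Adding a fourth pair then only requires bumping $pairs; the filter recomputes the plan.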

Acknowledgements

Thanks to mrMister from Freenode for going over earlier drafts of this article and correcting some problems.

What's In that Scalar?

by brian d foy

Scalars are simple, right? They hold single values, and you don't even have to care what those values are because Perl figures out whether they are numbers or strings. Well, scalars show up just about anywhere, and they're much more complicated than single values. I could have undef, a number or string, or a reference. That reference can be a normal reference, a blessed reference, or even a hidden reference in a tied variable.

Perhaps I have a scalar variable that should be an object (a blessed reference, which is a single value), but before I call a method on it I want to ensure that it is one, to avoid the "unblessed reference" error that would kill my program. I might try the ref built-in to get the class name:

   if( ref $maybe_object ) { ... }

There's a bug there. ref returns an empty string if the scalar isn't a reference. It might also return 0, a false value; yes, some Perl people have figured out how to create a package named 0 just to mess with this. I might think that checking for defined-ness would work:

   if( defined ref $maybe_object ) { ... }

... but the empty string is also defined. I want to catch every case except the one value, the empty string, that means it's not a reference.

   unless( '' eq ref $maybe_object ) { ... }

This still doesn't tell me if I have an object. I know it's a reference, but maybe it's a regular data reference. The blessed function from Scalar::Util can help:

   if( blessed $maybe_object ) { ... }

This almost has the same problem as ref. blessed returns the package name if it's an object, and undef otherwise, and that package name could still be a false value such as 0. I really need to check for defined-ness.

   if( defined blessed $maybe_object ) { ... }

Even if blessed returns undef, I still might have a hidden object. If the scalar is a tied variable, there's really an object underneath it, although the scalar acts as if it's a normal variable. Although I normally don't need to interact with the secret object, the tied built-in returns the secret object if there is one, and undef otherwise.

   my $secret_object = tied $maybe_tied_scalar;

   if( defined $secret_object ) { ... }

Once I have the secret object in $secret_object, I treat it like any other object.
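A minimal sketch of a tied scalar and its secret object (the Counter class is made up for this example; its FETCH counts how many times the scalar has been read):

```perl
use strict;
use warnings;

package Counter;
sub TIESCALAR { my $reads = 0; bless \$reads, shift }
sub FETCH     { ${ $_[0] }++ }    # return old count, then bump it
sub STORE     { }                 # ignore writes in this sketch

package main;
tie my $ticker, 'Counter';

my $first  = $ticker;             # 0
my $second = $ticker;             # 1

my $secret_object = tied $ticker; # the underlying Counter object
```

Note that tied operates on the variable itself, so fetching the secret object doesn't trigger another FETCH.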

Now I'm sure I have an object, but that doesn't mean I know which methods I can call. The isa function in the UNIVERSAL package supposedly can figure this out for me. It tells me if a class is somewhere in an object's inheritance tree. I want to know if my object can do what a Horse can do, even if I have a RaceHorse:

   if( UNIVERSAL::isa( $object, 'RaceHorse' ) ) {
       $object->method;
   }

...what if the RaceHorse class is just a factory for objects in some other class that I'm not supposed to know about? I'll make a new object as a prototype just to get its reference:

   if( UNIVERSAL::isa( $object, ref RaceHorse->new() ) ) {
       $object->method;
   }

A real object-oriented programmer doesn't care what sort of object it is as long as it can respond to the right method. I should use can instead:

   if( UNIVERSAL::can( $object, $method ) ) {
       $object->$method;
   }

This doesn't always work either. can only knows about defined subroutine names, and only looks in the inheritance tree for them. It can't detect methods from AUTOLOAD or traits. I could override the can method to handle those, but I have to call it as a method (this works for isa too):

   if( $object->can( $method ) ) {
       $object->$method;
   }

What if $object wasn't really an object? I just called a method on a non-object! I'm back to my original problem, but I don't want to use all of those tests I just went through. I'll fix this with an eval, which catches the error for non-objects:

   if( eval{ $object->can( $method ) } ) {
       $object->$method;
   }

...but what if someone installed a __DIE__ handler that simply exit-ed instead of die-ing? Programmers do that sort of thing, forgetting that it affects the entire program.

   $SIG{__DIE__} = sub { exit };

Now, when the code inside my eval dies, the __DIE__ handler runs and calls exit, so the program stops without reporting an error. I have to localize the __DIE__ handler:

   if( eval{ local $SIG{__DIE__}; $object->can( $method ) } ) {
       $object->$method;
   }

If I'm the guy responsible for the __DIE__ handler, I could use $^S to see if I'm in an eval:

   $SIG{__DIE__} = sub { $^S ? die : exit };

That's solved it, right? Not quite. Why do all of that checking? I can just call the method and hope for the best. If I get an error, so be it:

   my $result = eval { $object->method };

Now I have to wrap all of my method calls in an eval. None of this would really be a problem if Perl were an object language. Or is it? The autobox module makes Perl data types look like objects:

   use autobox;

   sub SCALAR::println { print $_[0], "\n" }

   'Hello World'->println;

That works because it uses a special package SCALAR, although I need to define methods in it myself. I'll catch unknown methods with AUTOLOAD:

   sub SCALAR::AUTOLOAD {}

Or, I can just wait for Perl 6 when these things get much less murky because everything is an object.

Why Not Translate Perl to C?

People often have the idea that automatically translating Perl to C and then compiling the C will make their Perl programs run faster, because "C is much faster than Perl." This article explains why this strategy is unlikely to work.

Short Summary

Your Perl program is being run by the Perl interpreter. You want a C program that does the same thing that your Perl program does. A C program to do what your Perl program does would have to do most of the same things that the Perl interpreter does when it runs your Perl program. There is no reason to think that the C program could do those things faster than the Perl interpreter does them, because the Perl interpreter itself is written in very fast C.

Some detailed case studies follow.

Built-In Functions

Suppose your program needs to split a line into fields, and uses the Perl split function to do so. You want to compile this to C so it will be faster.

This is obviously not going to work, because the split function is already implemented in C. If you have the Perl source code, you can see the implementation of split in the file pp.c; it is in the function named pp_split. When your Perl program uses split, Perl calls this pp_split function to do the splitting. pp_split is written in C, and it has already been compiled to native machine code.

Now, suppose you want to translate your Perl program to C. How will you translate your split call? The only thing you can do is translate it to a call to the C pp_split function, or some other equivalent function that splits. There is no reason to believe that any C implementation of split will be faster than the pp_split that Perl already has. Years of work have gone into making pp_split as fast as possible.

You can make the same argument for all of Perl's other built-in functions, such as join, printf, rand and readdir.

So much for built-in functions.

Data Structures

Why is Perl slow to begin with? One major reason is that its data structures are extremely flexible, and this flexibility imposes a speed penalty.

Let's look in detail at an important example: strings. Consider this Perl code:

        $x = 'foo';     
        $y = 'bar';
        $x .= $y;

That is, we want to append $y to the end of $x. In C, this is surprisingly tricky. You would start by doing something like this:

        char *x = "foo";
        char *y = "bar";

Now you have a problem. You would like to insert bar at the end of the buffer pointed to by x. But you can't, because there is not enough room; x only points to enough space for four characters, and you need space for seven. (C strings always have an extra nul character on the end.) To append y to x, you must allocate a new buffer, and then arrange for x to point to the new buffer:

        char *tmp = malloc(strlen(x) + strlen(y) + 1);
        strcpy(tmp, x);
        strcat(tmp, y);
        x = tmp;

This works fine if x is the only pointer to that particular buffer. But if some other part of the program also had a pointer to the buffer, this code does not work. Why not? Here's the picture of what we did:

[Figure: BEFORE]

Here x and z are two variables that both contain pointers to the same buffer. We want to append bar to the end of the string. But the C code we used above doesn't quite work, because we allocated a new region of memory to hold the result, and then pointed x to it:

[Figure: AFTER x = tmp]

It's tempting to think that we should just point z to the new buffer also, but in practice this is impossible. The function that is doing the appending cannot know whether there is such a z, or where it may be. There might be 100 variables like z all pointing to the old buffer, and there is no good way to keep track of them so that they can all be changed when the buffer moves.

Perl does support a transparent string append operation. Let's see how this works. In Perl, a variable like $x does not point directly at the buffer. Instead, it points at a structure called an SV ('Scalar Value'). The SV holds the pointer to the buffer, and also some other things that I do not show:

[Figure: BEFORE $x .= $y]

When you ask Perl to append bar to $x, it follows the pointers and finds that there is not enough space in the buffer. So, just as in C, it allocates a new buffer and stores the result there. Then it fixes the pointer in the SV to point to the new buffer, and throws away the old buffer.

Now $x and $z have both changed. If there were any other variables sharing the SV, their values would have changed also. This technique is called "double indirection," and it is how Perl can support operations like .=. A similar principle applies for arrays; this is how Perl can support the push function.
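The effect is easy to demonstrate from Perl itself by making two names share one SV with a glob alias (a sketch; $x and $z stand in for the variables in the diagrams):

```perl
use strict;
use warnings;

our ($x, $z);
$x = 'foo';
*z = \$x;      # $z is now another name for $x's SV

$x .= 'bar';   # Perl reallocates the buffer inside the SV

# Both names see the new string, because both still point at
# the same SV; only the SV's internal buffer pointer changed.
```

Taking a reference to either name yields the very same SV, which is the double indirection at work.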

The flexibility comes at a price: Whenever you want to use the value of $x, Perl must follow two pointers to get the value: The first to find the SV structure, and the second to get to the buffer with the character data. This means that using a string in Perl takes at least twice as long as in C. In C, you follow just one pointer.

If you want to compile Perl to C, you have a big problem. You would like to support operations like .= and push, but C does not support these very well. There are only three solutions:

  1. Don't support .=

    This is a bad solution, because after you disallow all the Perl operations like .= and push what you have left is not very much like Perl; it is much more like C, and then you might as well just write the program in C in the first place.

  2. Do something extremely clever

    Cleverness is in short supply this month. :)

  3. Use a double-indirection technique in the compiled C code

    This works, but the resulting C code will be slow, because you will have to traverse twice as many pointers each time you want to look up the value of a variable. But that is why Perl is slow! Perl is already doing the double-indirection lookup in C, and the code to do this has already been compiled to native machine code.

So again, it's not clear that you are going to get any benefit from translating Perl to C. The slowness of Perl comes from the flexibility of the data structures. The code to manipulate these structures is already written in C. If you translate a Perl program to C, you have the choice of throwing away the flexibility of the data structure, in which case you are now writing C programs with C structures, or keeping the flexibility with the same speed penalty. You probably cannot speed up the data structures, because if anyone knew how to make the structures faster and still keep them flexible, they would already have made those changes in the C code for Perl itself.

Possible Future Work


It should now be clear that although it might not be hard to translate Perl to C, programs probably will not be faster as a result.

However, it's possible that a sufficiently clever person could make a Perl-to-C translator that produced faster C code. The programmer would need to give hints to the translator to say how the variables were being used. For example, suppose you have an array @a. With such an array, Perl is ready for anything. You might do $a[1000000] = 'hello'; or $a[500] .= 'foo'; or $a[500] /= 17;. This flexibility is expensive. But suppose you know that this array will only hold integers and there will never be more than 1,000 integers. You might tell the translator that, and then instead of producing C code to manage a slow Perl array, the translator can produce

        int a[1000];

and use a fast C array of machine integers.

To do this, you have to be very clever and you have to think of a way of explaining to the translator that @a will never be bigger than 1,000 elements and will only contain integers, or a way for the translator to guess that just from looking at the Perl program.

People are planning these features for Perl 6 right now. For example, Larry Wall, the author of Perl, plans that you will be able to declare a Perl array as

        my int @a is dim(1000);

Then a Perl-to-C translator (or Perl itself) might be able to use a fast C array of machine integers rather than a slow Perl array of SVs. If you are interested, you may want to join the perl6-internals mailing list.

Pathologically Polluting Perl


Table of Contents

Inline in Action - Simple examples in C
Hello, world
Just Another ____ Hacker
What about XS and SWIG?
One-Liners
Supported Platforms for C
The Inline Syntax
Fine Dining - A Glimpse at the C Cookbook
External Libraries
It Takes All Types
Some Ware Beyond the C
See Perl Run. Run, Perl, Run!
The Future of Inline
Conclusion

No programming language is Perfect. Perl comes very close. P! e! r! l? :-( Not quite ``Perfect''. Sometimes it just makes sense to use another language for part of your work. You might have a stable, pre-existing code base to take advantage of. Perhaps maximum performance is the issue. Maybe you just ``know how to do it'' that way. Or very likely, it's a project requirement forced upon you by management. Whatever the reason, wouldn't it be great to use Perl most of the time, but be able to invoke something else when you had to?

Inline.pm is a new module that glues other programming languages to Perl. It allows you to write C, C++, and Python code directly inside your Perl scripts and modules. This is conceptually similar to the way you can write inline assembly language in C programs. Thus the name: Inline.pm.

The basic philosophy behind Inline is this: ``make it as easy as possible to use Perl with other programming languages, while ensuring that the user's experience retains the DWIMity of Perl''. To accomplish this, Inline must do away with nuisances such as interface definition languages, makefiles, build directories and compiling. You simply write your code and run it. Just like Perl.

Inline will silently take care of all the messy implementation details and ``do the right thing''. It analyzes your code, compiles it if necessary, creates the correct Perl bindings, loads everything up, and runs the whole schmear. The net effect of this is you can now write functions, subroutines, classes, and methods in another language and call them as if they were Perl.

Inline in Action - Simple examples in C

Inline addresses an old problem in a completely revolutionary way. Just describing Inline doesn't really do it justice; it should be seen to be fully appreciated. Here are a couple of examples to give you a feel for the module.

Hello, world

It seems that the first thing any programmer wants to do when he learns a new programming technique is to use it to greet the Earth. In keeping with that tradition, here is the ``Hello, world'' program using Inline.

    use Inline C => <<'END_C';
    void greet() {
        printf("Hello, world\n");
    }
    END_C
    greet;

Simply run this script from the command line and it will print (you guessed it):

    Hello, world

In this example, Inline.pm is invoked with the name of a programming language, ``C'', and a string containing a piece of that language's source code. This C code defines a function called greet(), which gets bound to the Perl subroutine &main::greet. Therefore, when we call the greet() subroutine, the program prints our message on the screen.

You may be wondering why there are no #include statements for things like stdio.h. That's because Inline::C automatically prepends the following lines to the top of your code:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"
    #include "INLINE.h"

These header files include all of the standard system header files, so you almost never need to use #include unless you are dealing with a non-standard library. This is in keeping with Inline's philosophy of making easy things easy. (Where have I heard that before?)

Just Another ____ Hacker

The next logical question is, ``How do I pass data back and forth between Perl and C?'' In this example we'll pass a string to a C function and have it pass back a brand new Perl scalar.

    use Inline C;
    print JAxH('Perl');

    __END__
    __C__
    SV* JAxH(char* x) {
        return newSVpvf("Just Another %s Hacker\n", x);
    }

When you run this program, it prints:

    Just Another Perl Hacker

You've probably noticed that this example is coded differently than the last one. The use Inline statement specifies the language being used, but not the source code. This tells Inline to look for the source at the end of the program, after the special marker '__C__'.

The concept being demonstrated is that we can pass Perl data in and out of a C function. Using the default Perl type conversions, Inline can easily convert all of the basic Perl data types to C and vice-versa.

This example uses a couple of the more advanced concepts of Inlining. The function's return value is of type SV* (a pointer to a Scalar Value), the most common Perl internal type. Also, the Perl internal function newSVpvf() is called to create a new Scalar Value from a string, using the familiar sprintf() syntax. You can learn more about basic Perl internals by reading the perlguts and perlapi documentation distributed with Perl.

What about XS and SWIG?

Let's detour momentarily to ponder ``Why Inline?''

There are already two major facilities for extending Perl with C. They are XS and SWIG. Both are similar in their capabilities, at least as far as Perl is concerned. And both of them are quite difficult to learn compared to Inline. Since SWIG isn't used in practice to nearly the degree that XS is, I'll only address XS.

There is a big fat learning curve involved with setting up and using the XS environment. You need to get quite intimate with the following docs:

 * perlxs
 * perlxstut
 * perlapi
 * perlguts
 * perlcall
 * perlmod
 * h2xs
 * xsubpp
 * ExtUtils::MakeMaker

With Inline you can be up and running in minutes. There is a C Cookbook with lots of short but complete programs that you can extend to your real-life problems. No need to learn about the complicated build process going on in the background. You don't even need to compile the code yourself. Perl programmers cannot be bothered with silly things like compiling. ``Tweak, Run, Tweak, Run'' is our way of life. Inline takes care of every last detail except writing the C code.

Another advantage of Inline is that you can use it directly in a script. As we'll soon see, you can even use it in a Perl one-liner. With XS and SWIG, you always set up an entirely separate module, even if you only have one or two functions. Inline makes easy things easy, and hard things possible. Just like Perl.

Finally, Inline supports several programming languages (not just C and C++). As of this writing, Inline has support for C, C++, Python, and CPR. There are plans to add many more.

One-Liners

Perl is famous for its one-liners. A Perl one-liner is a short piece of Perl code that can accomplish a task that would take much longer in another language. It is one of the popular techniques that Perl hackers use to flex their programming muscles.

So you may wonder: ``Is Inline powerful enough to produce a one-liner that is also a bona fide C extension?'' Of course it is! Here you go:

    perl -e 'use Inline C=>
	q{void J(){printf("Just Another Perl Hacker\n");}};J'

Try doing that with XS! We can even write the more complex Inline JAxH() discussed earlier as a one-liner:

    perl -le 'use Inline C=>
	q{SV*JAxH(char*x){return newSVpvf("Just Another %s Hacker",x);}};print JAxH+Perl'

I have been using this one-liner as my email signature for the past couple of months. I thought it was pretty cool until Bernhard Muenzer posted this gem to comp.lang.perl.modules:

    #!/usr/bin/perl -- -* Nie wieder Nachtschicht! *- -- lrep\nib\rsu\!#
    use Inline C=>'void C(){int m,u,e=0;float l,_,I;for(;1840-e;putchar((++e>907
     &&942>e?61-m:u)["\n)moc.isc@rezneumb(rezneuM drahnreB"]))for(u=_=l=0;79-(m
      =e%80)&&I*l+_*_<6&&26-++u;_=2*l*_+e/80*.09-1,l=I)I=l*l-_*_-2+m/27.;}';&C

Supported Platforms for C

Inline C works on all of the Perl platforms that I have tested it with so far. This includes all common Unixes and recent versions of Microsoft Windows. The only catch is that you must have the same compiler and make utility that was used to build your perl binary.

Inline has been successfully used on Linux, Solaris, AIX, HPUX, and all the recent BSD's.

There are two common ways to use Inline on MS Windows. The first is with ActiveState's ActivePerl for MSWin32. In order to use Inline in that environment, you'll need a copy of MS Visual C++ 6.0. This comes with the cl.exe compiler and the nmake make utility; those are actually the only parts you need, as the visual components aren't necessary for Inline.

The other alternative is to use the Cygwin utilities. This is an actual Unix porting layer for Windows. It includes all of the most common Unix utilities, such as bash, less, make, gcc and of course perl.

The Inline Syntax

Inline is a little bit different from most of the Perl modules you are used to. It doesn't import any functions into your namespace, and it doesn't have any object-oriented methods. Its entire interface is specified through 'use Inline ...' statements. The general Inline usage is:

    use Inline C => source-code,
               config_option => value,
               config_option => value;

Where C is the programming language, and source-code is a string, filename, or the keyword 'DATA'. You can follow that with any number of optional 'keyword => value' configuration pairs. If you are using the 'DATA' option, with no configuration parameters, you can just say:

    use Inline C;

Fine Dining - A Glimpse at the C Cookbook

In the spirit of the O'Reilly book ``Perl Cookbook'', Inline provides a manpage called C-Cookbook. In it you will find the recipes you need to help satisfy your Inline cravings. Here are a couple of tasty morsels that you can whip up in no time. Bon Appetit!

External Libraries

The most common real-world need for Inline is probably using it to access existing compiled C code from Perl. This is easy to do. The secret is to write a wrapper function for each function you want to expose in Perl space. The wrapper calls the real function. It also handles how the arguments get passed in and out. Here is a short Windows example that displays a text box with a message, a caption and an ``OK'' button:

    use Inline C => DATA =>
               LIBS => '-luser32',
               PREFIX => 'my_';
    MessageBoxA('Inline Message Box', 'Just Another Perl Hacker');

    __END__
    __C__
    #include <windows.h>
    int my_MessageBoxA(char* Caption, char* Text) {
      return MessageBoxA(0, Text, Caption, 0);
    }

This program calls a function from the MSWin32 user32.dll library. The wrapper determines the type and order of arguments to be passed from Perl. Even though the real MessageBoxA() needs four arguments, we can expose it to Perl with only two, and we can change the order. In order to avoid namespace conflicts in C, the wrapper must have a different name. But by using the PREFIX option (same as the XS PREFIX option) we can bind it to the original name in Perl.

It Takes All Types

Older versions of Inline only supported five C data types. These were: int, long, double, char* and SV*. This was all you needed. All the basic Perl scalar types are represented by these. Fancier things like references could be handled by using the generic SV* (scalar value) type, and then doing the mapping code yourself, inside the C function.

The process of converting between Perl's SV* and C types is called typemapping. In XS, you normally do this by using typemap files. A default typemap file exists in every Perl installation in a file called /usr/lib/perl5/5.6.0/ExtUtils/typemap or something similar. This file contains conversion code for over 20 different C types, including all of the Inline defaults.

As of version 0.30, Inline no longer has any built-in types. It gets all of its types exclusively from typemap files. Since it uses Perl's default typemap file for its own defaults, it actually has many more types available automatically.

This setup provides a lot of flexibility. You can specify your own typemap files through the use of the TYPEMAPS configuration option. This not only allows you to override the defaults with your own conversion code, but it also means that you can add new types to Inline as well. The major advantage to extending the Inline syntax this way is that there are already many typemaps available for various APIs. And if you've done your own XS coding in the past, you can use your existing typemap files as is. No changes are required.

Let's look at a small example of writing your own typemaps. For some reason, the C type float is not represented in the default Perl typemap file. I suppose it's because Perl's floating point numbers are always stored as type double, which is higher precision than float. But if we wanted it anyway, writing a typemap file to support float is trivial.

Here is what the file would look like:

    float                   T_FLOAT

    INPUT
    T_FLOAT
            $var = (float)SvNV($arg)

    OUTPUT
    T_FLOAT
            sv_setnv($arg, (double)$var);

Without going into details, this file provides two snippets of code: one for converting an SV* to a float, and one for the opposite conversion. Now we can write the following script:

    use Inline C => DATA =>
               TYPEMAPS => './typemap';

    print '1.2 + 3.4 = ', fadd(1.2, 3.4), "\n";

    __END__
    __C__
    float fadd(float x, float y) {
        return x + y;
    }

Some Ware Beyond the C

The primary goal of Inline is to make it easy to use other programming languages with Perl. This is not limited to C. The initial implementations of Inline only supported C, and the language support was built directly into Inline.pm. Since then things have changed considerably. Inline now supports multiple languages, both compiled and interpreted. And it keeps the implementations in an object-oriented structure, in which each language has its own separate module but can inherit behavior from the base Inline module.

On my second day working at ActiveState, a young man approached me. ``Hi, my name is Neil Watkiss. I just hacked your Inline module to work with C++.''

Neil, I soon found out, was a computer science student at a local university. He was working part-time for ActiveState then, and had somehow stumbled across Inline. I was thrilled! I had wanted to pursue new languages, but didn't know how I'd find the time. Now I was sitting 15 feet away from my answer!

Over the next couple months, Neil and I spent our spare time turning Inline into a generic environment for gluing new languages to Perl. I ripped all the C specific code out of Inline and put it into Inline::C. Neil started putting together Inline::CPP and Inline::Python. Together we came up with a new syntax that allowed multiple languages and easier configuration.

Here is a sample program that makes use of Inline Python:

    use Inline Python;
    my $language = shift;
    print $language, 
          (match($language, 'Perl') ? ' rules' : ' sucks'),
          "!\n";
    __END__
    __Python__
    import sys
    import re
    def match(str, regex):
        f = re.compile(regex)
        if f.match(str): return 1
        return 0

This program uses a Python regex to show that ``Perl rules!''.

Since Python supports its own versions of Perl scalars, arrays, and hashes, Inline::Python can flip between them easily and logically. If you pass a hash reference to Python, it will turn it into a dictionary, and vice-versa. Neil even has mechanisms for calling back to Perl from Python code. See the Inline::Python docs for more info.

See Perl Run. Run Perl, Run!

Inline is a great way to write C extensions for Perl. But is there an equally simple way to embed a Perl interpreter in a C program? I pondered this question myself one day. Writing Inline functionality for C would not be my cup of tea.

The normal way to embed Perl into C involves jumping through a lot of hoops to bootstrap a perl interpreter. Too messy for one-liners. And you need to compile the C. Not very Inlinish. But what if you could pass your C program to a perl program that could pass it to Inline? Then you could write this program:

    #!/usr/bin/cpr
    int main(void) {
        printf("Hello, world\n");
    }

and just run it from the command line. Interpreted C!

And thus, a new programming language was born. CPR. ``C Perl Run''. The Perl module that gives it life is called Inline::CPR.

Of course, CPR is not really its own language, in the strict sense. But you can think of it that way. CPR is just like C except that you can call out to the Perl5 API at any time, without any extra code. In fact, CPR redefines this API with its own CPR wrapper API.

There are several ways to think of CPR: ``a new language'', ``an easy way to embed Perl in C'', or just ``a cute hack''. I lean towards the latter. CPR is probably a far stretch from meeting most people's embedding needs. But at the same time, it's a very easy way to play around with, and perhaps redefine, the Perl5 internal API. The best compliment I've gotten for CPR is when my ActiveState coworker Adam Turoff said, ``I feel like my head has just been wrapped around a brick''. I hope this next example makes you feel that way too:

    #!/usr/bin/cpr
    int main(void) {
        CPR_eval("use Inline (C => q{
            char* greet() {
                return \"Hello world\";
            }
        })");
        printf("%s, I'm running under Perl version %s\n",
               CPR_eval("&greet"),
               CPR_eval("use Config; $Config{version}"));
        return 0;
    }

Running this program prints:

    Hello world, I'm running under Perl version 5.6.0

Using CPR_eval() calls, this CPR program tells Perl to use Inline C to add a new function, which the CPR program subsequently calls. I think I have a headache myself.

The Future of Inline

Inline version 0.30 was written specifically so that it would be easy for other people in the Perl community to contribute new language bindings for Perl. On the day of that release, I announced the birth of the Inline mailing list, inline@perl.org. This is intended to be the primary forum for discussion of all Inline issues, including the proposal of new features and the authoring of new ILSMs (Inline Support Language Modules).

In the year 2001, I would like to see bindings for Java, Ruby, Fortran and Bash. I don't plan on authoring all of these myself. But I may kickstart some of them, and see if anyone's interested in taking over. If you have a desire to get involved with Inline development, please join the mailing list (inline-subscribe@perl.org) and speak up.

My primary focus at the present time is to make the base Inline module as simple, flexible, and stable as possible. I also want to see Inline::C become an acceptable replacement for XS, at least for most situations.

Conclusion

Using XS is just too hard, at least when you compare it to the rest of the Perl we know and love. Inline takes advantage of the existing frameworks for combining Perl and C, and packages them all up into one easy-to-swallow pill. As an added bonus, it provides a great framework for binding other programming languages to Perl. You might say, ``It's a 'Perl-fect' solution!''
