Where Wizards Fear To Tread

Perl 5.8 Threads

by Artur Bergman
June 11, 2002

One of the big new features in perl 5.8 is that we now have real working threads available to us through the threads pragma.

However, those of us who write modules already have to support different versions of perl and different platforms, and now we have to deal with another case: threads! This article will show you how threads relate to modules and how we can take old modules and make them thread-safe, and it rounds off with a new module that changes how perl handles the "current working directory".

To run the examples shown here, you need perl 5.8 RC1 or later compiled with threads. On Unix, you can use Configure -Duseithreads -Dusethreads; on Win32, the default build always has threading enabled.

How do threads relate to modules?

Threading in Perl is based on the notion of explicit shared data. That is, only data that is explicitly requested to be shared will be shared between threads. This is controlled by the threads::shared pragma and the ": shared" attribute. Witness how it works:


     use threads;
     my $var = 1;
     threads->create(sub { $var++ })->join();
     print $var;

If you are accustomed to threading in most other languages (Java or C, for example), you would expect $var to contain a 2 and the output of this script to be "2". However, since Perl does not share data between threads, $var is copied into the new thread and only that copy is incremented. The original value in the main thread is not changed, so the output is "1".

However, if we add threads::shared and a ": shared" attribute, we get the desired result:


     use threads;
     use threads::shared;
     my $var : shared = 1;
     threads->create(sub { $var++ })->join();
     print $var;

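Now the result will be "2", since we declared $var to be a shared variable. Both threads act on the same variable, and Perl locks it internally so that its data stays consistent. That internal locking does not make compound operations such as ++ atomic, which will matter later in this article.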

This makes it quite a bit simpler for us module developers to make sure our modules are thread-safe. Essentially, all pure Perl modules are thread-safe because any global state data, which is usually what gives you thread-safety problems, is by default local to each thread.
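To make the point about global state concrete, here is a minimal sketch, assuming a perl 5.8 or later built with threads. The Counter package and its bump() and value() functions are made up for this illustration and do not come from any real module.

```perl
use strict;
use warnings;
use threads;

# Hypothetical module with ordinary (unshared) global state.
package Counter;
our $count = 0;
sub bump  { return ++$count }
sub value { return $count }

package main;

Counter::bump();    # main thread: $Counter::count is now 1

# The new thread starts with its own copy of $Counter::count (value 1)
# and increments only that copy.
my $thr       = threads->create(sub { Counter::bump() });
my $in_thread = $thr->join();

my $in_main = Counter::value();    # the main thread's copy is untouched
print "thread saw $in_thread, main still has $in_main\n";
# prints "thread saw 2, main still has 1"
```

Neither thread can corrupt the other's counter, which is why such a module is thread-safe without any changes. Whether two independent counters are what the user wants is a separate question.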

Definition of thread-safe levels

To define what we mean by thread-safety, here are some terms adapted from the Solaris thread-safety levels.

thread-safe
This module can safely be used from multiple threads. The effect of calling into a safe module is that the results are valid even when called by multiple threads. However, thread-safe modules can still have global consequences; for example, sending or reading data from a socket affects all threads that are working with that socket. The application has the responsibility to act sanely with regard to threads. If one thread creates a file with the name file.tmp, then another thread which tries to create it will fail; this is not the fault of the module.
thread-friendly
Thread-friendly modules are thread-safe modules that know about threads and provide special functions for working with them, or that use threads themselves. A typical example of this is the core Thread::Queue module. One could also imagine a thread-friendly module with a cache that declares the cache to be shared between threads, to make hits more likely and to save memory.
thread-unsafe
This module cannot safely be used from different threads; it is up to the application to synchronize access to the library and to make sure it uses the library the way it is specified. Typical examples here are XS modules that use external unsafe libraries that might only allow one thread to execute them at a time.
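As an illustration of the thread-friendly level, here is a small sketch that uses the core queue module, Thread::Queue, to hand work to another thread. It assumes a threaded perl 5.8 or later. Using undef as an end-of-work marker is a convention chosen for this example, not something the module requires.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new();

# Worker thread: add up items until the undef marker arrives.
my $worker = threads->create(sub {
    my $sum = 0;
    while (defined(my $item = $queue->dequeue())) {
        $sum += $item;
    }
    return $sum;
});

$queue->enqueue($_) for 1 .. 5;
$queue->enqueue(undef);    # end-of-work marker (our own convention)

my $total = $worker->join();
print "worker summed $total\n";    # prints "worker summed 15"
```

The queue does its own locking, so neither the producer nor the consumer needs to call lock().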

Since Perl only shares when asked to, most pure Perl code probably falls into the thread-safe category. That doesn't mean you should trust a module until you have reviewed its source code or the author has marked it as thread-safe. Typical problems include using alarm(), mucking around with signals, working with relative paths and depending on %ENV. And remember that ALL XS modules that don't state anything about threads must be treated as definitely thread-unsafe.
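The relative-path problem comes from the fact that on many platforms the current working directory belongs to the whole process, so a chdir() in one thread can change what a relative path means in another. One defensive habit, sketched below, is to turn relative paths into absolute ones as early as possible. The file name config/app.conf is a placeholder for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical relative path supplied by the user of the module.
my $relative = File::Spec->catfile('config', 'app.conf');

# Resolve it once, against the directory we are in right now.
# A later chdir() in any thread no longer changes its meaning.
my $absolute = File::Spec->rel2abs($relative);

print "will read $absolute\n";
```

This does not make chdir() itself safe to call from a thread; it only removes one way in which a module depends on it.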

Why should I bother making my module thread-safe or thread-friendly?

Well, it usually isn't much work, and it will make users who want to run your module in a threaded environment very happy. What? Threaded Perl environments aren't that common, you say? Wait until Apache 2.0 and mod_perl 2.0 become available. One big change is that Apache 2.0 can run in threaded mode, and then mod_perl will have to run in threaded mode as well; this can be a huge performance gain on some operating systems. So if you want your modules to work with mod_perl 2.0, taking a look at thread-safety levels is a good thing to do.

So what do I do to make my module thread-friendly?

A good example of a module that needed a little modification to work with threads is Michael Schwern's most excellent Test::Simple suite (Test::Simple, Test::More and Test::Builder). Surprisingly, we had to change very little to fix it.

The problem was simply that the test numbering was not shared between threads.

For example:


     use threads;
     use Test::Simple tests => 3;
     ok(1);
     threads->create(sub { ok(1) })->join();
     ok(1);

That will print:


     1..3
     ok 1
     ok 2
     ok 2

Does it look similar to the problem we had earlier? Indeed it does. It seems that somewhere there is a variable that needs to be shared.

Reading the documentation of Test::Simple, we find out that all the magic is really done inside Test::Builder. Opening up Builder.pm, we quickly find the following lines of code:


     my @Test_Results = ();
     my @Test_Details = ();
     my $Curr_Test = 0;

Now we would be tempted to add use threads::shared and the ": shared" attribute:


     use threads::shared;
     my @Test_Results : shared = ();
     my @Test_Details : shared = ();
     my $Curr_Test : shared = 0;

However, Test::Builder needs to work back to Perl 5.4.4! Attributes were only added in 5.6.0, and the above code would be a syntax error in earlier Perls. And even if someone were using 5.6.0, threads::shared would not be available to them.

The solution is to use the runtime function share() exported by threads::shared, but we only want to do it for 5.8.0 and when threads have been enabled. So, let's wrap it in a BEGIN block and an if.


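     BEGIN{
         if($] >= 5.008 && exists($INC{'threads.pm'})) {
             require threads::shared;
             import threads::shared qw(share);
             # share() is imported at run time, so its prototype is not
             # in effect while this block compiles: pass references.
             share(\$Curr_Test);
             share(\@Test_Details);
             share(\@Test_Results);
         }
     }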

So, if we are running 5.8.0 or higher and threads has been loaded, we do the runtime equivalent of use threads::shared qw(share); and call share() on the variables we want to be shared.
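The same pattern can be tried outside Test::Builder. The sketch below is a stand-alone variation that makes the decision at run time rather than in a BEGIN block, so it runs both with and without threads. The names $hits, $threaded and record_hit() are invented for this example.

```perl
use strict;
use warnings;

my $hits = 0;

# True only when the calling program loaded threads.pm before this code ran.
my $threaded = ($] >= 5.008 && exists $INC{'threads.pm'}) ? 1 : 0;

if ($threaded) {
    require threads::shared;
    # Calling with & bypasses share()'s prototype, so we pass a reference.
    &threads::shared::share(\$hits);
}

# Hypothetical accessor: counts calls, locking only when $hits is shared.
sub record_hit {
    lock($hits) if $threaded;
    return ++$hits;
}

my $first  = record_hit();
my $second = record_hit();
print "hits so far: $second\n";    # prints "hits so far: 2"
```

On a perl without threads, or when the program never loaded threads.pm, the variable simply stays private and the lock is skipped.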

Now let's find some examples of where $Curr_Test is used. We find sub ok {} in Test::Builder; I won't include all of it here, only a cut-down version which contains:


     sub ok {
         my($self, $test, $name) = @_;
         $Curr_Test++;
         $Test_Results[$Curr_Test-1] = 1 unless($test);
     }

Now, this looks like it should work, right? We have shared $Curr_Test and @Test_Results. Of course, things aren't that easy; they never are. Even if the variables are shared, two threads could enter ok() at the same time. Remember that not even the statement $Curr_Test++ is an atomic operation; it is just a shortcut for writing $Curr_Test = $Curr_Test + 1. So let's say two threads do that at the same time.


     Thread 1: add 1 + $Curr_Test
     Thread 2: add 1 + $Curr_Test
     Thread 2: Assign result to $Curr_Test
     Thread 1: Assign result to $Curr_Test

The effect would be that $Curr_Test would only be increased by one, not two! Remember that a switch between two threads could happen at ANY time, and if you are on a multiple CPU machine they can run at exactly the same time! Never trust thread inertia.

So how do we solve it? We use the lock() keyword. lock() takes a shared variable and locks it until the end of the enclosing scope. It is only an advisory lock, though: it keeps other threads from locking the variable, not from using it. So we need to find every place where $Curr_Test is read or modified and is expected not to change in between, and add a lock() there. ok() becomes:


     sub ok {
         my($self, $test, $name) = @_;
         lock($Curr_Test);
         $Curr_Test++;
         $Test_Results[$Curr_Test-1] = 1 unless($test);
     }
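To see the effect of lock() in isolation, here is a small stand-alone sketch, assuming a threaded perl 5.8 or later. The counter, the bump() function and the thread and iteration counts are arbitrary choices for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter : shared = 0;

sub bump {
    lock($counter);    # held until the end of this sub
    $counter++;        # read, add and store now happen under the lock
}

# Four threads each increment the shared counter 1000 times.
my @workers;
push @workers, threads->create(sub { bump() for 1 .. 1000 }) for 1 .. 4;
$_->join() for @workers;

print "$counter\n";    # prints "4000"
```

Without the lock() line the program still runs, but the final count can come out lower, because increments from different threads can overwrite each other.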

So are we ready? Well, lock() was only added in Perl 5.5, so we need to add an else to the BEGIN block to define a lock function if we aren't running with threads. The end result would be:


     my @Test_Results = ();
     my @Test_Details = ();
     my $Curr_Test = 0;
     BEGIN{
         if($] >= 5.008 && exists($INC{'threads.pm'})) {
             require threads::shared;
             import threads::shared qw(share);
             # share() is imported at run time, so pass references.
             share(\$Curr_Test);
             share(\@Test_Details);
             share(\@Test_Results);
         } else {
             *lock = sub(*) {};
         }
     }
     sub ok {
         my($self, $test, $name) = @_;
         lock($Curr_Test);
         $Curr_Test++;
         $Test_Results[$Curr_Test-1] = 1 unless($test);
     }

In fact, this is very much like the code that has been added to Test::Builder to make it work nicely with threads. The only thing that differs is ok(), as I cut it down to what was relevant. There were roughly 5 places where lock() had to be added. Now the test code prints:


     1..3
     ok 1
     ok 2
     ok 3

which is exactly what the end user would expect. All in all, this is a rather small change for this 1291-line module: we changed roughly 15 lines in a non-intrusive way, and the documentation and test code make up most of the patch. The full patch is at http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-06/msg00816.html
