Where Wizards Fear To Tread
Perl 5.8 Threads
by Artur Bergman
June 11, 2002
One of the big new features in perl 5.8 is that we now have real working threads available to us through the threads pragma.
However, for us module authors who already have to support our modules on different versions of perl and different platforms, we now have to deal with another case: threads! This article will show you how threads relate to modules, how we can take old modules and make them thread-safe, and round off with a new module that alters perl's behavior of the "current working directory".
To run the examples shown here, you need perl 5.8 RC1 or later,
compiled with threads. On Unix, you can use
Configure -Duseithreads -Dusethreads; on Win32, the default build
always has threading enabled.
How do threads relate to modules?
Threading in Perl is based on the notion of explicit shared data. That
is, only data that is explicitly requested to be shared will be shared
between threads. This is controlled by the threads::shared pragma and
the ": shared" attribute. Witness how it works:
use threads;
my $var = 1;
threads->create(sub { $var++ })->join();
print $var;
If you are accustomed to threading in most other languages (Java or C, for example), you would expect $var to contain 2 and the script to print "2". However, since Perl does not share data between threads, $var is copied into the new thread and only that copy is incremented. The original value in the main thread is not changed, so the output is "1".
However, if we add threads::shared and a : shared attribute, we get
the desired result:
use threads;
use threads::shared;
my $var : shared = 1;
threads->create(sub { $var++ })->join();
print $var;
Now the result will be "2", since we declared $var to be a shared variable. Perl will then act on the same variable and provide automatic locking to keep the variable out of trouble.
This makes it quite a bit simpler for us module developers to make sure our modules are thread-safe. Essentially, all pure Perl modules are thread-safe because any global state data, which is usually what gives you thread-safety problems, is by default local to each thread.
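To make the point about global state concrete, here is a minimal sketch. My::Counter is a hypothetical module invented for this illustration; it keeps its state in an ordinary package variable and does nothing special for threads.

```perl
use strict;
use warnings;
use threads;

# Hypothetical module with ordinary (unshared) global state.
package My::Counter;
our $count = 0;
sub bump { return ++$count }

package main;

My::Counter::bump();    # main thread: $count is now 1

# The new thread starts with its own copy of $count and bumps only that copy.
my $seen_in_thread = threads->create(sub { My::Counter::bump() })->join();

print "thread saw: $seen_in_thread\n";
print "main still: $My::Counter::count\n";
```

The thread sees its own count go up, while the count in the main thread is left untouched. No thread can corrupt another thread's view of the module's state.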
Definition of thread-safe levels
To define what we mean by thread-safety, here are some terms adapted from the Solaris thread-safety levels.
- thread-safe
- This module can safely be used from multiple threads. The effect of calling into a safe module is that the results are valid even when called by multiple threads. However, thread-safe modules can still have global consequences; for example, sending or reading data from a socket affects all threads that are working with that socket. The application has the responsibility to act sanely with regard to threads. If one thread creates a file with the name file.tmp, then another thread which tries to create it will fail; this is not the fault of the module.
- thread-friendly
- Thread-friendly modules are thread-safe modules that know about and provide special functions for working with threads, or utilize threads by themselves. A typical example of this is the core threads::queue module. One could also imagine a thread-friendly module with a cache declaring that cache to be shared between threads, to make hits more likely and save memory.
- thread-unsafe
- This module cannot safely be used from different threads; it is up to the application to synchronize access to the library and make sure it works with it the way it is specified. Typical examples here are XS modules that utilize external unsafe libraries that might only allow one thread to execute them.
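The shared-cache idea from the thread-friendly definition could be sketched roughly as follows. My::Squares, its functions, and the choice of squaring as the "expensive" computation are all hypothetical; the lock() call it uses is Perl's built-in advisory lock on shared variables.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical thread-friendly module: one cache shared by all threads.
package My::Squares;

my %cache    : shared;
my $computed : shared = 0;    # how many values were actually computed

sub square {
    my ($n) = @_;
    lock(%cache);             # held until the end of this sub
    unless (exists $cache{$n}) {
        $cache{$n} = $n * $n; # stand-in for an expensive computation
        $computed++;          # only ever changed while %cache is locked
    }
    return $cache{$n};
}

sub computed { return $computed }

package main;

my @workers = map { threads->create(sub { My::Squares::square(12) }) } 1 .. 4;
my @results = map { $_->join() } @workers;

print "results:  @results\n";
print "computed: ", My::Squares::computed(), "\n";
```

All four threads get the same answer, but the value is computed only once. With an unshared cache each thread would have filled its own copy.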
Since Perl only shares when asked to, most pure Perl code probably falls
into the thread-safe category. That doesn't mean you should trust a
module until you have reviewed its source code or the author has marked
it as thread-safe. Typical problems include using alarm(),
mucking around with signals, working with relative paths and depending
on %ENV. However, remember that ALL XS modules that don't state
anything fall into the definitive thread-unsafe category.
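The relative-path problem is easy to demonstrate. The sketch below assumes a Unix-like system, where the current working directory belongs to the whole process; the scratch directory is only there to give the thread somewhere to go.

```perl
use strict;
use warnings;
use threads;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $before  = getcwd();
my $scratch = tempdir(CLEANUP => 1);    # fresh directory, removed at exit

# Only the child thread calls chdir() ...
threads->create(sub { chdir $scratch or die "chdir failed: $!" })->join();

# ... but on Unix-like systems the main thread has moved as well.
my $after = getcwd();
print "before: $before\nafter:  $after\n";

chdir $before or die "chdir failed: $!";    # restore for the rest of the program
```

Any module that opens files by relative path is therefore at the mercy of every other thread in the application, even though none of its Perl variables are shared.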
Why should I bother making my module thread-safe or thread-friendly?
Well, it usually isn't much work, and it will make the users of your module who want to use it in a threaded environment very happy. What? Threaded Perl environments aren't that common, you say? Wait until Apache 2.0 and mod_perl 2.0 become available. One big change is that Apache 2.0 can run in threaded mode, and then mod_perl will have to run in threaded mode too; this can be a huge performance gain on some operating systems. So if you want your modules to work with mod_perl 2.0, taking a look at thread-safety levels is a good thing to do.
So what do I do to make my module thread-friendly?
A good example of a module that needed a little modification to work
with threads is Michael Schwern's most excellent Test::Simple suite
(Test::Simple, Test::More and Test::Builder). Surprisingly, we
had to change very little to fix it.
The problem was simply that the test numbering was not shared between threads.
For example:
use threads;
use Test::Simple tests => 3;
ok(1);
threads->create(sub { ok(1) })->join();
ok(1);
Now that will return
1..3
ok 1
ok 2
ok 2
Does it look similar to the problem we had earlier? Indeed it does. It seems that somewhere there is a variable that needs to be shared.
Reading the documentation of Test::Simple, we find out that all the magic
is really done inside Test::Builder. Opening up Builder.pm, we quickly
find the following lines of code:
my @Test_Results = ();
my @Test_Details = ();
my $Curr_Test = 0;
Now we would be tempted to add use threads::shared and the : shared
attribute:
use threads::shared;
my @Test_Results : shared = ();
my @Test_Details : shared = ();
my $Curr_Test : shared = 0;
However, Test::Builder needs to work back to Perl 5.4.4! Attributes
were only added in 5.6.0, and the above code would be a syntax error in
earlier Perls. And even if someone were using 5.6.0, threads::shared
would not be available to them.
The solution is to use the runtime function share() exported by
threads::shared, but we only want to do it on 5.8.0 or later and when
threads have been enabled. So, let's wrap it in a BEGIN block and an if.
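BEGIN {
    if ($] >= 5.008 && exists($INC{'threads.pm'})) {
        require threads::shared;
        import threads::shared qw(share);
        # share() is not yet known when this block is compiled, so its
        # prototype does not apply here; pass references explicitly.
        share(\$Curr_Test);
        share(\@Test_Details);
        share(\@Test_Results);
    }
}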
So, if we are running on 5.8.0 or higher and threads has been loaded, we
do the runtime equivalent of use threads::shared qw(share); and call
share() on the variables we want to be shared.
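As a self-contained illustration of the same idea, here is a sketch of a module that shares its state only when threads are in use. My::Tally and its functions are hypothetical names. One design choice differs from the snippet above and is my own: share() is called after the BEGIN block rather than inside it, so that the freshly imported function and its prototype are already known when the call is compiled.

```perl
use strict;
use warnings;
use threads;    # remove this line and My::Tally still loads and works

package My::Tally;    # hypothetical module name

my $Total;

BEGIN {
    if ($] >= 5.008 && exists $INC{'threads.pm'}) {
        require threads::shared;
        threads::shared->import(qw(share));
    }
    else {
        # Without threads there is nothing to share: install a no-op stand-in.
        *share = sub { return $_[0] };
    }
}

share($Total);    # compiled after the BEGIN block has run
$Total = 0;

sub add   { $Total += $_[0]; return $Total }
sub total { return $Total }

package main;

# A single thread that is joined right away, so no locking is needed yet.
threads->create(sub { My::Tally::add(5) })->join();

my $total = My::Tally::total();
print "total: $total\n";
```

Because $Total is shared, the addition made inside the thread is visible in the main thread afterwards.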
Now let's find some examples of where $Curr_Test is used. We find
sub ok {} in Test::Builder; I won't include all of it here, only a
smaller version which contains:
sub ok {
    my($self, $test, $name) = @_;
    $Curr_Test++;
    $Test_Results[$Curr_Test-1] = 1 unless($test);
}
Now, this looks like it should work, right? We have shared $Curr_Test
and @Test_Results. Of course, things aren't that easy; they never are.
Even if the variables are shared, two threads could enter ok() at the
same time. Remember that not even the statement $Curr_Test++ is an
atomic operation; it is just a shortcut for writing
$Curr_Test = $Curr_Test + 1. So let's say two threads do that at the same time:
Thread 1: add 1 + $Curr_Test
Thread 2: add 1 + $Curr_Test
Thread 2: Assign result to $Curr_Test
Thread 1: Assign result to $Curr_Test
The effect would be that $Curr_Test would only be increased by one, not two! Remember that a switch between two threads could happen at ANY time, and if you are on a multiple CPU machine they can run at exactly the same time! Never trust thread inertia.
So how do we solve it? We use the lock() keyword. lock() takes a shared
variable and locks it for the rest of the scope. It is only an
advisory lock, though, so we need to find every place where $Curr_Test
is read or modified and expected not to change in between, and take the
lock there. ok() becomes:
sub ok {
    my($self, $test, $name) = @_;
    lock($Curr_Test);
    $Curr_Test++;
    $Test_Results[$Curr_Test-1] = 1 unless($test);
}
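To see the lock doing its job outside of Test::Builder, here is a small sketch with a hypothetical bump() function and an arbitrary choice of four threads doing a thousand increments each.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter : shared = 0;

sub bump {
    lock($counter);    # waits until no other thread holds the lock
    $counter++;        # the read-modify-write can no longer be interleaved
    return;            # the lock is released when this scope ends
}

my @workers = map { threads->create(sub { bump() for 1 .. 1000 }) } 1 .. 4;
$_->join() for @workers;

print "$counter\n";    # 4000
```

Every increment happens while the lock is held, so no update can be lost and the total is always 4000. Without the lock() line the result may come out lower, depending on how the threads happen to be scheduled.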
So are we ready? Well, lock() was only added in Perl 5.5, so we need to
add an else to the BEGIN block to define a lock function if we aren't
running with threads. The end result would be:
my @Test_Results = ();
my @Test_Details = ();
my $Curr_Test = 0;
BEGIN {
    if ($] >= 5.008 && exists($INC{'threads.pm'})) {
        require threads::shared;
        import threads::shared qw(share);
        # share() is not yet known when this block is compiled, so its
        # prototype does not apply here; pass references explicitly.
        share(\$Curr_Test);
        share(\@Test_Details);
        share(\@Test_Results);
    } else {
        # No threads: make lock() a harmless no-op.
        *lock = sub(*) {};
    }
}
sub ok {
    my($self, $test, $name) = @_;
    lock($Curr_Test);
    $Curr_Test++;
    $Test_Results[$Curr_Test-1] = 1 unless($test);
}
In fact, this is very much like the code that has been added to Test::Builder
to make it work nicely with threads. The only thing that differs is ok(), as
I cut it down to what was relevant. There were roughly 5 places where
lock() had to be added. Now the test code would print
1..3
ok 1
ok 2
ok 3
which is exactly what the end user would expect. All in all, this is a rather small change for this 1291-line module: we changed roughly 15 lines in a non-intrusive way, and the documentation and test case code make up most of the patch. The full patch is at http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-06/msg00816.html