Advanced Subroutine Techniques
by Rob Kinyon
Validation
Argument validation is more difficult in Perl than in other languages. In C or Java, for instance, every variable has a type associated with it. This includes subroutine declarations, meaning that trying to pass the wrong type of variable to a subroutine gives a compile-time error. By contrast, because Perl flattens all arguments into a single list, there is no compile-time checking at all. (Well, there kinda is with prototypes.)
This has been such a problem that dozens of modules on CPAN exist to address it. The most commonly recommended one is Params::Validate.
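To make the problem concrete, here is a minimal sketch of hand-rolled runtime validation using only core Perl. The subroutine make_rectangle and its width and height parameters are hypothetical, invented for this illustration. This is not how Params::Validate works internally; it only shows the kind of boilerplate such modules are meant to replace.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical example: every check happens at runtime, by hand.
sub make_rectangle {
    my %args = @_;

    for my $required (qw(width height)) {
        croak "Missing required parameter '$required'"
            unless exists $args{$required};
        croak "Parameter '$required' must be a positive number"
            unless defined $args{$required}
                && !ref $args{$required}
                && $args{$required} =~ /\A\d+(?:\.\d+)?\z/
                && $args{$required} > 0;
    }

    return { width => $args{width}, height => $args{height} };
}

my $rect = make_rectangle( width => 3, height => 4 );
print "Area: ", $rect->{width} * $rect->{height}, "\n";

# A bad call is caught only when this line actually runs.
my $ok = eval { make_rectangle( width => 'wide', height => 4 ); 1 };
my $error = $@;
print "Caught: $error" unless $ok;
```

Nothing here is checked until the call executes, which is exactly the gap that compile-time type checking closes in C or Java.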
Prototypes
Prototypes in Perl are a way of letting Perl know exactly what to expect for a given subroutine, at compile time. If you've ever tried to pass an array to the vec() built-in and you saw Not enough arguments for vec, you've hit a prototype.
For the most part, prototypes are more trouble than they're worth. For one thing, Perl doesn't check prototypes for methods because that would require the ability to determine, at compile time, which class will handle the method. Because you can alter @ISA at runtime, that isn't possible. The main reason, however, is that prototypes aren't very smart. If you specify sub foo ($$$), you cannot pass it an array of three scalars (this is the problem with vec()): each $ imposes scalar context, so the array is evaluated as its element count and fills only the first slot. Instead, you have to say foo( $x[0], $x[1], $x[2] ), and that's just a pain.
Prototypes can be very useful for one reason--the ability to pass a block as the first argument, without the sub keyword. Test::Exception uses this to excellent advantage. A simple illustration:
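The array problem is easy to reproduce. The sketch below uses a hypothetical sum3 subroutine (not from the article) and wraps the failing call in a string eval, only so that the compile-time error can be captured and the rest of the script can keep running.

```perl
use strict;
use warnings;

# Hypothetical subroutine with a three-scalar prototype.
sub sum3 ($$$) { return $_[0] + $_[1] + $_[2] }

my @x = ( 1, 2, 3 );

# The prototype puts @x in scalar context, so only one argument is seen.
# The string eval defers compilation of the bad call until runtime.
my $compiled = eval q{ sum3(@x); 1 };
my $error    = $@;
print "Rejected at compile time: $error" unless $compiled;

# Spelling out each element satisfies the prototype.
my $explicit = sum3( $x[0], $x[1], $x[2] );
print "Explicit elements: $explicit\n";

# Calling with a leading & bypasses the prototype entirely.
my $bypassed = &sum3(@x);
print "Prototype bypassed: $bypassed\n";
```

The & form works, but it also throws away whatever checking the prototype was supposed to provide.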
sub do_this_to (&;$) {
    my ($action, $name) = @_;
    $action->( $name );
}
do_this_to { print "Hello, $_[0]\n" } 'World';
do_this_to { print "Goodbye, $_[0]\n" } 'cruel world!';
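The same idea extends to subroutines that look like grep or map. As a rough sketch, the hypothetical first_match below (a name made up for this example) uses the (&@) prototype to take a block followed by a list. In real code, first from the core List::Util module already does this job.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first item for which the block is true.
# The (&@) prototype lets callers write the block without "sub" or a comma,
# just as they would with grep or map.
sub first_match (&@) {
    my ( $test, @items ) = @_;
    for (@items) {
        # The block reads the current item from $_, as a grep block would.
        return $_ if $test->();
    }
    return;
}

my $first_even = first_match { $_ % 2 == 0 } 3, 5, 8, 9;
print "First even number: $first_even\n";
```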
Context Awareness
Using the wantarray built-in, a subroutine can determine its calling context. Context for subroutines, in Perl, is one of three things--list, scalar, or void. List context means that the return value will be used as a list, scalar context means that the return value will be used as a scalar, and void context means that the return value won't be used at all.
sub check_context {
    # True
    if ( wantarray ) {
        print "List context\n";
    }
    # False, but defined
    elsif ( defined wantarray ) {
        print "Scalar context\n";
    }
    # False and undefined
    else {
        print "Void context\n";
    }
}
my @x = check_context(); # prints 'List context'
my %x = check_context(); # prints 'List context'
my ($x, $y) = check_context(); # prints 'List context'
my $z = check_context(); # prints 'Scalar context'
check_context(); # prints 'Void context'
For CPAN modules that implement or augment context awareness, look at Contextual::Return, Sub::Context, and Return::Value.
Note: you can misuse context awareness heavily by having the subroutine do something completely different when called in scalar versus list context. Don't do that. A subroutine should be a single, easily identifiable unit of work. Not everyone understands all of the different permutations of context, including your standard Perl expert.
Instead, I recommend having a standard return value, except in void context. If your return value is expensive to calculate and is calculated only for the purposes of returning it, then knowing if you're in void context may be very helpful. This can be a premature optimization, however, so always measure (benchmarking and profiling) before and after to make sure you're optimizing what needs optimizing.
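As a small sketch of that recommendation, the hypothetical add_scores below (its name and data are assumptions made for this example) always does its real work, but builds its summary string only when the caller will actually use it. The $summaries_built counter exists purely to make the skipped work visible.

```perl
use strict;
use warnings;

my $summaries_built = 0;

# Hypothetical subroutine: updates running totals and returns a summary.
sub add_scores {
    my ( $totals, %scores ) = @_;
    $totals->{$_} += $scores{$_} for keys %scores;

    # Void context: nobody will look at the return value, so skip it.
    return unless defined wantarray;

    $summaries_built++;
    return join ', ', map { "$_=$totals->{$_}" } sort keys %$totals;
}

my %totals;
add_scores( \%totals, alice => 3, bob => 5 );         # void context
my $summary = add_scores( \%totals, alice => 2 );     # scalar context

print "$summary\n";
print "Summaries built: $summaries_built\n";
```

The return value is the same kind of thing in scalar and list context; only void context is treated specially.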
Mimicking Perl's Internal Functions
A lot of Perl's internal functions modify their arguments and/or use $_ or @_ as a default if no parameters are provided. A perfect example of this is chomp(). Here's a version of chomp() that illustrates some of these techniques:
sub my_chomp {
    # This is a special case in the chomp documentation
    return if ref($/);

    # Match $/ literally, and only once, at the very end of the string
    my $eol = quotemeta $/;

    # If a return value is expected ...
    if ( defined wantarray ) {
        my $count = 0;
        if ( @_ ) { $count += s!$eol\z!! for @_ }
        else      { $count += s!$eol\z!! }
        return $count;
    }
    # Otherwise, don't bother counting
    else {
        @_ ? do { s!$eol\z!! for @_ } : s!$eol\z!!;
        return;
    }
}
- Use return; instead of return undef; if you want to return nothing. If someone assigns the return value to an array, the latter creates an array of one value (undef), which evaluates to true. The former will correctly handle all contexts.
- If you want to modify $_ if no parameters are given, you have to check @_ explicitly. You cannot do something like @_ = ($_) unless @_; because the assignment stores a copy and $_ will lose its magic.
- This doesn't calculate $count unless $count is useful (using a check for void context).
- The key is the aliasing of @_. If you modify the elements of @_ directly (as opposed to assigning the values in @_ to variables), then you modify the actual parameters passed in.
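These points can be checked with a short script. This is only a sketch: the subroutine names (bare_return, undef_return, double_in_place, strip_x_broken) are hypothetical and exist just to isolate each behavior.

```perl
use strict;
use warnings;

# 1. "return;" versus "return undef;" in list context.
sub bare_return  { return }
sub undef_return { return undef }

my @bare  = bare_return();     # empty list
my @undef = undef_return();    # one element: undef
print "bare_return gives ",  scalar(@bare),  " element(s)\n";
print "undef_return gives ", scalar(@undef), " element(s)\n";

# 2. The elements of @_ are aliases to the caller's variables.
sub double_in_place { $_ *= 2 for @_; return }

my ( $x, $y ) = ( 1, 2 );
double_in_place( $x, $y );
print "x=$x y=$y\n";

# 3. Assigning a new list to @_ stores copies, so the link to $_ is lost.
sub strip_x_broken {
    @_ = ($_) unless @_;
    s/x//g for @_;    # edits the copy, not the caller's $_
    return;
}

$_ = 'xaxbx';
strip_x_broken();
print "After strip_x_broken: $_\n";
```

The third case is why my_chomp tests @_ and falls back to operating on $_ directly, rather than copying $_ into @_.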
Conclusion
I hope I have introduced you to a few more tools for your toolbox. The art of writing a good subroutine is very complex, and each technique I have presented is just one tool. Just as a master woodworker wouldn't use a drill for every project, a master programmer doesn't make every subroutine use named arguments or mimic a built-in. You must evaluate each technique every time to see if it will make the code more maintainable. Overusing these techniques will make your code less maintainable. Using them appropriately will make your life easier.
Showing messages 1 through 5 of 5.
- Tiny problem with my_chomp....
2007-05-10 18:13:14 YuvalYaari
(Sorry for being nitpicky... :))
chomp would only remove $/ once every time:
$ perl -le '$/="a";$_ = "aaa"; print while chomp'
aaa
aa
a
And would only match at the *end* of the string:
$ perl -le '$/="a";$_ = "aaaaaab";chomp;print'
aaaaaab
(Whereas my_chomp would "chomp" it into "b").
So I guess the regex should be s!$/\Z!!
Also my `perl`s (5.8.8 & 5.9.5) didn't like:
$count += (@_ ? (s!$/!!g for @_) : s!$/!!g);
Using map made perl happy again:
$count += (@_ ? (map {s!$/\Z!!} @_) : s!$/\Z!!g);
- Best Practices says not to use prototypes
2006-02-26 13:27:54 the_newt
Chromatic, Damian's book "Perl Best Practices" says not to use prototypes ever for subroutines, if I remember correctly. I don't have the book in front of me. I understand that everyone has their own ideas and everyone has their own way of doing things in Perl. After all, TMTOWTDI, right? Why do you advocate the use of prototypes? Besides the obvious, that it's used for making sure you put in the correct number of parameters for that subroutine.
- Best Practices says not to use prototypes
2006-02-26 13:59:35 chromatic1
The only place I use them is to pass blocks as anonymous subroutines. That's consistent with the use Rob shows in this article. (I believe Damian said something similar about making your own code look like Perl's builtins, such as map and grep.)
- bug in sample perl code
2006-02-24 13:06:39 DavidDyck
The code in the text does not compile, as the
prototype indicates one argument, but the example
uses 2 arguments
sub do_this_to (&) {
    my ($action, $name) = @_;
    $action->( $name );
}
do_this_to { print "Hello, $_[0]\n" } 'World';
do_this_to { print "Goodbye, $_[0]\n" } 'cruel world!';
The errors I get are:
Too many arguments for main::do_this_to at arg.pl line 8, near "'World';"
Too many arguments for main::do_this_to at arg.pl line 9, near "'cruel world!';"
Change the argument as follows to get it to compile.
2c2
< sub do_this_to (&) {
---
> sub do_this_to (&;$) {
- bug in sample perl code
2006-02-24 13:10:51 chromatic1
Thanks, good catch. This was a formatting error and I've now corrected it.



