Beginner's Introduction to Perl 5.10, Part 2
by chromatic, Doug Sheppard

More fun with strings

You'll often want to manipulate strings: break them into smaller pieces, put them together, and change their contents. Perl offers three functions that make string manipulation easy and fun: substr(), split(), and join().

If you want to retrieve part of a string (say, the first four characters or a 10-character chunk from the middle), use the substr() function. It takes either two or three parameters: the string you want to look at, the character position to start at (the first character is position 0) and the number of characters to retrieve. If you leave out the number of characters, you'll retrieve everything up to the end of the string.

my $greeting = "Welcome to Perl!\n";

print substr($greeting, 0, 7);     # "Welcome"
print substr($greeting, 7);        # " to Perl!\n"

A neat and often-overlooked thing about substr() is that you can use a negative character position. This will retrieve a substring that begins that many characters from the end of the string.

my $greeting = "Welcome to Perl!\n";

print substr($greeting, -6, 4);      # "Perl"

(Remember that inside double quotes, \n represents the single new-line character.)
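Here is a small sketch that puts the two-argument, three-argument, and negative-position forms side by side on the same string. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $greeting = "Welcome to Perl!\n";

# Two-argument form: everything from position 11 to the end of the string.
my $tail = substr($greeting, 11);           # "Perl!\n"

# Three-argument form: four characters starting at position 11.
my $word = substr($greeting, 11, 4);        # "Perl"

# Negative position: start six characters back from the end.
# The trailing newline counts as one of those six characters.
my $same_word = substr($greeting, -6, 4);   # "Perl"

print "$word\n" if $word eq $same_word;
```

Counting from the end is handy when you know how a string finishes but not how long it is.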

You can also manipulate the string by using substr() to assign a new value to part of it. One useful trick is using a length of zero to insert characters into a string:

my $greeting = "Welcome to Java!\n";

substr($greeting, 11, 4) = 'Perl';    # $greeting is now "Welcome to Perl!\n";
substr($greeting, 7, 3)  = '';        #       ... "Welcome Perl!\n";
substr($greeting, 0, 0)  = 'Hello. '; #       ... "Hello. Welcome Perl!\n";
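Perl's substr() also accepts the replacement text as a fourth argument, a form the examples above don't use. The sketch below shows it as an alternative spelling of the same edits; treat it as an illustration rather than the recommended style.

```perl
use strict;
use warnings;

my $greeting = "Welcome to Java!\n";

# Four-argument substr(): replace 4 characters at position 11 in place.
# The return value is the text that was there before the replacement.
my $old = substr($greeting, 11, 4, 'Perl');   # $old is "Java"

# A length of zero replaces nothing, so this is a pure insertion.
substr($greeting, 0, 0, 'Hello. ');

print $greeting;   # Hello. Welcome to Perl!
```

The assignment form reads more naturally in most code; the four-argument form is useful when you also want the text you replaced.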

split() breaks apart a string and returns a list of the pieces. split() generally takes two parameters: a regular expression to split the string with and the string you want to split. (The next article will discuss regular expressions in more detail; for the moment, all you need to know is that this regular expression represents a single space character: / /.) The characters you split won't show up in any of the list elements.

my $greeting = "Hello. Welcome Perl!\n";
my @words    = split(/ /, $greeting);   # Three items: "Hello.", "Welcome", "Perl!\n"

You can also specify a third parameter: the maximum number of items to put in your list. The splitting will stop as soon as your list contains that many items:

my $greeting = "Hello. Welcome Perl!\n";
my @words    = split(/ /, $greeting, 2);   # Two items: "Hello.", "Welcome Perl!\n";
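The limit is most useful when the last field may itself contain the separator. As a sketch, assume a made-up log line with a date, a level, and a free-form message (the format is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical log line: a date, a level, then a free-form message.
my $line = "2008-12-01 ERROR Disk is almost full";

# Without a limit, the message is broken into separate words as well.
my @all_words = split(/ /, $line);    # six items

# With a limit of 3, the third item keeps the rest of the line intact.
my ($date, $level, $message) = split(/ /, $line, 3);

print "$level on $date: $message\n";
```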

Of course, what you can split, you can also join(). The join() function takes a list of strings and attaches them together with a specified string between each element, which may be an empty string:

my @words         = ("Hello.", "Welcome", "Perl!\n");
my $greeting      = join(' ', @words);       # "Hello. Welcome Perl!\n";
my $andy_greeting = join(' and ', @words);   # "Hello. and Welcome and Perl!\n";
my $jam_greeting  = join('', @words);        # "Hello.WelcomePerl!\n";
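Because join() undoes split(), splitting a string on a single space and joining the pieces with a single space should give you back the string you started with. A quick sketch to check that, using the same greeting:

```perl
use strict;
use warnings;

my $original = "Hello. Welcome Perl!\n";

my @words   = split(/ /, $original);
my $rebuilt = join(' ', @words);    # same string as $original
my $dashed  = join('-', @words);    # "Hello.-Welcome-Perl!\n"

print "Round trip worked\n" if $rebuilt eq $original;
print $dashed;
```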

Filehandles

That's enough about strings. It's time to consider files -- after all, what good is string manipulation if you can't do it where it counts?

To read from or write to a file, you have to open it. When you open a file, Perl asks the operating system if the file is accessible -- does the file exist if you're trying to read it (or can it be created if you're trying to create a new file), and do you have the necessary file permissions to do what you want? If you're allowed to use the file, the operating system will prepare it for you, and Perl will give you a filehandle.

Ask Perl to create a filehandle for you by using the open() function, which takes two or three arguments: the filehandle you want to create, the mode of the file, and the file you want to work with. First, we'll concentrate on reading files. If you leave out the mode, Perl opens the file for reading. The following statement opens the file log.txt using the filehandle $logfile:

open my $logfile, 'log.txt';

Opening a file involves several behind-the-scenes tasks that Perl and the operating system undertake together, such as checking that the file you want to open actually exists (or creating it if you're trying to create a new file) and making sure you're allowed to manipulate the file (whether you have the necessary file permissions, for instance). Perl will do all of this for you, so in general you don't need to worry about it.

Once you've opened a file to read, you can retrieve lines from it by using the <> construct, also known as readline. Inside the angle brackets, place your filehandle. What you get from this depends on what you want to get: in a scalar context (a more technical way of saying "if you're assigning it to a scalar"), you retrieve the next line from the file, but if you're looking for a list, you get a list of all the remaining lines in the file.
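To see the difference between the two contexts without needing a real file on disk, the sketch below reads from a string through an in-memory filehandle. That is a convenience assumed here for illustration; with a real file you would pass a filename instead of \$contents.

```perl
use strict;
use warnings;

# Assumption for illustration: read from a string instead of a real file.
my $contents = "Title line\nsecond line\nthird line\n";
open my $fh, '<', \$contents or die "Couldn't open in-memory file: $!";

my $first = <$fh>;    # scalar context: just the next line
my @rest  = <$fh>;    # list context: every remaining line

close $fh;

print "First: $first";
print "Remaining lines: ", scalar(@rest), "\n";
```

Each line keeps its trailing newline, which is why the first print needs no "\n" of its own.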

You can, of course, close a filehandle that you've opened. You don't always have to do this, because Perl is clever enough to close a filehandle when your program ends, when you try to reuse an existing filehandle, or when the lexical variable containing the filehandle goes out of scope.

Here's a simple program that will display the contents of the file log.txt, and assumes that the first line of the file is its title:

open my $logfile, 'log.txt' or die "I couldn't get at log.txt: $!";

my $title = <$logfile>;
print "Report Title: $title";

print while <$logfile>;
close $logfile;

That code may seem pretty dense, but it combines ideas you've seen before. The while operator loops over every line of the file, one line at a time, putting each line into the Perl pronoun $_. (A pronoun? Yes -- think of it as it.) For each line read, Perl prints the line. Now the pronoun should make sense. While you read it from the file, print it.

Why not use say? Each line in the file ends with a newline -- that's how Perl knows that it's a line. There's no need to add an additional newline, so say would double-space the output.
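If you do want to use say, one option is to strip the newline from each line first with Perl's built-in chomp, which this section doesn't otherwise cover. The sketch below again assumes an in-memory filehandle so it runs without a log file.

```perl
use 5.010;
use strict;
use warnings;

# Assumption for illustration: read from a string instead of log.txt.
my $contents = "first\nsecond\n";
open my $fh, '<', \$contents or die "Couldn't open in-memory file: $!";

my @lines;
while (my $line = <$fh>) {
    chomp $line;            # remove the trailing newline, if any
    push @lines, $line;     # keep the bare text for later use
    say $line;              # say puts exactly one newline back
}

close $fh;
```

This version names the line variable explicitly instead of relying on $_, which can be easier to follow in longer loops.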

Writing files

You also use open() when you are writing to a file. There are two ways to open a file for writing: overwrite and append. When you open a file in overwrite mode, you erase whatever it previously contained. In append mode, you attach your new data to the end of the existing file without erasing anything that was already there.

To indicate that you want a filehandle for writing, use a single > character as the mode passed to open. This opens the file in overwrite mode. To open it in append mode, use two > characters.

open my $overwrite, '>', 'overwrite.txt' or die "error trying to overwrite: $!";
# Wave goodbye to the original contents.

open my $append, '>>', 'append.txt' or die "error trying to append: $!";
# Original contents still there; add to the end of the file

Once your filehandle is open, use the humble print or say operator to write to it. Specify the filehandle you want to write to and a list of values you want to write:

use 5.010;

say $overwrite 'This is the new content';
print $append "We're adding to the end here.\n", "And here too.\n";
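To watch overwrite and append behave differently, here is a sketch that writes to a scratch file three times and then reads it back. The temporary directory comes from the core File::Temp module, and the file name demo.txt is made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory that Perl removes when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";

open my $out, '>', $file or die "Can't write '$file': $!";
print $out "first version\n";
close $out;

# Overwrite mode again: the first version is erased.
open $out, '>', $file or die "Can't overwrite '$file': $!";
print $out "second version\n";
close $out;

# Append mode: the existing contents stay, new text goes at the end.
open $out, '>>', $file or die "Can't append to '$file': $!";
print $out "appended line\n";
close $out;

open my $in, '<', $file or die "Can't read '$file': $!";
my @lines = <$in>;
close $in;

print @lines;    # "second version" followed by "appended line"
```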

Live free or die!

Most of these open() statements include or die "some sort of message". This is because we live in an imperfect world, where programs don't always behave exactly the way we want them to. It's always possible for an open() call to fail; maybe you're trying to write to a file that you're not allowed to write, or you're trying to read from a file that doesn't exist. In Perl, you can guard against these problems by using the or and and operators.

A series of statements separated by or will continue until you hit one that works, that is, one that returns a true value. This line of code will either succeed at opening $output in overwrite mode, or cause Perl to quit:

open my $output, '>', $outfile or die "Can't write to '$outfile': $!";

The die statement ends your program with an error message. The special variable $! contains Perl's explanation of the error. In this case, you might see something like this if you're not allowed to write to the file. Note that you get both the actual error message ("Permission denied") and the line where it happened:

Can't write to 'a2-die.txt': Permission denied at ./a2-die.pl line 1.

Defensive programming like this is useful for making your programs more error-resistant -- you don't want to write to a file that you haven't successfully opened! (Putting single-quotes around the filename may help you see any unexpected whitespace in the filename. You'll slap your forehead when it happens to you.)

Here's an example: As part of your job, you write a program that records its results in a file called vitalreport.txt. You use the following code:

open my $vital, '>', 'vitalreport.txt';

If this open() call fails (for instance, vitalreport.txt is owned by another user who hasn't given you write permission), you'll never know it until someone looks at the file afterward and wonders why the vital report wasn't written. (Just imagine the joy if that "someone" is your boss, the day before your annual performance review.) When you use or die, you avoid all this:

open my $vital, '>', 'vitalreport.txt' or die "Can't write vital report: $!";

Instead of wondering whether your program wrote your vital report, you'll immediately have an error message that tells you both what went wrong and on what line of your program the error occurred.
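The reason or die works is that open() returns a false value when it fails and leaves the explanation in $!. The sketch below forces a failure by pointing at a directory that doesn't exist (the path is made up for the example) and inspects both values instead of dying, so you can see what or die would have reported.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A path inside a directory that does not exist, so open() must fail.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-dir/vitalreport.txt";

my $ok     = open(my $vital, '>', $missing);
my $reason = "$!";    # copy the message right away, before anything resets it

if ($ok) {
    print "Opened '$missing' for writing\n";
    close $vital;
}
else {
    print "Can't write vital report: $reason\n";
}
```

The exact wording of $reason depends on your operating system, so the sketch doesn't show an expected message.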

You can use or for more than just testing file operations:

use 5.010;

my $pie = 'pecan';   # example value; try 'apple' to see the chain stop early
($pie eq 'apple') or ($pie eq 'cherry') or ($pie eq 'blueberry')
        or say 'But I wanted apple, cherry, or blueberry!';

In this sequence, if you have an appropriate pie, Perl skips the rest of the chain. Once one statement works, the rest are ignored. The and operator does the opposite: It evaluates your chain of statements, but stops when one of them doesn't work.
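You can watch the chain stop early by recording which comparisons actually run. In this sketch, is_pie() is a hypothetical helper written only for the demonstration; it notes each flavor it was asked about before comparing.

```perl
use 5.010;
use strict;
use warnings;

my @checked;

# Hypothetical helper: records which flavors were actually tested.
sub is_pie {
    my ($pie, $flavor) = @_;
    push @checked, $flavor;
    return $pie eq $flavor;
}

my $pie = 'cherry';

is_pie($pie, 'apple') or is_pie($pie, 'cherry') or is_pie($pie, 'blueberry')
    or say 'But I wanted apple, cherry, or blueberry!';

say "Checked: @checked";    # Checked: apple cherry
```

The blueberry test never runs and the complaint is never printed, because the cherry test already returned a true value.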

open my $log, 'log.file' and say 'Logfile is open!';

# The same test, written with a trailing if:
say 'Logfile is open!' if open my $log, 'log.file';

Each of these statements will only show you the words Logfile is open! if the open() succeeds -- do you see why?

Again, just because there's more than one way to execute code conditionally doesn't mean you have to use every way in a single program, or pick the most clever or creative one. You have plenty of options. Consider using the most readable one for the situation.
