Web Basics with LWP
by Sean M. Burke
Accessing HTTPS URLs
When you access an HTTPS URL, it'll work for you just like an HTTP URL would--if your LWP installation has HTTPS support (via an appropriate Secure Sockets Layer library). For example:
use LWP 5.64;
my $url = 'https://www.paypal.com/'; # Yes, HTTPS!
my $browser = LWP::UserAgent->new;
my $response = $browser->get($url);
die "Error at $url\n ", $response->status_line, "\n Aborting"
  unless $response->is_success;
print "Whee, it worked! I got that ",
  $response->content_type, " document!\n";
If your LWP installation doesn't have HTTPS support set up, then the response will be unsuccessful, and you'll get this error message:
Error at https://www.paypal.com/
501 Protocol scheme 'https' is not supported
Aborting at paypal.pl line 7. [or whatever program and line]
If your LWP installation does have HTTPS support installed, then the
response should be successful, and you should be able to consult
$response just like with any normal HTTP response.
For information about installing HTTPS support for your LWP installation, see the helpful README.SSL file that comes in the libwww-perl distribution.
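If you would rather find out before making any request, you can ask the user-agent object whether it has a handler for a given scheme. The following is only a minimal sketch: it assumes your LWP version provides the is_protocol_supported method on LWP::UserAgent, and scheme_supported is a helper name made up for this illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: returns 1 if this LWP installation has a
# protocol handler for the given URL scheme, 0 otherwise.
sub scheme_supported {
    my ($scheme) = @_;
    my $browser = LWP::UserAgent->new;
    return $browser->is_protocol_supported($scheme) ? 1 : 0;
}

# Report on the two schemes this article uses.
for my $scheme (qw(http https)) {
    printf "%-5s %s\n", $scheme,
        scheme_supported($scheme) ? "supported" : "NOT supported";
}
```

If https is reported as not supported, that is the same condition that produces the 501 error shown above.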
Getting Large Documents
When you're requesting a large (or at least potentially large) document,
a problem with the normal way of using the request methods (like
$response = $browser->get($url)) is that the response object has to
hold the whole document in memory. If the response is a 30-megabyte
file, that is likely to be quite an imposition on the process's
memory usage.
A notable alternative is to have LWP save the content to a file on disk, instead of saving it up in memory. This is the syntax to use (here $ua is an LWP::UserAgent object, just like $browser in the earlier examples):
$response = $ua->get($url,
':content_file' => $filespec,
);
For example,
$response = $ua->get('http://search.cpan.org/',
':content_file' => '/tmp/sco.html'
);
When you use this :content_file option, the $response will have
all the normal header lines, but $response->content will be
empty.
Note that this ":content_file" option isn't supported under older
versions of LWP, so you should consider adding use LWP 5.66; to check
the LWP version, if you think your program might run on systems with
older versions.
If you need to be compatible with older LWP versions, then use this syntax, which does the same thing:
use HTTP::Request::Common;
$response = $ua->request( GET($url), $filespec );
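To see the :content_file behavior without needing a network connection, you can point LWP at a local file through a file: URL. This is just a sketch under some assumptions: the temporary file stands in for a large remote document, the 100,000-byte size is arbitrary, and it relies on LWP's file: handler honoring :content_file the same way the HTTP handler does.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempdir);
use URI::file;

# Stand-in for a large remote document: a local file of known size.
my $dir    = tempdir( CLEANUP => 1 );
my $source = "$dir/source.txt";
open my $out, '>', $source or die "Can't write $source: $!";
print {$out} "x" x 100_000;
close $out or die "Can't close $source: $!";

my $ua       = LWP::UserAgent->new;
my $url      = URI::file->new_abs($source)->as_string;   # a file: URL
my $filespec = "$dir/copy.txt";

# Have LWP write the body straight to disk instead of holding it in memory.
my $response = $ua->get($url, ':content_file' => $filespec);
die "Error at $url\n ", $response->status_line, "\n Aborting"
  unless $response->is_success;

# The headers are available as usual, but the content is on disk, not in $response.
printf "Saved %d bytes to %s; in-memory content is %d bytes\n",
  (-s $filespec), $filespec, length $response->content;
```

With a real http: or https: URL, only the $url assignment needs to change.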
Resources
Remember, this article is just the most rudimentary introduction to LWP--to learn more about LWP and LWP-related tasks, you really must read the following:
LWP::Simple: Simple functions for getting, heading, and mirroring URLs.
LWP: Overview of the libwww-perl modules.
LWP::UserAgent: The class for objects that represent "virtual browsers."
HTTP::Response: The class for objects that represent the response to an LWP request, as in $response = $browser->get(...).
HTTP::Message and HTTP::Headers: Classes that provide more methods to HTTP::Response.
URI: Class for objects that represent absolute or relative URLs.
URI::Escape: Functions for URL-escaping and URL-unescaping strings (like turning "this & that" to and from "this%20%26%20that").
HTML::Entities: Functions for HTML-escaping and HTML-unescaping strings (like turning "C. & E. Brontë" to and from "C. &amp; E. Bront&euml;").
HTML::TokeParser and HTML::TreeBuilder: Classes for parsing HTML.
HTML::LinkExtor: Class for finding links in HTML documents.
And last but not least, my book Perl & LWP.
Copyright ©2002, Sean M. Burke. You can redistribute this document and/or modify it, but only under the same terms as Perl itself.