Using Amazon S3 from Perl
by Abel Lin
Writing an Object
Writing an object is simply a matter of using the HTTP PUT method. Be aware that there is nothing to prevent you from overwriting an existing object; Amazon S3 will automatically update the object with the more recent write request. Also, it's currently not possible to append to or otherwise modify an object in place without replacing it.
my %headers = (
'Content-Type' => 'text/plain'
);
$response = $conn->put( $BUCKET, $KEY, S3Object->new("this is a test"),
\%headers);
Likewise, you can read a file from STDIN:
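my %headers;
my $data = '';
FILE: while (1) {
    # sysread returns undef on error and 0 at end of file;
    # the offset argument appends each chunk to $data.
    my $n = sysread(STDIN, $data, 1024 * 1024, length($data));
    if (!defined $n) {
        print STDERR "Error reading input: $!\n";
        exit 1;
    }
    last FILE if $n == 0;
}
$response = $conn->put($BUCKET, $KEY, $data, \%headers);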
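The STDIN example sends an empty header hash, so the object is stored without an explicit Content-Type. As a minimal sketch, assuming you want to guess the type from the key's file extension, a small helper could build the header hash for you. The function name headers_for_key and the lookup table are hypothetical and not part of the Amazon library.

```perl
use strict;
use warnings;

# Hypothetical lookup table; extend it with the types you actually serve.
my %TYPE_FOR = (
    txt  => 'text/plain',
    html => 'text/html',
    jpg  => 'image/jpeg',
    jpeg => 'image/jpeg',
    png  => 'image/png',
    gif  => 'image/gif',
);

# Build the header hash for a PUT, guessing Content-Type from the key's
# file extension and falling back to a generic binary type.
sub headers_for_key {
    my ($key) = @_;
    my ($ext) = $key =~ /\.([^.\/]+)\z/;
    my $type = defined $ext ? $TYPE_FOR{ lc $ext } : undef;
    return { 'Content-Type' => $type // 'application/octet-stream' };
}

my $headers = headers_for_key('notes/readme.txt');
print "$headers->{'Content-Type'}\n";
```

The returned hash reference can be passed as the last argument to $conn->put in place of \%headers.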
To add custom metadata, simply add to the S3Object:
S3Object->new("this is a test", { name => "attribute" })
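On the wire, S3 carries user-defined metadata as HTTP headers whose names begin with x-amz-meta-. The following is a rough sketch of that mapping, not the library's actual code. The function name merge_metadata is hypothetical, and lowercasing the metadata names is a choice made here for consistency.

```perl
use strict;
use warnings;

# Prefix S3 uses for user-defined metadata headers.
my $METADATA_PREFIX = 'x-amz-meta-';

# Merge user metadata into a copy of the request headers, adding the
# x-amz-meta- prefix to each (lowercased) metadata name.
sub merge_metadata {
    my ($headers, $metadata) = @_;
    my %merged = %{ $headers || {} };
    for my $name (keys %{ $metadata || {} }) {
        $merged{ $METADATA_PREFIX . lc $name } = $metadata->{$name};
    }
    return \%merged;
}

my $merged = merge_metadata(
    { 'Content-Type' => 'text/plain' },
    { name => 'attribute' },
);
print "$_: $merged->{$_}\n" for sort keys %$merged;
```

With the metadata from the example above, the request would carry an x-amz-meta-name header with the value attribute alongside Content-Type.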
By default, every object has private access control when written. This allows only the user that stored the object to read it back. You can change these settings. Also, note that each object can hold a maximum of 5 GB of data.
You are probably wondering if it is also possible to upload via a standard HTTP POST. The folks at Amazon are working on it as we speak -- see HTTP POST beta discussion for more information. Until that's finished, you'll have to perform web-based uploads via an intermediate server.
Reading an Object
As with writing objects, there are several ways to read data from Amazon S3. One way is to generate a temporary URL to use with your favorite client (for example, wget or curl) or even a browser to view or retrieve the object. All you have to do is generate the URL used to make the REST call.
my $generator = S3::QueryStringAuthGenerator->new($AWS_ACCESS_KEY_ID,
$AWS_SECRET_ACCESS_KEY);
...and then perform a simple HTTP GET request. This is a great trick if all you want to do is temporarily view or verify the data.
$generator->expires_in(60);
my $url = $generator->get($BUCKET, "$KEY");
print "$url \n";
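To show roughly what goes into such a temporary URL, here is a sketch of the legacy (Signature Version 2) query string authentication scheme using only core Perl modules. It is not the generator's actual code. The function names, the s3.amazonaws.com path-style host, and the placeholder credentials are assumptions, and the key is assumed to contain only URL-safe characters. Newer S3 regions require Signature Version 4, which this sketch does not cover.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha1);
use MIME::Base64 qw(encode_base64);

# Percent-encode everything except RFC 3986 unreserved characters.
sub uri_escape_value {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build a time-limited GET URL. $expires_at is an absolute Unix timestamp.
sub signed_get_url {
    my ($access_key, $secret_key, $bucket, $key, $expires_at) = @_;
    my $resource = "/$bucket/$key";
    # Verb, Content-MD5 (empty), Content-Type (empty), Expires, resource.
    my $string_to_sign = "GET\n\n\n$expires_at\n$resource";
    # HMAC-SHA1 of the string, keyed with the secret, then Base64-encoded.
    my $signature = encode_base64(hmac_sha1($string_to_sign, $secret_key), '');
    return "http://s3.amazonaws.com$resource"
        . '?AWSAccessKeyId=' . uri_escape_value($access_key)
        . "&Expires=$expires_at"
        . '&Signature=' . uri_escape_value($signature);
}

# Placeholder credentials; the URL expires 60 seconds from now.
print signed_get_url('EXAMPLEKEYID', 'example-secret', 'foobar', 'test.txt',
    time + 60), "\n";
```

Because the expiry time is part of the signed string, changing the Expires parameter in the URL invalidates the signature.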
You can also programmatically read the data directly from the initial connection. This is handy if you have to perform additional processing of the data.
my $response = $conn->get("$BUCKET", "$KEY");
my $data = $response->object->data;
Another cool feature is the ability to use BitTorrent to download files from Amazon S3. You can access any object that has anonymous access privileges via BitTorrent.
Delete an Object
By now you probably have the hang of the process. If you're going to create objects, you're probably going to have to delete them at some point.
$conn->delete("$BUCKET", "$KEY");
Set Access Permissions and Publish to a Website
As you may have noticed from the previous examples, all Amazon S3 object access goes through HTTP. This makes Amazon S3 particularly useful as an online repository, especially for managing and serving website media. You could almost imagine Amazon S3 serving as a mini Content Delivery Network for media on your website. This example demonstrates how to build a very simple web page whose images are served directly from Amazon S3.
The first thing to do is to upload some images and set the ACL permissions to public. I've modified the previous example with one difference. To make objects publicly readable, include the header x-amz-acl: public-read with the HTTP PUT request.
my %headers = (
'x-amz-acl' => 'public-read',
);
The available ACL settings are:
- private (default setting if left blank)
- public-read
- public-read-write
- authenticated-read
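Since the ACL is passed as a plain string, a typo is easy to make. As a small sketch, assuming you want to catch an unknown value before the request is sent, a helper can validate the setting against the four names listed above. The function name with_acl is hypothetical and not part of the Amazon library.

```perl
use strict;
use warnings;

# The ACL names listed above.
my %CANNED_ACL = map { $_ => 1 }
    qw(private public-read public-read-write authenticated-read);

# Return a copy of the headers with x-amz-acl set, or die on an unknown
# value so a typo is caught before the request is sent.
sub with_acl {
    my ($headers, $acl) = @_;
    die "Unknown canned ACL: $acl\n" unless $CANNED_ACL{$acl};
    return +{ %{ $headers || {} }, 'x-amz-acl' => $acl };
}

my $headers = with_acl({ 'Content-Type' => 'image/jpeg' }, 'public-read');
print "$headers->{'x-amz-acl'}\n";
```

The returned hash reference can be passed as the header argument of the PUT call.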
Now you know enough to put together a small script that will automatically display all images in the bucket to a web page (you'll probably want to spruce up the formatting).
...
my $BUCKET = "foobar";
my $images = "";
# List every object in the bucket and build an <img> tag for each one.
my $response = $conn->list_bucket($BUCKET);
for my $entry (@{$response->entries}) {
    my $public_url = $generator->get($BUCKET, $entry->{Key});
    # Drop the query string (the signature); public objects don't need it.
    my ($url) = split /\?/, $public_url;
    $images .= "<img src=\"$url\"><br />";
}
my $webpage = <<"WEBPAGE";
<html><body>$images</body></html>
WEBPAGE
print $q->header();
print $webpage;
To add images to this web page, upload more files into the bucket and they will automatically appear the next time you load the page.
It's also simple to link to media one at a time for a web page. Public Amazon S3 URLs have the basic form http://bucketname.s3.amazonaws.com/objectname. Also note that the namespace for buckets is shared among all Amazon S3 users, so bucket names must be globally unique. You may have already picked up on this.
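For linking to a single object, the URL can be assembled by hand. The sketch below assumes the bucketname.s3.amazonaws.com hostname form, a publicly readable object, and keys that are byte strings. The function names are hypothetical.

```perl
use strict;
use warnings;

# Percent-encode a key for use in a URL path, leaving "/" intact.
sub escape_key {
    my ($key) = @_;
    $key =~ s/([^A-Za-z0-9\-._~\/])/sprintf('%%%02X', ord $1)/ge;
    return $key;
}

# Public URL for an object in the given bucket.
sub public_url {
    my ($bucket, $key) = @_;
    return "http://$bucket.s3.amazonaws.com/" . escape_key($key);
}

# A single <img> tag for embedding one object in a page.
sub img_tag {
    my ($bucket, $key) = @_;
    return sprintf '<img src="%s" />', public_url($bucket, $key);
}

print img_tag('foobar', 'photos/summer trip.jpg'), "\n";
```

Percent-encoding the key also keeps characters such as quotes and ampersands out of the generated HTML attribute.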
Conclusion
Amazon S3 is a great tool that can help with the data management needs of organizations of all sizes by offering cheap and unlimited storage. For personal use, it's a great tool for backups (also good for organizations) and general file storage. It's also a great tool for collaboration. Instead of emailing files around, just upload a file and set the proper access controls -- no more dealing with 10 MB attachment restrictions!
At SundayMorningRides.com we use S3 as part of our web serving infrastructure to reduce the load on our hardware when serving media content.
When combined with other Amazon Web Services such as SimpleDB (for structured data queries) and Elastic Compute Cloud (for data processing), it's easy to envision a low-cost solution for web-scale computing and data management.
More Resources and References
- Amazon S3 Homepage
- Amazon Webservices Developer Connection
- Amazon S3 Library for REST in Perl
- Amazon Web Services Blog
- Would security of data be important?
2008-04-10 14:20:38 sigzero
I just don't have a good feeling about keeping data on a distributed network not under my control.
- Would security of data be important?
2008-04-24 15:54:39 a1lin1
Here's an
- S3 Modules
2008-04-10 01:24:56 DaveCross
Thanks Abel, interesting article.
You seem to be using a couple of modules that I can't find on CPAN (S3::AWSAuthConnection and S3::QueryStringAuthGenerator). If these are supplied by Amazon then it would be great if they could be uploaded to CPAN - as modules that aren't on CPAN tend to be seen as "second class citizens" by many Perl programmers.
There are already a couple of modules on CPAN for working with S3 (Net::Amazon::S3 and Amazon::S3). Have you tried these modules? Is there any advantage to using the modules you've used instead of these modules?
Cheers,
Dave...
- S3 Modules
2008-04-10 10:13:23 a1lin1
Hi Dave,
I'm not familiar with the modules that you mentioned, so I can't really comment on any advantages or disadvantages. I am personally a fan of REST which is why I used this module from the AWS Developer Website. It can be found here:
http://developer.amazonwebservices.com/connect/entry.jspa?categoryID=47&externalID=133
Having them uploaded to CPAN would be a great idea, I'll drop a note on their discussion board.
- Abel



