Parsing iCal Data
by Robert Pratte
There are several possible approaches to parsing the above data in Perl, but perhaps the easiest one is to create a hash of events, modeled after the iCalendar structure. With this approach, a single calendar becomes a hash of hashes with a key:value pair for each event, where the key is the event ID and the value is a hash containing the event data. While it would be just as easy to store the data as an array of hashes, the ability to pull an event by its ID allows greater flexibility and power to manipulate the data. The data for a single event might look like this:
Calendar->EventUID = { 'UID' => EventUID,
'LOCATION' => EventLocation,
'START' => EventStart,
'END' => EventEnd,
'DURATION' => EventDuration,
'DTSTAMP' => EventDatestamp,
'SEQUENCE' => EventSequence,
'SUMMARY' => EventSummary,
'DESCRIPTION' => EventDescription,
'URL' => EventURL };
Note that these keys represent only a subset of the properties defined
in RFC 2445, and not every event contains all of them. For example, the
first event in my example does not contain DURATION. Further,
certain keys (such as SEQUENCE) may be irrelevant for your
purposes.
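As a minimal sketch of this hash-of-hashes layout, the snippet below builds a calendar with two events and pulls one back by its ID. The UIDs and field values are invented for illustration; only the key names come from the structure above.

```perl
use strict;
use warnings;

# Hypothetical sample data; the UIDs and values are invented.
my $calHash = {
    'EVT-001' => {
        'UID'      => 'EVT-001',
        'LOCATION' => 'Room 12',
        'START'    => '20050301T090000',
        'END'      => '20050301T100000',
        'SUMMARY'  => 'Staff meeting',
    },
    'EVT-002' => {
        'UID'      => 'EVT-002',
        'START'    => '20050302T130000',
        'DURATION' => 'PT1H',
        'SUMMARY'  => 'Lunch',
    },
};

# Pull a single event directly by its ID.
my $event = $calHash->{'EVT-001'};
print "Summary: $event->{'SUMMARY'}\n";

# Not every event has every key, so test with exists() before using one.
for my $uid ( sort keys %$calHash ) {
    my $e        = $calHash->{$uid};
    my $duration = exists $e->{'DURATION'} ? $e->{'DURATION'} : '(none)';
    print "$uid duration: $duration\n";
}
```

With an array of hashes, finding EVT-001 would mean scanning every element; here it is a single lookup.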
With the data structure designed, what's the right way to convert iCalendar
data into such a structure? True to the Perl mantra, there is more than one
way to do it, but perhaps the easiest approach is to match key names,
starting a new event block when the parser sees BEGIN:VEVENT and
ending it when END:VEVENT appears. Given the large number of
possible keys, it may be easiest to use switch-like behavior. Here is an
example of how to do this, splitting each key:value line on the colon
character (a semicolon, when present, precedes any modifiers to the data):
SWITCH: {
if ( $_ =~ /BEGIN:VEVENT/ ) {
##-----------------------------------------
## We have a new event, so start fresh.
##-----------------------------------------
$eventHash = {};
last SWITCH; }
if ( $_ =~ /END:VEVENT/ ) {
##-----------------------------------------
## We hit the event end, so store it.
##-----------------------------------------
$calHash->{$eventHash->{'UID'}} =
{
'UID' => $eventHash->{'UID'},
'LOCATION' => $eventHash->{'LOCATION'},
#...The rest of our keys...
'URL' => $eventHash->{'URL'}
};
last SWITCH; }
## We will split the key:value pair into a list
## and grab the value (the second element, index 1).
## The limit of 2 keeps any colons inside the value.
if ( $_ =~ /^UID/ ) {
$eventHash->{'UID'} = ( split ( /:/, $_, 2 ) )[1];
last SWITCH; }
if ( $_ =~ /^LOCATION/ ) {
$eventHash->{'LOCATION'} = ( split ( /:/, $_, 2 ) )[1];
last SWITCH; }
#...The rest of our key matches...
if ( $_ =~ /^URL/ ) {
## A limit of 2 keeps the colon that follows the URL scheme.
$eventHash->{'URL'} = ( split ( /:/, $_, 2 ) )[1];
last SWITCH; }
} # end switch
While this example does a good job of showing how to fill the data structure, it does a poor job of leveraging the power of Perl. More extensive use of regular expressions, the use of one of the Parse modules in CPAN, or even a bit of recursive programming could make this code more elegant and perhaps even a bit faster. However, these tactics may also make the code a bit harder to read, which is not always bad unless you are attempting to explain concepts in an article. For further ideas, Teodor Zlatanov has written an article on using Perl parsing modules as well as a real mind-bender on using a functional programming approach in Perl.
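As one possible regex-driven variant of the switch above, the sketch below captures the key name and the value with a single pattern instead of one test per key. The helper name parse_events and the %wanted list are hypothetical, not part of any module. It also makes two simplifying assumptions: long lines have not been folded onto continuation lines, and parameter values do not themselves contain colons.

```perl
use strict;
use warnings;

# Property names to keep; extend this list as needed.
my %wanted = map { $_ => 1 }
    qw( UID LOCATION DURATION DTSTAMP SEQUENCE SUMMARY DESCRIPTION URL );

# parse_events() is a hypothetical helper: it takes a list of iCalendar
# lines and returns a hash reference of events keyed by UID.
sub parse_events {
    my (@lines) = @_;
    my ( %calendar, $event );

    for my $line (@lines) {
        $line =~ s/\r?\n\z//;    # strip LF or CRLF line endings

        if ( $line eq 'BEGIN:VEVENT' ) {
            $event = {};         # start a fresh event
        }
        elsif ( $line eq 'END:VEVENT' ) {
            # Store the finished event under its UID, if it has one.
            $calendar{ $event->{'UID'} } = $event
                if $event && defined $event->{'UID'};
            undef $event;
        }
        elsif ( $event && $line =~ /^([A-Za-z0-9-]+)(?:;[^:]*)?:(.*)$/ ) {
            # $1 is the property name; $2 is everything after the first
            # colon, so values such as URLs keep their own colons.
            my ( $key, $value ) = ( uc($1), $2 );
            $event->{$key} = $value if $wanted{$key};
        }
    }
    return \%calendar;
}

my $cal = parse_events(
    "BEGIN:VEVENT\r\n",
    "UID:EVT-001\r\n",
    "LOCATION;LANGUAGE=en:Room 12\r\n",
    "URL;VALUE=URI:http://example.com/meeting\r\n",
    "END:VEVENT\r\n",
);
print $cal->{'EVT-001'}{'URL'}, "\n";
```

The trade-off is the one described above: the pattern is shorter than a chain of if blocks, but a reader has to decode it before seeing what each key does.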
The Dot Specification
Dot (PDF) is a language for describing diagrams as directed graphs, created by Emden Gansner, Eleftherios Koutsofios, and Stephen North at Bell Labs. There are several implementations of Dot, including GraphViz, WebDot, and Grappa. Interestingly, OmniGraffle, a powerful diagramming tool for Macintosh computers, can read simple Dot files.
Creating Dot Files
The basic syntax of Dot is simple: you describe objects by listing
them inside the braces of a digraph {} block, and you denote
relationships between objects with the -> combination of
characters. With this code:
digraph my_first_graph {
object1 -> object2;
}
your Dot-driven application (such as GraphViz) will display an image something like Figure 3.

Figure 3. A simple graph
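Because a Dot file is plain text, a Perl script can produce one with ordinary string handling. The following is a minimal sketch; the helper name dot_graph and the sample edge are invented for illustration, and node names are assumed to be simple identifiers (letters, digits, and underscores) that need no quoting.

```perl
use strict;
use warnings;

# dot_graph() is a hypothetical helper: it takes a graph name and a list
# of [from, to] pairs and returns the text of a Dot digraph.
sub dot_graph {
    my ( $name, @edges ) = @_;
    my $dot = "digraph $name {\n";
    for my $edge (@edges) {
        my ( $from, $to ) = @$edge;
        $dot .= "    $from -> $to;\n";    # one relationship per line
    }
    $dot .= "}\n";
    return $dot;
}

# Reproduces the graph shown above.
print dot_graph( 'my_first_graph', [ 'object1', 'object2' ] );
```

Redirecting the output to a file gives you something a Dot-driven application can open.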

