Parsing iCal Data
by Robert Pratte | Pages: 1, 2, 3, 4

There are several possible approaches to parsing the above data in Perl, but perhaps the easiest one is to create a hash of events, modeled after the iCalendar structure. With this approach, a single calendar becomes a hash of hashes with a key:value pair for each event, where the key is the event ID and the value is a hash containing the event data. While it would be just as easy to store the data as an array of hashes, the ability to pull an event by its ID allows greater flexibility and power to manipulate the data. The data for a single event might look like this:

Calendar->EventUID = { 'UID'         => EventUID,
                       'LOCATION'    => EventLocation,
                       'START'       => EventStart,
                       'END'         => EventEnd,
                       'DURATION'    => EventDuration,
                       'DTSTAMP'     => EventDatestamp,
                       'SEQUENCE'    => EventSequence,
                       'SUMMARY'     => EventSummary,
                       'DESCRIPTION' => EventDescription,
                       'URL'         => EventURL };

Note that these keys represent only a subset of the properties defined in RFC 2445, the iCalendar specification. Not every event contains all of the above keys. For example, the first event in my example does not contain DURATION. Further, certain keys (such as SEQUENCE) may be irrelevant for your purposes.
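As a rough illustration of this layout, here is a minimal sketch of a calendar stored as a hash of hashes. The UIDs and field values are invented for the example, and the second event deliberately omits LOCATION to show how a missing key behaves.

```perl
use strict;
use warnings;

# Hypothetical sample data; the UIDs and values are made up.
my $calHash = {
    'event-001@example.com' => {
        'UID'      => 'event-001@example.com',
        'LOCATION' => 'Conference Room A',
        'START'    => '20050301T090000',
        'END'      => '20050301T100000',
        'SUMMARY'  => 'Planning meeting',
    },
    'event-002@example.com' => {
        'UID'      => 'event-002@example.com',
        'START'    => '20050302T130000',
        'DURATION' => 'PT1H',
        'SUMMARY'  => 'Code review',
    },
};

# Direct lookup by event ID, the advantage over an array of hashes.
my $event = $calHash->{'event-002@example.com'};
print $event->{'SUMMARY'}, ' starts at ', $event->{'START'}, "\n";

# Keys an event lacks are simply absent, so test before using them.
for my $uid ( sort keys %$calHash ) {
    my $location = exists $calHash->{$uid}{'LOCATION'}
                 ? $calHash->{$uid}{'LOCATION'}
                 : '(no location)';
    print $uid, ': ', $location, "\n";
}
```

Keying on the UID means that updating or deleting one event is a single hash operation rather than a search through a list.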

With the data structure designed, what's the right way to convert iCalendar data into such a structure? In keeping with the Perl mantra that there is more than one way to do it, perhaps the easiest approach is to match key names, starting a new event block when the parser sees BEGIN:VEVENT and ending it when END:VEVENT appears. Given the large number of possible keys, it may be easiest to use switch-like behavior. Here is an example of how to do this, splitting each key:value pair on the colon character (a semicolon, by contrast, precedes any modifiers to the data):

SWITCH: {
        if ( $_ =~ /BEGIN:VEVENT/ ) {
                ##-----------------------------------------
                ## We have a new event, so start fresh.
                ##-----------------------------------------
                $eventHash = {};
                last SWITCH; }


        if ( $_ =~ /END:VEVENT/ ) {
                ##-----------------------------------------
                ## We hit the event end, so store it.
                ##-----------------------------------------
                $calHash->{$eventHash->{'UID'}} =
                {
                     'UID'         => $eventHash->{'UID'},
                     'LOCATION'    => $eventHash->{'LOCATION'},
                     #...The rest of our keys...
                     'URL'         => $eventHash->{'URL'}
                };
                last SWITCH; }


        ## We split each key:value pair on the first colon only
        ## and grab the value (the second element, index 1).
        if ( $_ =~ /^UID/ ) {
                $eventHash->{'UID'} = ( split ( /:/, $_, 2 ) )[1];
                last SWITCH; }


        if ( $_ =~ /^LOCATION/ ) {
                $eventHash->{'LOCATION'} = ( split ( /:/, $_, 2 ) )[1];
                last SWITCH; }

        ## ...The rest of our key matches...

        if ( $_ =~ /^URL/ ) {
                ## The limit of 2 keeps the colon in "http://" intact.
                $eventHash->{'URL'} = ( split ( /:/, $_, 2 ) )[1];
                last SWITCH; }

} # end switch

While this example does a good job of showing how to fill the data structure, it does a poor job of leveraging the power of Perl. More extensive use of regular expressions, the use of one of the Parse modules on CPAN, or even a bit of recursive programming could make this code more elegant and perhaps even a bit faster. However, these tactics may also make the code a bit harder to read--which is not always bad, unless you are attempting to explain concepts in an article. For further ideas, Teodor Zlatanov has written an article on using Perl parsing modules as well as a real mind-bender on using a functional programming approach in Perl.
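To give a flavor of the regular-expression route, here is one possible sketch, not the approach used in the rest of this article. The function name parse_events and the %key_for table are hypothetical, and the mapping of the iCalendar properties DTSTART and DTEND onto the START and END keys is an assumption. The sketch ignores folded (continued) lines and quoted modifier values that contain colons.

```perl
use strict;
use warnings;

# Map iCalendar property names to the keys of our event hash.
# The DTSTART/DTEND mapping is an assumption for this sketch.
my %key_for = (
    ( map { $_ => $_ }
        qw(UID LOCATION DURATION DTSTAMP SEQUENCE SUMMARY DESCRIPTION URL) ),
    DTSTART => 'START',
    DTEND   => 'END',
);

# Hypothetical helper: turn iCalendar text into a hash of event hashes.
sub parse_events {
    my ($text) = @_;
    my ( %calendar, $event );

    for my $line ( split /\r?\n/, $text ) {
        if ( $line =~ /^BEGIN:VEVENT/ ) {
            $event = {};    # start a fresh event
        }
        elsif ( $line =~ /^END:VEVENT/ ) {
            # store the finished event under its UID, if it has one
            $calendar{ $event->{'UID'} } = $event
                if $event && defined $event->{'UID'};
            undef $event;
        }
        elsif ( $event && $line =~ /^([A-Z-]+)(?:;[^:]*)?:(.*)$/ ) {
            # $1 is the property name, the optional ;... part holds
            # modifiers, and $2 is everything after the first colon
            $event->{ $key_for{$1} } = $2 if exists $key_for{$1};
        }
    }
    return \%calendar;
}

# Made-up sample input.
my $ics = join "\n",
    'BEGIN:VCALENDAR',
    'BEGIN:VEVENT',
    'UID:event-001@example.com',
    'DTSTART;TZID=US/Central:20050301T090000',
    'URL:http://www.example.com/meeting',
    'END:VEVENT',
    'END:VCALENDAR';

my $cal = parse_events($ics);
print $cal->{'event-001@example.com'}{'URL'}, "\n";
```

A single pattern replaces the chain of per-key tests, and supporting another property means adding one entry to the lookup table.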

The Dot Specification

Dot is a diagramming, or directed graph, language created by Emden Gansner, Eleftherios Koutsofios, and Stephen North at Bell Labs. There are several implementations of Dot, including GraphViz, WebDot, and Grappa. Interestingly, OmniGraffle, a powerful diagramming tool for Macintosh computers, can read simple Dot files.

Creating Dot Files

The basic syntax of Dot is simple: you describe objects, or nodes, by listing them inside the braces of a digraph {} block, and you denote a relationship between two objects with the -> combination of characters. With this code:

digraph my_first_graph {
  object1 -> object2;
}

your Dot-driven application (such as GraphViz) will display an image something like Figure 3.

Figure 3. A simple graph
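Because Dot is plain text, it is easy to generate from Perl. The following minimal sketch builds the same graph from a list of node pairs. The helper name dot_for_edges is hypothetical, and node names are assumed to be plain identifiers that need no quoting.

```perl
use strict;
use warnings;

# Hypothetical helper: build Dot source from a graph name and a list
# of [from, to] pairs. Node names are assumed to be plain identifiers.
sub dot_for_edges {
    my ( $name, @edges ) = @_;
    my $dot = sprintf "digraph %s {\n", $name;
    $dot .= sprintf "  %s -> %s;\n", @$_ for @edges;
    $dot .= "}\n";
    return $dot;
}

print dot_for_edges( 'my_first_graph', [ 'object1', 'object2' ] );
```

Passing more pairs adds more edges to the same graph.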
