XML::Parser Module Enables XML Development in Perl

XML (Extensible Markup Language) is emerging as a core standard for Web development. Now a new Perl module (or extension) known as XML::Parser lets Perl programmers use XML in their applications, and provides an efficient, easy way to parse XML documents.

XML::Parser is built upon a C library, expat, that is very fast and robust. Expat was authored by James Clark, a highly respected leader in the SGML/XML community.

Perl, expat and XML::Parser are all Unicode-aware; that is, they read encoding declarations and perform necessary conversions into Unicode, a system for “the interchange, processing, and display of the written texts of the diverse languages of the modern world” (http://www.unicode.org). Thus a single XML document processed with Perl can now contain Greek, Hebrew, Chinese and Russian in their proper scripts.

“XML::Parser makes it almost trivially easy for Perl programmers to process XML documents,” explained Larry Wall, who did the initial work on XML::Parser. Wall is the creator of Perl and Senior Programmer with O’Reilly & Associates. “More than that, it’s actually a repertoire of interfaces, each one optimized for a different style of processing.”

“The thing that excites me about the module is that it succeeds in combining the strengths of the best text processing language with the best hierarchical text representation,” Wall said.
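
As a rough illustration of the event-driven style of processing Wall describes, the sketch below registers Start, Char and End handlers with XML::Parser and collects the items of a small document. The sample invoice, its element and attribute names, and the variables @items and $text are invented for this example; only XML::Parser->new, the Handlers option and the parse method come from the module itself. Treat it as a minimal sketch under those assumptions, not as the approach used by any application mentioned here.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample document; element and attribute names are made up.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<invoice id="42">
  <item qty="2">Widget</item>
  <item qty="1">Gadget</item>
</invoice>
XML

my @items;      # one hash per <item> element
my $text = '';  # character data seen since the last start tag

# Handler style: the parser calls these subs as it scans the document.
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            $text = '';
            push @items, { qty => $attrs{qty} } if $element eq 'item';
        },
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;    # text may arrive in several chunks
        },
        End => sub {
            my ($expat, $element) = @_;
            $items[-1]{name} = $text if $element eq 'item';
        },
    },
);

$parser->parse($xml);

# Print one "qty x name" line per item.
printf "%s x %s\n", $_->{qty}, $_->{name} for @items;
```

Handlers are only one of the module's interfaces. For programs that would rather walk the whole document after parsing, the module also provides built-in styles such as Tree, selected with the Style option, which returns the parsed document as a nested Perl data structure.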

Tim Bray, one of the designers of XML and co-editor of the XML specification, demonstrated XML::Parser to an enthusiastic gathering during his tutorial at the recent XML conference in Chicago. One feature of the demonstration was how XML::Parser works equally well on both Windows NT and UNIX. The conference, attended by some 1500 people this year, is the central event in the XML calendar.

“XML::Parser insulates Perl programmers, both from the details of XML syntax, and the complexities of managing an XML parser,” explained Bray. “It allows Perl programmers to create robust, sophisticated XML processing modules while writing a bare minimum of code.”

Last spring O’Reilly & Associates hosted a face-to-face meeting between Wall and Bray to discuss how Perl could support XML. The meeting resulted in a plan to make Perl the scripting language of choice for processing XML. According to Clark Cooper, who has been heavily involved in the XML::Parser work, “XML::Parser is the first milestone in that goal.”

“A lot of what has to happen in business and between businesses – documents, reports, invoices – is structured information,” explained Dale Dougherty, CEO of O’Reilly affiliate Songline Studios, which hosts the xml.com Web site. “There’s great value in being able to automate and integrate multiple sources of information in an application which still maintains security at the source, and XML::Parser makes these processes easier to implement.”

As is usual in open source development efforts, a team of programmers cooperated to create the new module. Among the most involved were Wall; Bray; Dick Hardt, CTO and Founder of ActiveState Tool Corp.; and software engineer Clark Cooper.

For Windows, XML::Parser is available as part of the ActivePerl package at ActiveState. When the next version comes out, the XML::Parser package for Windows can be upgraded using the Perl Package Manager, which is also included in ActivePerl. For UNIX, XML::Parser is available from the Comprehensive Perl Archive Network (CPAN) at

Related information is available at perl.oreilly.com and xml.com.


