Introduction to XML

XML is a plain-text format, so PHP must parse it before the data it holds can be used. PHP offers two very different ways to work with XML documents - event-based parsing, which makes use of the Expat library (as seen in Apache and Mozilla), and tree-based parsing (known as "DOM XML").

You may wonder why there are two options when it comes to XML parsing, so let me explain what the difference between the two is:

Event-based parsers work through XML documents and call a special function you specify every time a new XML item is found, for example "start of element found". PHP allows you to specify these callback functions for a variety of events, including start of element, end of element, character data, namespaces, and more.
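To make the callback idea concrete, here is a minimal sketch using PHP's XML Parser functions. The sample document and the $events array are made up for illustration; the handlers simply record each event as it arrives, and case folding is switched off so element names keep their original case.

```php
<?php
// Hypothetical sample document, not taken from the text above.
$xml = '<channel><item id="1">Hello</item><item id="2">World</item></channel>';

$events = [];

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

// Called once for every opening tag and once for every closing tag.
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "start:$name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "end:$name";
    }
);

// Called for text between tags; it may fire more than once per text run.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        if (trim($data) !== '') {
            $events[] = "text:$data";
        }
    }
);

// The third argument marks this as the final (and only) chunk of data.
if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}

print_r($events);
```

Note that the parser never holds the whole document at once: your code sees each event once, in document order, and has to remember anything it wants to use later.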

Tree-based parsers parse the entire document into a virtual tree, accessible through the Document Object Model (DOM). For example, if an XML document for a news site had many channels, and each channel had sections, and each section had news items, you could refer to document.channels[3].sections[1].newsitem[2].
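As a rough illustration of that kind of navigation, the sketch below assumes the DOMDocument class from PHP's DOM extension (available since PHP 5) rather than the older DOM XML functions. The news document is a made-up example, and real code would check that each item() call actually returned a node.

```php
<?php
// Hypothetical news document with one channel, one section, two news items.
$xml = '<document><channel><section>'
     . '<newsitem>First</newsitem><newsitem>Second</newsitem>'
     . '</section></channel></document>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Walk down the tree one level at a time; item() indexes start at 0.
$channels = $doc->getElementsByTagName('channel');
$sections = $channels->item(0)->getElementsByTagName('section');
$items    = $sections->item(0)->getElementsByTagName('newsitem');

$second = $items->item(1)->textContent;
echo $second, "\n"; // prints "Second"
```

Because the whole tree is in memory, you can jump to any node in any order, which is exactly what an event-based parser cannot do on its own.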

Generally speaking, tree-based parsers are slower than event-based parsers because they must read the whole document into memory before you can use it, but they are often more convenient because you can navigate easily through the tree they create. It is possible, however, to use an event-based parser to create a tree by writing the appropriate code for each callback function.
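Here is one way that could look, as a sketch rather than a complete implementation. The buildTree() function and the array layout ('name', 'text', 'children') are my own invention for this example: a stack tracks the element currently being read, and each finished element is attached to its parent when its end tag arrives. Attributes are ignored to keep it short.

```php
<?php
// Hypothetical helper: turns parser events into nested arrays.
function buildTree(string $xml): array
{
    // The stack starts with a placeholder root that collects the real root.
    $stack = [['name' => '#root', 'text' => '', 'children' => []]];

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start tag: push a fresh node for the new element.
        function ($p, $name, $attrs) use (&$stack) {
            $stack[] = ['name' => $name, 'text' => '', 'children' => []];
        },
        // End tag: pop the finished node and attach it to its parent.
        function ($p, $name) use (&$stack) {
            $node = array_pop($stack);
            $stack[count($stack) - 1]['children'][] = $node;
        }
    );

    // Text is appended to whichever element is currently open.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$stack) {
            $stack[count($stack) - 1]['text'] .= $data;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }

    // The document's root element is the placeholder's only child.
    return $stack[0]['children'][0];
}

$tree = buildTree(
    '<channel><section><newsitem>First</newsitem><newsitem>Second</newsitem></section></channel>'
);

echo $tree['children'][0]['children'][1]['text'], "\n"; // prints "Second"
```

This gives you tree-style access while still using the event-based parser, at the cost of writing and maintaining the tree-building code yourself.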

 


Copyright ©2015 Paul Hudson. Follow me: @twostraws.