Hudzilla.org - the homepage of Paul Hudson

12.3.3     Searching and filtering with XPath

This is NOT the latest copy of this book; click here for the latest version.

array xpath ( string path)

The standard way to search XML documents for particular nodes is called XPath. Sterling Hughes, the creator of the SimpleXML extension, described it as being "as important to XML as regular expressions are to plain text".

Fortunately for us, XPath is a darn sight easier than regular expressions for basic usage. That said, it might take a little while to get your head around all the possibilities it opens up to you!

Using the same employees.xml file, give this script a try:

<?php
    $xml = simplexml_load_file('employees.xml');

    echo "<B>Using direct method...</B><BR />";
    $names = $xml->xpath('/employees/employee/name');
    foreach ($names as $name) {
        echo "Found $name<BR />";
    }
    echo "<BR />";

    echo "<B>Using indirect method...</B><BR />";
    $employees = $xml->xpath('/employees/employee');
    foreach ($employees as $employee) {
        echo "Found {$employee->name}<BR />";
    }
    echo "<BR />";

    echo "<B>Using wildcard method...</B><BR />";
    $names = $xml->xpath('//name');
    foreach ($names as $name) {
        echo "Found $name<BR />";
    }
?>

What that does is pull out the names of employees in three different ways. The real work is done in the call to the xpath() function. As you can see in the prototype, xpath() takes a query as its only parameter and returns an array of the elements that match it.

The query itself has specialised syntax, but it's very easy. The first example says "Start at the top-level employees element, find any employee elements in there, and retrieve all the names of them." It's very specific because only employees/employee/name is matched.

The second query matches all employee elements inside employees, but doesn't go specifically for the name of the employees. As a result, we get the full employee back, and need to print $employee->name to get the name.

The last one just looks for name elements, but note that it starts with "//" - this is the signal to do a global search for all name elements, regardless of where they are or how deeply nested they are in the document.
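
If you don't have employees.xml to hand, here is a minimal self-contained sketch of the same three queries. It loads a small inline document with simplexml_load_string(); the XML is an assumed stand-in for the real file (the name "Anthony Clarke" in particular is a placeholder), so treat it as an illustration rather than the book's actual data.

```php
<?php
// Assumed stand-in for employees.xml; the real file is not shown in this section.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// Direct: each step of the path names exactly one level of the tree.
$direct = array_map('strval', $xml->xpath('/employees/employee/name'));

// Indirect: stop one level higher, then read the child element in PHP.
$indirect = [];
foreach ($xml->xpath('/employees/employee') as $employee) {
    $indirect[] = (string) $employee->name;
}

// Global: "//" searches the whole document, whatever the nesting depth.
$global = array_map('strval', $xml->xpath('//name'));

print_r($direct);
```

Each element of the returned array is a SimpleXMLElement object, which is why the sketch casts to a string before storing the names.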

So, what we have here is the ability to grab specific parts of a document very easily, but that's really only the start of XPath's coolness. You see, you can also use it to filter your results according to any values you want. Try this script out:

<?php
    $xml = simplexml_load_file('employees.xml');

    echo "<B>Matching employees with name 'Laura Pollard'</B><BR />";
    $employees = $xml->xpath('/employees/employee[name="Laura Pollard"]');

    foreach ($employees as $employee) {
        echo "Found {$employee->name}<BR />";
    }

    echo "<BR />";

    echo "<B>Matching employees younger than 54</B><BR />";
    $employees = $xml->xpath('/employees/employee[age<54]');

    foreach ($employees as $employee) {
        echo "Found {$employee->name}<BR />";
    }

    echo "<BR />";

    echo "<B>Matching employees as old or older than 48</B><BR />";
    $employees = $xml->xpath('//employee[age>=48]');

    foreach ($employees as $employee) {
        echo "Found {$employee->name}<BR />";
    }

    echo "<BR />";
?>

Let's break that down to see how the querying actually works. The key part is, of course, between the square brackets, [ and ]. The first query grabs the employees element, then all employee elements inside it, but then filters them so that only those with a name matching Laura Pollard are kept. Once you get that, the other two are quite obvious: <, >, <=, and >= all work as you'd expect in PHP. The one difference to watch for is equality, which XPath tests with a single = rather than PHP's ==. Note that I slipped in a double slash in the last example to show you that the global search notation works here too.
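
To see the filters in action without the real data file, here is a small sketch under the same assumptions as before: the inline XML is a made-up stand-in for employees.xml, and matchingNames() is a hypothetical helper written for this example, not part of SimpleXML.

```php
<?php
// Assumed stand-in for employees.xml; "Anthony Clarke" is a placeholder name.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// Hypothetical helper: run a query that selects employee elements
// and return the names of the employees it matched.
function matchingNames(SimpleXMLElement $xml, string $query): array
{
    $names = [];
    foreach ($xml->xpath($query) as $employee) {
        $names[] = (string) $employee->name;
    }
    return $names;
}

$byName  = matchingNames($xml, '/employees/employee[name="Laura Pollard"]');
$younger = matchingNames($xml, '/employees/employee[age<54]');
$older   = matchingNames($xml, '//employee[age>=48]');

// XPath uses a single "=" for equality and "!=" for inequality.
$notLaura = matchingNames($xml, '//employee[name!="Laura Pollard"]');

print_r($younger);
```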

You can grab only part of a query result by continuing the path after the closing square bracket, like this:

$ages = $xml->xpath('//employee[age>=48]/age');

foreach ($ages as $age) {
    echo "Found $age\n";
}

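You can even run queries on queries by placing one filter straight after another, so that each filter only sees what the previous one let through, with an XPath search like this: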

$employees = $xml->xpath('//employee[age>=49][name="Laura Pollard"]');
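
As a rough sketch of what stacking filters does, the snippet below runs the stacked query next to a version that joins the same two tests with XPath's "and" keyword. For plain value tests like these the two should select the same elements. The inline XML is again an assumed stand-in for employees.xml.

```php
<?php
// Assumed stand-in for employees.xml; "Anthony Clarke" is a placeholder name.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// Two filters in a row: the second only sees employees that passed the first.
$chained = array_map(
    'strval',
    $xml->xpath('//employee[age>=49][name="Laura Pollard"]/name')
);

// The same two tests joined with "and" inside a single filter.
$combined = array_map(
    'strval',
    $xml->xpath('//employee[age>=49 and name="Laura Pollard"]/name')
);

print_r($chained);
```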

Going back to selecting various types of elements, you can use the | symbol (XPath's union operator, which works like OR here) to select more than one type of element in a single query, like this:

echo "Retrieving all titles and ages\n";
$results = $xml->xpath('//employee/title|//employee/age');

foreach ($results as $result) {
    echo "Found $result\n";
}

That will output the following:

Found Chief Information Officer
Found 48
Found Chief Executive Officer
Found 54
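
When one query returns a mix of element types, it can help to know which is which. The sketch below uses SimpleXMLElement's getName() method to label each result; the inline XML is an assumed stand-in for employees.xml, and the labelled format is just for illustration.

```php
<?php
// Assumed stand-in for employees.xml; "Anthony Clarke" is a placeholder name.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// One query, two kinds of element in the result.
$results = $xml->xpath('//employee/title|//employee/age');

// getName() returns the element name ("title" or "age") of each result.
$found = [];
foreach ($results as $result) {
    $found[] = $result->getName() . ': ' . $result;
    echo "Found {$result->getName()} $result\n";
}
```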

You can, of course, combine all of this together to search on more than one value, like this:

$names = $xml->xpath('//employee[age<40]/name|//employee[age>50]/name');

foreach ($names as $name) {
    echo "Found $name\n";
}
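
When both halves of a union differ only in their filter, an alternative worth knowing is XPath's "or" keyword inside a single filter. The sketch below runs both forms against an assumed stand-in for employees.xml and should produce the same names either way.

```php
<?php
// Assumed stand-in for employees.xml; "Anthony Clarke" is a placeholder name.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// Union of two complete paths, as in the text above.
$union = array_map(
    'strval',
    $xml->xpath('//employee[age<40]/name|//employee[age>50]/name')
);

// One path whose filter accepts either condition.
$either = array_map(
    'strval',
    $xml->xpath('//employee[age<40 or age>50]/name')
);

print_r($either);
```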

For maximum insanity, you can actually run calculations using XPath in order to get tighter control over your queries. For example, if you only wanted the names of employees who have an odd age (that is, an age that cannot be divided by two without leaving a remainder) you would use an XPath query like this:

$names = $xml->xpath('//employee[age mod 2 = 1]/name');

Along with "mod" (equivalent to % in PHP) there's also "div" for division, + and - (same as PHP, except that - must always have whitespace either side of it as it may be confused with an element name), and ceiling() and floor() (equivalent to ceil() and floor() in PHP). These are quite advanced and don't really get that much use in practice.
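
Here is a short sketch of those operators at work. The inline XML is an assumed stand-in for employees.xml, and names() is a hypothetical helper for this example only. Both assumed ages are even, so the odd-age query comes back empty with this data.

```php
<?php
// Assumed stand-in for employees.xml; "Anthony Clarke" is a placeholder name.
$source = <<<XML
<employees>
    <employee>
        <name>Anthony Clarke</name>
        <title>Chief Information Officer</title>
        <age>48</age>
    </employee>
    <employee>
        <name>Laura Pollard</name>
        <title>Chief Executive Officer</title>
        <age>54</age>
    </employee>
</employees>
XML;

$xml = simplexml_load_string($source);

// Hypothetical helper: run a query and return the matches as plain strings.
function names(SimpleXMLElement $xml, string $query): array
{
    // Fall back to an empty list if nothing usable comes back.
    $nodes = $xml->xpath($query) ?: [];
    return array_map('strval', $nodes);
}

// mod: remainder after division, like % in PHP.
$odd  = names($xml, '//employee[age mod 2 = 1]/name');
$even = names($xml, '//employee[age mod 2 = 0]/name');

// div and floor(): employees in their fifties (54 div 10 is 5.4, floored to 5).
$fifties = names($xml, '//employee[floor(age div 10) = 5]/name');

// Subtraction: note the whitespace on both sides of the minus sign.
$minusSix = names($xml, '//employee[age - 6 = 48]/name');

print_r($fifties);
```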

However, there is a lot, lot more you can do with XPath, and for that I suggest you check out the Further Reading section!









Comments from other readers
satheeshskumar - 07 Sep 2008

I have an application for job evaluation. Each job evaluation submission will create a new record in an XML file (example record pasted below).

<?xml version="1.0"?>
<evaluation>
<Position>
<position>Admin Manager</position>
</Position>
<evaluation>
<education>E</education>
<knowledge>J</knowledge>
<problem_solving>E</problem_solving>
<guidance_received>H</guidance_received>
<skills_level>C</skills_level>
<scope_of_contact>A</scope_of_contact>
<degree_of_contact>A</degree_of_contact>
<job_control>L</job_control>
<span_of_control>G</span_of_control>
<discretion_level>M</discretion_level>
<result_of_decision>C</result_of_decision>
<employers_coverage>A</employers_coverage>
</evaluation>
</evaluation>

My issue is how to update or delete a record when an existing position is used for evaluation.

I will appreciate if somebody can help me.

Newbie... - 07 Sep 2008

I have some simple xml:

<users>
<title>Floor1</title>
<user>
<fullname>John Doe</fullname>
<ID>Binary Avenue 1234 FL</ID>
<email>john@john-domain.com</email>
<lat>2</lat>
<lon>4</lon>
</user>
</users>

I am trying to retrieve the value of "lat" where "fullname=John Doe" using this PHP based xpath expression:

$lat = $users->xpath('/users/user/[fullname="Janet Smith"]');

Apparently my expression is invalid.

Any ideas?

QF

Tim - 07 Sep 2008

Following the method described in the article, is it possible to access the value of an attribute?

Eg:
<name age="49">Laura Pollard</name>

I need the value of age.

Thanks

Rodrigo - 07 Sep 2008

$dom = new DOMDocument();
$dom->load("example.xml");
$xpath = new DOMXPath($dom);

// this works as expected
$result = $xpath->query("//dc:subject");
foreach ($result as $title) {
    print $title->nodeName . "\n";
}

// PS: please delete the duplicate

Rodrigo - 07 Sep 2008

http://php.belnet.be/manual/en/function.dom-domxpath-construct.php


it's the Domxpath object

A PHP User - 07 Sep 2008

I've searched in vain, but I can't find any tutorials on taking XPath to the next step. Specifically, I want to perform date calculations with the dateTime functions here:
http://www.w3schools.com/xpath/xpath_functions.asp

How does one do this?


