How To Parse XML File Using XPath In Java

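I have recently been studying XPath as a way to parse XML. It is said to be a very simple tool for traversing XML files, with a relationship to XML much like that of SQL to Oracle. After a lot of searching, though, I could not find any XPath code for Java; most of what I found was simply copied from the W3School documentation. I also tried to implement the traversal in Java myself, but some of the explanations did not make sense to me, until I came across this blog post by a foreign author, which made everything clear at once. Many thanks to him.

Below is his original article. I tested several of the examples and they all worked. Since it is written in English, I have left it untranslated.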

XPath is a language for finding information in an XML file. You can say that XPath is (sort of) SQL for XML files. XPath is used to navigate through elements and attributes in an XML document. You can also use XPath to traverse an XML file in Java.

XPath comes with powerful expressions that can be used to query an XML document and retrieve relevant information.

For this demo, let us consider an XML file that holds information about employees.

<?xml version="1.0"?>
<Employees>
    <Employee emplid="1111" type="admin">
        <firstname>John</firstname>
        <lastname>Watson</lastname>
        <age>30</age>
        <email>johnwatson@sh.com</email>
    </Employee>
    <Employee emplid="2222" type="admin">
        <firstname>Sherlock</firstname>
        <lastname>Homes</lastname>
        <age>32</age>
        <email>sherlock@sh.com</email>
    </Employee>
    <Employee emplid="3333" type="user">
        <firstname>Jim</firstname>
        <lastname>Moriarty</lastname>
        <age>52</age>
        <email>jim@sh.com</email>
    </Employee>
    <Employee emplid="4444" type="user">
        <firstname>Mycroft</firstname>
        <lastname>Holmes</lastname>
        <age>41</age>
        <email>mycroft@sh.com</email>
    </Employee>
</Employees>

I have saved this file at the path C:\employees.xml. We will use this XML file in our demo and will try to fetch useful information from it using XPath. Before we start, let's check a few facts about the XML file above.

There are 4 employees in our XML file
Each employee has a unique employee id defined by the attribute emplid
Each employee also has an attribute type which defines whether the employee is an admin or a user
Each employee has four child nodes: firstname, lastname, age and email
Age is a number

Let’s get started…

1. Learning Java DOM Parsing API

In order to understand XPath, we first need to understand the basics of DOM parsing in Java. Java provides a powerful DOM parser implementation through the API below.

1.1 Creating a Java DOM XML Parser

First, we need to create a document builder using the DocumentBuilderFactory class. Just follow the code; it is pretty much self-explanatory.

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
//...
 
DocumentBuilderFactory builderFactory =
        DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
    builder = builderFactory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
    e.printStackTrace();  
}

1.2 Parsing XML with a Java DOM Parser

Once we have a document builder object, we use it to parse the XML file and create a document object.

import org.w3c.dom.Document;
import java.io.FileInputStream;
import java.io.IOException;
import org.xml.sax.SAXException;
//...

// Declared outside the try block so the document can be used afterwards
Document document = null;
try {
    document = builder.parse(
            new FileInputStream("c:\\employees.xml"));
} catch (SAXException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
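The two steps above (create a builder, then parse a file) can be put together as a small runnable sketch. The class name `ParseFromFile` and the helper `parseFile` are made up for illustration; the default path is the one used in this article and is only an assumption about where the file lives on your machine.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ParseFromFile {

    // Hypothetical helper: builds a DOM Document from the file at the given path.
    static Document parseFile(Path path) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // try-with-resources closes the stream even if parsing fails
        try (InputStream in = Files.newInputStream(path)) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed location; pass another path as the first argument if needed
        Path path = Paths.get(args.length > 0 ? args[0] : "c:\\employees.xml");
        if (!Files.exists(path)) {
            System.out.println("File not found: " + path);
            return;
        }
        Document document = parseFile(path);
        // Prints the name of the root element of the parsed file
        System.out.println(document.getDocumentElement().getNodeName());
    }
}
```

Unlike the snippet above, this version closes the input stream explicitly, which matters once the parsing code runs more than once in a long-lived program.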

In the above code, we are parsing an XML file from the filesystem. Sometimes you might want to parse XML held in a String value instead of reading it from a file. The code below comes in handy for that.

// Needs java.io.ByteArrayInputStream and java.nio.charset.StandardCharsets
String xml = ...;
Document xmlDocument = builder.parse(
        new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
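As a self-contained illustration of parsing from a String, here is a minimal sketch. The class name `ParseFromString` is made up, and the XML constant is a shortened, hypothetical copy of the employees file with only two employees.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ParseFromString {

    // Shortened, hypothetical copy of the employees file (two employees only)
    static final String XML =
          "<Employees>"
        + "<Employee emplid=\"1111\" type=\"admin\"><firstname>John</firstname></Employee>"
        + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname></Employee>"
        + "</Employees>";

    // Parses XML text into a DOM Document
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // An explicit charset avoids depending on the platform default
        return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // Root element name, then the number of Employee elements
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println(doc.getElementsByTagName("Employee").getLength());
    }
}
```

The sketch encodes the text as UTF-8, which matches the parser's default when the XML has no encoding declaration. If your string declares a different encoding in its prolog, encode the bytes accordingly.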

Once we have a document object, we are ready to use XPath. Just create an XPath object using XPathFactory.
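Creating the XPath object and running a first query could look like the sketch below. It uses the standard `javax.xml.xpath` API; the class name `XPathQuickStart`, the helper `evaluate` and the shortened XML constant are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathQuickStart {

    // Shortened, hypothetical copy of the employees file (two employees only)
    static final String XML =
          "<Employees>"
        + "<Employee emplid=\"1111\" type=\"admin\"><firstname>John</firstname></Employee>"
        + "<Employee emplid=\"2222\" type=\"admin\"><firstname>Sherlock</firstname></Employee>"
        + "</Employees>";

    // Evaluates an XPath expression and returns the text of the first match,
    // or an empty string when nothing matches
    static String evaluate(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        // The XPath object is created from an XPathFactory
        XPath xPath = XPathFactory.newInstance().newXPath();
        return xPath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // First name of the employee whose emplid attribute is 2222
        System.out.println(evaluate("/Employees/Employee[@emplid='2222']/firstname"));
    }
}
```

The two-argument `evaluate` call returns a String. To get a list of nodes, a number or a boolean instead, use the three-argument overload with one of the `XPathConstants` return types.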

Commonly used XPath expressions are listed below. Element names in XPath are case-sensitive, so the examples use the names exactly as they appear in the XML file above.

`nodename`	Selects all nodes with the name "nodename"
`/`	Selects from the root node
`//`	Selects nodes in the document from the current node that match the selection no matter where they are
`.`	Selects the current node
`..`	Selects the parent of the current node
`@`	Selects attributes
`Employee`	Selects all nodes with the name "Employee"
`Employees/Employee`	Selects all Employee elements that are children of Employees
`//Employee`	Selects all Employee elements no matter where they are in the document

Predicates, written in square brackets, narrow the selection:

`/Employees/Employee[1]`	Selects the first Employee element that is a child of the Employees element
`/Employees/Employee[last()]`	Selects the last Employee element that is a child of the Employees element
`/Employees/Employee[last()-1]`	Selects the last but one Employee element that is a child of the Employees element
`//Employee[@type='admin']`	Selects all the Employee elements that have an attribute named type with a value of 'admin'
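To see these expressions at work, the following sketch runs several of them against a compact copy of the employees data and collects the first names they select. The class `XPathExpressionsDemo` and its helpers `employee` and `select` are hypothetical names for this example. The last two expressions, which filter on age and on position, do not appear in the tables above and are included only as further illustrations of predicates.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XPathExpressionsDemo {

    // Compact copy of the employees file; lastname and email are omitted
    static final String XML =
          "<Employees>"
        + employee("1111", "admin", "John", 30)
        + employee("2222", "admin", "Sherlock", 32)
        + employee("3333", "user", "Jim", 52)
        + employee("4444", "user", "Mycroft", 41)
        + "</Employees>";

    // Hypothetical helper that builds one Employee element
    static String employee(String id, String type, String first, int age) {
        return "<Employee emplid=\"" + id + "\" type=\"" + type + "\">"
             + "<firstname>" + first + "</firstname>"
             + "<age>" + age + "</age></Employee>";
    }

    // Returns the text content of every node the expression selects
    static List<String> select(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String[] expressions = {
            "/Employees/Employee/firstname",
            "/Employees/Employee[1]/firstname",
            "/Employees/Employee[last()]/firstname",
            "/Employees/Employee[last()-1]/firstname",
            "//Employee[@type='admin']/firstname",
            "/Employees/Employee[age>40]/firstname",
            "/Employees/Employee[position()<=2]/firstname"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + select(expression));
        }
    }
}
```

Because names are case-sensitive, `select("//employee")` returns an empty list for this data, while `select("//Employee")` matches all four elements.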

