
How to Efficiently Parse and Manipulate XML Elements in Java

In Java, working with XML elements is a fundamental task. Two long-standing JDK APIs cover most needs:

  1. DOM (Document Object Model): Parses the entire XML document into a tree-like structure in memory. This is great for small to medium-sized files where you need to navigate, modify, or query the document in any direction.
  2. SAX (Simple API for XML): An event-based parser that reads the XML document sequentially from top to bottom. It's very memory-efficient for large files because it doesn't load the whole document into memory, but it's read-only and more complex to use.

Let's dive into each approach with clear examples.


DOM Parsing (Using javax.xml)

The DOM approach represents the XML as a tree of objects. The main interface for an element is org.w3c.dom.Element.

Key Classes and Interfaces:

  • Document: Represents the entire XML document. It's the root of the DOM tree.
  • Element: Represents an XML element (e.g., <book>, <author>). This is what you'll work with most.
  • Node: The base interface for all DOM objects (Element, Text, Attr, etc.).
  • DocumentBuilder: The parser object, obtained from a DocumentBuilderFactory, that reads an XML source and creates a Document object.
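
To make the relationship between these interfaces concrete, here is a minimal sketch that parses a small XML string and lists the direct children of the root element. The class name DomNodeWalk and the method describeChildren are illustrative names, not part of any API. It shows that whitespace between tags arrives as Text nodes, which is why DOM code usually checks getNodeType() before casting to Element.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class DomNodeWalk {

    // Hypothetical helper: returns one label per direct child of the root element.
    public static List<String> describeChildren(String xml) {
        Document document;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            document = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        Element root = document.getDocumentElement();
        List<String> labels = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                labels.add("ELEMENT " + child.getNodeName());
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                // Indentation and line breaks between tags are Text nodes too
                labels.add("TEXT");
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String xml = "<catalog>\n  <book id=\"bk101\"/>\n  <book id=\"bk102\"/>\n</catalog>";
        describeChildren(xml).forEach(System.out::println);
    }
}
```

For the two-book input in main, the list alternates between TEXT and ELEMENT book entries, because each line break and indent between the tags is a node in its own right.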

Example XML File (books.xml)

Let's use this sample XML for our examples.

<?xml version="1.0" encoding="UTF-8"?>
<catalog>
    <book id="bk101">
        <author>Gambardella, Matthew</author>
        <title>XML Developer's Guide</title>
        <genre>Computer</genre>
        <price>44.95</price>
        <publish_date>2000-10-01</publish_date>
    </book>
    <book id="bk102">
        <author>Ralls, Kim</author>
        <title>Midnight Rain</title>
        <genre>Fantasy</genre>
        <price>5.95</price>
        <publish_date>2000-12-16</publish_date>
    </book>
</catalog>

Example 1: Reading XML Elements with DOM

This code will read books.xml, find all <book> elements, and print their details.

import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import java.io.File;
import java.io.IOException;
public class DomReaderExample {
    public static void main(String[] args) {
        try {
            // 1. Create a DocumentBuilderFactory
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // 2. Create a DocumentBuilder
            DocumentBuilder builder = factory.newDocumentBuilder();
            // 3. Parse the XML file to create a Document object
            Document document = builder.parse(new File("books.xml"));
            // 4. Normalize the XML structure (optional but recommended)
            document.getDocumentElement().normalize();
            // 5. Get all elements with the tag name "book"
            NodeList bookList = document.getElementsByTagName("book");
            System.out.println("Total books found: " + bookList.getLength());
            // 6. Loop through each book element
            for (int i = 0; i < bookList.getLength(); i++) {
                Node node = bookList.item(i);
                // Ensure it's an Element node
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    Element bookElement = (Element) node;
                    // Get the 'id' attribute
                    String id = bookElement.getAttribute("id");
                    System.out.println("\nBook ID: " + id);
                    // Get child elements by tag name
                    String author = getTagValue("author", bookElement);
                    String title = getTagValue("title", bookElement);
                    String genre = getTagValue("genre", bookElement);
                    String price = getTagValue("price", bookElement);
                    System.out.println("Title: " + title);
                    System.out.println("Author: " + author);
                    System.out.println("Genre: " + genre);
                    System.out.println("Price: " + price);
                }
            }
        } catch (ParserConfigurationException | IOException | SAXException e) {
            e.printStackTrace();
        }
    }
    // Helper method: returns the text of the first descendant element with the given tag,
    // or an empty string if no such element exists (avoids a NullPointerException)
    private static String getTagValue(String tagName, Element element) {
        Node node = element.getElementsByTagName(tagName).item(0);
        return node == null ? "" : node.getTextContent().trim();
    }
}
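
Because the whole tree is in memory, DOM code can also move upward from a node, not only downward. The following is a minimal sketch of that idea, assuming the same <book>/<title> layout as books.xml. The class DomNavigate and the method bookIdForTitle are hypothetical names chosen for illustration.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.io.StringReader;

public class DomNavigate {

    // Finds the <title> whose text matches, then walks UP to its parent <book>
    // and returns that element's id attribute. Returns null if nothing matches.
    public static String bookIdForTitle(String xml, String wantedTitle) {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        NodeList titles = document.getElementsByTagName("title");
        for (int i = 0; i < titles.getLength(); i++) {
            Node title = titles.item(i);
            if (wantedTitle.equals(title.getTextContent().trim())) {
                Node parent = title.getParentNode();
                if (parent instanceof Element) {
                    return ((Element) parent).getAttribute("id");
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<catalog>"
                + "<book id=\"bk101\"><title>XML Developer's Guide</title></book>"
                + "<book id=\"bk102\"><title>Midnight Rain</title></book>"
                + "</catalog>";
        System.out.println(bookIdForTitle(xml, "Midnight Rain"));
    }
}
```

A streaming parser cannot do this lookup directly, since by the time it reaches <title> the enclosing <book> start tag has already gone by. The handler would have to remember the id itself.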

Example 2: Creating and Writing XML with DOM

This example creates a new XML document from scratch and writes it to a file.

import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.File;
public class DomWriterExample {
    public static void main(String[] args) {
        try {
            // 1. Create a DocumentBuilderFactory and DocumentBuilder
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // 2. Create a new Document
            Document document = builder.newDocument();
            // 3. Create the root element
            Element rootElement = document.createElement("users");
            document.appendChild(rootElement);
            // 4. Create a child element
            Element user1 = document.createElement("user");
            user1.setAttribute("id", "1");
            rootElement.appendChild(user1);
            // 5. Create elements for user details
            Element name1 = document.createElement("name");
            name1.appendChild(document.createTextNode("Alice"));
            user1.appendChild(name1);
            Element email1 = document.createElement("email");
            email1.appendChild(document.createTextNode("alice@example.com"));
            user1.appendChild(email1);
            // 6. Add another user
            Element user2 = document.createElement("user");
            user2.setAttribute("id", "2");
            rootElement.appendChild(user2);
            Element name2 = document.createElement("name");
            name2.appendChild(document.createTextNode("Bob"));
            user2.appendChild(name2);
            Element email2 = document.createElement("email");
            email2.appendChild(document.createTextNode("bob@example.com"));
            user2.appendChild(email2);
            // 7. Write the content into an XML file
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
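            transformer.setOutputProperty(OutputKeys.INDENT, "yes"); // Pretty print
            // Implementation-specific key understood by Xalan and the JDK's built-in transformer; others may ignore it
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");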
            DOMSource source = new DOMSource(document);
            StreamResult result = new StreamResult(new File("users.xml"));
            transformer.transform(source, result);
            System.out.println("File saved!");
        } catch (ParserConfigurationException | TransformerException e) {
            e.printStackTrace();
        }
    }
}
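
Reading and writing can be combined: parse a document, change it in place, and serialize the result. The sketch below assumes the books.xml layout and uses hypothetical names (DomModifyExample, setPrice). It writes to a String rather than a file so the result is easy to inspect.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

public class DomModifyExample {

    // Sets the <price> text of the book with the given id and returns the updated XML.
    public static String setPrice(String xml, String bookId, String newPrice) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList books = document.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                if (bookId.equals(book.getAttribute("id"))) {
                    Node price = book.getElementsByTagName("price").item(0);
                    if (price != null) {
                        price.setTextContent(newPrice); // in-place edit of the tree
                    }
                }
            }

            // Serialize the modified tree back to text
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog>"
                + "<book id=\"bk101\"><price>44.95</price></book>"
                + "<book id=\"bk102\"><price>5.95</price></book>"
                + "</catalog>";
        System.out.println(setPrice(xml, "bk101", "49.95"));
    }
}
```

Only the matching book is touched. The rest of the tree is serialized unchanged.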

SAX Parsing (Event-Driven)

SAX is a read-only parser. It doesn't create a tree in memory. Instead, it fires events as it encounters different parts of the XML document (e.g., start of an element, end of an element, text content). You implement a ContentHandler to react to these events.

Key Classes and Interfaces:

  • SAXParser: The main parser class.
  • DefaultHandler: A convenient base class you can extend to override event-handling methods.
  • startElement(): Called when the parser encounters the start of an element (e.g., <book>).
  • endElement(): Called when the parser encounters the end of an element (e.g., </book>).
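  • characters(): Called with the text content between an element's tags. The parser may deliver one run of text in several calls, so handlers should accumulate it in a buffer.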
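
The order in which these callbacks fire can be seen with a small tracing handler. This is a minimal sketch, with SaxEventTrace and trace as hypothetical names. It buffers text in characters() and reports it when the element ends, which is the usual way to cope with text arriving in more than one chunk.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SaxEventTrace {

    // Parses the XML and returns the sequence of events seen by the handler.
    public static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                text.setLength(0); // discard whitespace seen before this tag
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called more than once per text run
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    events.add("text:" + value);
                }
                text.setLength(0);
                events.add("end:" + qName);
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<a><b>hi</b></a>").forEach(System.out::println);
    }
}
```

The handler never sees the document as a whole. Any context it needs, such as which element the current text belongs to, has to be kept in its own fields.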

Example: Reading XML with SAX

This example reads the same books.xml as the DOM reader and prints each book's ID, author, and title, but it never holds the whole document in memory.

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.*;
import java.io.File;
import java.io.IOException;
public class SaxReaderExample {
    public static void main(String[] args) {
        try {
            // 1. Create a SAXParserFactory and SAXParser
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // 2. Create a handler (our custom class)
            DefaultHandler handler = new BookHandler();
            // 3. Parse the file with the handler
            saxParser.parse(new File("books.xml"), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }
}
// Custom handler to process the XML events
class BookHandler extends DefaultHandler {
    private boolean inBook = false;
    private boolean inAuthor = false;
    private boolean inTitle = false;
    private StringBuilder currentValue = new StringBuilder();
    // Called when the parser starts parsing an element
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // XML names are case-sensitive, so compare with equals() rather than equalsIgnoreCase()
        if (qName.equals("book")) {
            inBook = true;
            System.out.println("\nFound a book with ID: " + attributes.getValue("id"));
        } else if (inBook && qName.equals("author")) {
            inAuthor = true;
        } else if (inBook && qName.equals("title")) {
            inTitle = true;
        }
    }
    // Called with element text; may fire several times for one text run, so append to a buffer
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (inAuthor || inTitle) {
            currentValue.append(ch, start, length);
        }
    }
    // Called when the parser ends parsing an element
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equals("book")) {
            inBook = false;
        } else if (inBook && qName.equals("author")) {
            inAuthor = false;
            System.out.println("Author: " + currentValue.toString().trim());
            currentValue.setLength(0); // Clear the StringBuilder
        } else if (inBook && qName.equals("title")) {
            inTitle = false;
            System.out.println("Title: " + currentValue.toString().trim());
            currentValue.setLength(0); // Clear the StringBuilder
        }
    }
}

Modern Alternative: Jackson for XML (Recommended for new projects)

For many applications, especially those dealing with data binding (converting XML to/from Java objects), a library like Jackson with its jackson-dataformat-xml module is a much more convenient and powerful choice.


Step 1: Add Dependency (Maven)

<dependency>
    <groupId>com.fasterxml.jackson.dataformat</groupId>
    <artifactId>jackson-dataformat-xml</artifactId>
    <version>2.15.2</version> <!-- Use the latest version -->
</dependency>

Step 2: Create Java Classes (POJOs)

Create classes that mirror the XML structure. Jackson will map between them automatically.

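import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlElementWrapper;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlProperty;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlRootElement;
import java.util.List;
// Catalog.java: maps to the <catalog> root element
@JacksonXmlRootElement(localName = "catalog")
public class Catalog {
    // The <book> elements sit directly under <catalog> with no wrapper element,
    // so list wrapping must be disabled for them to bind correctly
    @JacksonXmlElementWrapper(useWrapping = false)
    @JacksonXmlProperty(localName = "book")
    private List<Book> book;
    // Getters and Setters
    public List<Book> getBook() { return book; }
    public void setBook(List<Book> book) { this.book = book; }
}
// Book.java (a separate file, since each public class needs its own): maps to the <book> element.
// ignoreUnknown skips elements that have no matching field, such as <publish_date>
@JsonIgnoreProperties(ignoreUnknown = true)
public class Book {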
    @JacksonXmlProperty(isAttribute = true) // Maps to the 'id' attribute
    private String id;
    @JsonProperty("author") // Maps to <author> tag
    private String author;
    @JsonProperty("title")
    private String title;
    @JsonProperty("genre")
    private String genre;
    @JsonProperty("price")
    private double price;
    // Getters and Setters
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getGenre() { return genre; }
    public void setGenre(String genre) { this.genre = genre; }
    public double getPrice() { return price; }
    public void setPrice(double price) { this.price = price; }
}

Step 3: Use Jackson to Parse XML

import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.io.File;
import java.util.List;
public class JacksonXmlExample {
    public static void main(String[] args) {
        try {
            // Create an XmlMapper instance
            XmlMapper xmlMapper = new XmlMapper();
            // Read the XML file into a Catalog object
            Catalog catalog = xmlMapper.readValue(new File("books.xml"), Catalog.class);
            // Now you can work with the Java objects
            List<Book> books = catalog.getBook();
            for (Book book : books) {
                System.out.println("Book Title: " + book.getTitle());
                System.out.println("  - Author: " + book.getAuthor());
                System.out.println("  - Price: $" + book.getPrice());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Summary: Which One to Choose?

Feature | DOM (javax.xml) | SAX (org.xml.sax) | Jackson (com.fasterxml.jackson)
Memory Usage | High (loads entire doc) | Very low (streaming) | Medium (loads into POJOs)
Read/Write | Read and write | Read-only | Read and write
Navigation | Full, random access (tree) | Sequential only | Via Java object fields
Ease of Use | Moderate for simple tasks, complex for modifications | Complex, requires state management | Very easy (with POJOs)
Best For | Small/medium docs that need modification or complex queries | Very large XML files where memory is a concern | Most modern applications, especially when binding XML to Java objects (data binding)