
How Does iText Convert HTML to PDF?

Converting HTML with iText is a common and powerful way to generate PDFs. It lets you use existing web development skills (HTML, CSS) to create complex, styled documents programmatically.

Here's a comprehensive guide covering the key concepts, a practical example, and best practices.

The Core Concept: The HtmlConverter

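The main class responsible for converting HTML to PDF in modern iText (7 and later) is com.itextpdf.html2pdf.HtmlConverter. It's designed to be straightforward: you provide an HTML source (a String, a File, or an InputStream) and a destination (a File, an OutputStream, a PdfWriter, or a PdfDocument), and it writes the PDF. Two related methods, convertToDocument and convertToElements, return iText layout objects instead, for cases where you want to keep working on the result.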


Key Prerequisites

  1. iText 7 HTML add-on: You need the specific library for HTML processing.
  2. A valid HTML file: iText supports a subset of HTML5 and CSS rather than everything a browser does, so simple, well-formed markup converts most reliably.
  3. A license: iText is dual-licensed. The AGPL version is free to use, but broadly speaking it requires you to release the source of any application built on it under the AGPL as well. If you cannot do that, you need a commercial license.

Maven Dependencies

Add these to your pom.xml file. This includes the core iText 7 PDF library and the HTML-to-PDF converter.

<dependencies>
    <!-- iText 7 Core -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itext7-core</artifactId>
        <version>7.2.5</version> <!-- Use the latest version -->
        <type>pom</type>
    </dependency>
    <!-- HTML to PDF Converter -->
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>html2pdf</artifactId>
        <version>4.0.5</version> <!-- html2pdf has its own version numbers; 4.0.5 is the release paired with iText 7.2.5 -->
    </dependency>
</dependencies>

Simple Example: Converting a String to PDF

This is the most basic example. We'll create a simple HTML string and convert it directly into a PDF file.

SimpleHtmlToPdf.java

import com.itextpdf.html2pdf.HtmlConverter;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class SimpleHtmlToPdf {
    public static void main(String[] args) {
        // 1. Define the HTML content
        String html = "<h1>Hello from iText!</h1>"
                + "<p>This PDF was generated from an HTML string using iText 7.</p>"
                + "<ul>"
                + "   <li>List item 1</li>"
                + "   <li>List item 2</li>"
                + "</ul>";
        // 2. Define the output file path
        Path outputPath = Paths.get("simple-example.pdf");
        // 3. Convert HTML to PDF. An HTML String is written to an OutputStream;
        //    the File-to-File overload expects an HTML file as its input.
        try (OutputStream out = Files.newOutputStream(outputPath)) {
            HtmlConverter.convertToPdf(html, out);
            System.out.println("PDF created successfully at: " + outputPath.toAbsolutePath());
        } catch (IOException e) {
            System.err.println("Error creating PDF: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

To run this:

  1. Save the code as SimpleHtmlToPdf.java.
  2. Make sure your pom.xml is configured with the dependencies above.
  3. Compile and run it using Maven (mvn compile exec:java -Dexec.mainClass="SimpleHtmlToPdf") or your IDE.
  4. A file named simple-example.pdf will be created in your project's root directory.

Advanced Example: Using an HTML File and CSS

In a real-world scenario, your HTML and CSS will likely be in separate files. This example shows how to convert an external HTML file and apply an external CSS stylesheet.

Project Structure:

your-project/
├── src/
│   └── main/
│       ├── java/
│       │   └── AdvancedHtmlToPdf.java
│       └── resources/
│           ├── invoice.html
│           └── styles.css
└── pom.xml

The HTML File (src/main/resources/invoice.html)

<!DOCTYPE html>
<html>
<head>
    <title>Invoice #123</title>
    <link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
    <div class="invoice-header">
        <h1>INVOICE</h1>
        <p>Invoice #: 123</p>
        <p>Date: October 26, 2025</p>
    </div>
    <table class="invoice-table">
        <thead>
            <tr>
                <th>Item</th>
                <th>Quantity</th>
                <th>Price</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Professional Service</td>
                <td>10 hours</td>
                <td>$1,500.00</td>
            </tr>
            <tr>
                <td>Hardware</td>
                <td>1 unit</td>
                <td>$500.00</td>
            </tr>
        </tbody>
        <tfoot>
            <tr>
                <td colspan="2"><strong>Total</strong></td>
                <td><strong>$2,000.00</strong></td>
            </tr>
        </tfoot>
    </table>
</body>
</html>

The CSS File (src/main/resources/styles.css)

body {
    font-family: Arial, sans-serif;
    margin: 40px;
    color: #333;
}
.invoice-header {
    border-bottom: 2px solid #000;
    padding-bottom: 10px;
    margin-bottom: 20px;
}
.invoice-table {
    width: 100%;
    border-collapse: collapse;
    margin-top: 20px;
}
.invoice-table th, .invoice-table td {
    border: 1px solid #ddd;
    padding: 8px;
    text-align: left;
}
.invoice-table th {
    background-color: #f2f2f2;
    font-weight: bold;
}
.invoice-table tfoot td {
    font-weight: bold;
    text-align: right;
}

The Java Code (src/main/java/AdvancedHtmlToPdf.java)

import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;
public class AdvancedHtmlToPdf {
    public static void main(String[] args) {
        // Locate the HTML and CSS files on the classpath
        ClassLoader classLoader = AdvancedHtmlToPdf.class.getClassLoader();
        URL htmlResource = classLoader.getResource("invoice.html");
        URL cssResource = classLoader.getResource("styles.css");
        if (htmlResource == null || cssResource == null) {
            System.err.println("Error: Could not find HTML or CSS resource files.");
            return;
        }
        File outputFile = new File("advanced-example.pdf");
        try {
            // toURI() keeps paths with spaces or non-ASCII characters intact.
            // This assumes the resources are plain files, not entries inside a JAR.
            File htmlFile = Paths.get(htmlResource.toURI()).toFile();
            File cssFile = Paths.get(cssResource.toURI()).toFile();
            // The stylesheet is picked up through the <link> tag in the HTML;
            // the base URI tells iText which directory that relative href refers to.
            HtmlConverter.convertToPdf(
                htmlFile,
                outputFile,
                new ConverterProperties().setBaseUri(cssFile.getParent())
            );
            System.out.println("Advanced PDF created successfully at: " + outputFile.getAbsolutePath());
        } catch (IOException | URISyntaxException e) {
            System.err.println("Error creating PDF: " + e.getMessage());
            e.printStackTrace();
        }
    }
}

Explanation of the Java Code:

  • getResource(...): Looks the files up on the classpath, which is where Maven places the contents of src/main/resources.
  • ConverterProperties: The class that carries the conversion settings (base URI, fonts, and so on).
  • setBaseUri(...): This is very important. It tells iText where to find relative resources like CSS files, images, and fonts. The CSS file itself is not passed to the converter; it is loaded because the HTML links to it. We set the base URI to the parent directory of our CSS file, which is also where the HTML file is located.
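
To see why the base URI matters, the sketch below uses only java.net.URI from the JDK to show how a relative href such as styles.css resolves against a base. It illustrates generic URI resolution, not iText's internal resolver, and the /project/resources/ location is a made-up example.

```java
import java.net.URI;

class BaseUriDemo {
    // Resolve a relative reference (e.g. an href taken from the HTML) against a base URI.
    static String resolve(String baseUri, String relative) {
        return URI.create(baseUri).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base directory; note the trailing slash.
        System.out.println(resolve("file:/project/resources/", "styles.css"));
        // Without the trailing slash, the last path segment is treated as a file name and replaced.
        System.out.println(resolve("file:/project/resources", "styles.css"));
    }
}
```

The first call yields file:/project/resources/styles.css and the second file:/project/styles.css. Whether a particular library repairs a missing slash for you is an implementation detail, so it is safest to pass a base that unambiguously names the directory holding your HTML.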

Important Concepts and Customization

Handling Images

To include images in your HTML (<img src="my-image.png">), you must provide the correct baseUri in the ConverterProperties so iText can locate the image file.
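
If keeping image files next to the HTML is inconvenient, one alternative is to embed the image bytes directly in the src attribute as a base64 data URI. The sketch below builds such a tag with the JDK only; it assumes your converter version accepts data: URIs for images, which is worth verifying, and the imgTag helper and placeholder bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DataUriImage {
    // Build an <img> tag whose src embeds the image bytes as a base64 data URI.
    static String imgTag(byte[] imageBytes, String mimeType) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        return "<img src=\"data:" + mimeType + ";base64," + base64 + "\">";
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real image data read from disk.
        byte[] fakeImage = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(imgTag(fakeImage, "image/png"));
    }
}
```

In real use you would read the bytes with Files.readAllBytes(...) and splice the resulting tag into the HTML string before conversion. The trade-off is a larger HTML document in exchange for not depending on the base URI.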

Handling Fonts

iText can use system fonts or load custom fonts (TTF, OTF). You can register fonts in the ConverterProperties.

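// Needs: com.itextpdf.html2pdf.ConverterProperties,
//        com.itextpdf.html2pdf.resolver.font.DefaultFontProvider,
//        com.itextpdf.layout.font.FontProvider, java.io.FileOutputStream, java.io.OutputStream
// Register a custom font directory (standard PDF fonts off, shipped fonts on, system fonts off)
ConverterProperties properties = new ConverterProperties();
FontProvider fontProvider = new DefaultFontProvider(false, true, false);
fontProvider.addDirectory("/path/to/fonts");
properties.setFontProvider(fontProvider);
// html is a String here, so the destination is an OutputStream rather than a File
try (OutputStream out = new FileOutputStream(outputFile)) {
    HtmlConverter.convertToPdf(html, out, properties);
}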

Page Setup (Margins, Size, Orientation)

Page size and orientation are set on the PdfDocument before conversion. Page margins are most easily set in the HTML's own CSS with an @page rule, for example @page { margin: 30pt 20pt; }. An @page rule in the CSS may also override the default page size set in code.

ConverterProperties properties = new ConverterProperties();
PdfDocument pdfDocument = new PdfDocument(new PdfWriter(outputFile));
// PageSize.A4 for portrait, PageSize.A4.rotate() for landscape
pdfDocument.setDefaultPageSize(PageSize.A4.rotate());
// convertToDocument returns the layout Document; closing it finishes the PDF
Document document = HtmlConverter.convertToDocument(html, pdfDocument, properties);
document.close();

JavaScript and Event Handlers

iText does not execute JavaScript. It will render the elements but ignore any onclick or other event handlers.
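
Because scripts never run, anything a page would normally compute in the browser has to be baked into the HTML before conversion. A minimal sketch of doing that in plain Java follows; the StaticRows class, its helpers, and the prices-in-cents representation are illustrative assumptions, not part of iText.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class StaticRows {
    // Minimal HTML escaping for text placed inside an element.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Format an amount in cents as US dollars, e.g. 150000 -> $1,500.00.
    static String format(long cents) {
        return String.format(Locale.US, "$%,d.%02d", cents / 100, cents % 100);
    }

    // Render item/price pairs as table rows, followed by a computed total row.
    static String rows(Map<String, Integer> pricesInCents) {
        StringBuilder sb = new StringBuilder();
        long total = 0;
        for (Map.Entry<String, Integer> e : pricesInCents.entrySet()) {
            total += e.getValue();
            sb.append("<tr><td>").append(escape(e.getKey())).append("</td><td>")
              .append(format(e.getValue())).append("</td></tr>");
        }
        sb.append("<tr><td>Total</td><td>").append(format(total)).append("</td></tr>");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> items = new LinkedHashMap<>();
        items.put("Professional Service", 150000);
        items.put("Hardware", 50000);
        System.out.println("<table>" + rows(items) + "</table>");
    }
}
```

The resulting string can be inserted into the HTML body and passed to HtmlConverter like any other static markup. For larger documents a template engine does the same job with less string handling.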

Unsupported HTML/CSS

iText's HTML support is powerful but not a complete browser engine. It supports most common tags and CSS properties, but you may encounter some limitations. For complex layouts, sometimes simpler CSS works better than highly complex, browser-specific hacks.


iText 7 vs. iText 5 (Legacy)

Feature        iText 7 (Recommended)                  iText 5 (Legacy)
HTML to PDF    HtmlConverter (in html2pdf add-on)     XMLWorkerHelper
Architecture   Modern, modular, immutable objects     Older, mutable objects
License        AGPL or Commercial                     AGPL or Commercial
API            More fluent, builder-style patterns    More verbose, traditional setters/getters
Support        Actively developed                     Bug fixes only

If you are starting a new project, use iText 7. The HtmlConverter is significantly more robust and easier to use than the old XMLWorkerHelper.

Summary

  1. Add Dependencies: Include itext7-core and html2pdf in your Maven/Gradle setup.
  2. Use HtmlConverter: The main entry point for conversion.
  3. Provide HTML Source: Can be a String, File, or InputStream.
  4. Configure with ConverterProperties: This is key for advanced features.
    • Use setBaseUri() for relative paths (CSS, images).
    • Use setFontProvider() for custom fonts.
  5. Handle Output: Write to a File or an OutputStream.
  6. Remember the License: Be mindful of the AGPL's requirements for commercial applications.