杰瑞科技汇

How to Convert Excel to PDF in Java

Converting an Excel file (.xls or .xlsx) to PDF in Java is a common task when generating reports, invoices, or any other document that needs a fixed layout and can be opened anywhere.


There are several excellent libraries to achieve this. I'll cover the most popular and effective ones, from the most feature-rich to simpler alternatives.

Here are the main options:

  1. Apache POI (for .xls/.xlsx) + iText / PDFBox (for PDF Generation): The classic, powerful, and flexible approach. You read the Excel file and then manually create the PDF by drawing the content. It gives you full control but requires more code.
  2. Aspose.Cells: A commercial library specifically designed for working with Excel files. It has a direct save() method to convert to PDF, making it incredibly easy and often producing high-fidelity results.
  3. Spire.XLS for Java: Another popular commercial library with a very simple API for converting Excel to PDF.
  4. EasyExcel (by Alibaba) + PDFBox: A good open-source combination. EasyExcel is fantastic for reading/writing large Excel files with low memory consumption. You would then use its data to generate a PDF with PDFBox.

Recommendation Summary

| Library | Type | Ease of Use | Performance | Fidelity (Layout) | Cost |
|---|---|---|---|---|---|
| Aspose.Cells | Commercial | ⭐⭐⭐⭐⭐ (Very Easy) | Excellent | ⭐⭐⭐⭐⭐ (Best) | Paid (Free Trial) |
| Spire.XLS | Commercial | ⭐⭐⭐⭐⭐ (Very Easy) | Excellent | ⭐⭐⭐⭐⭐ (Best) | Paid (Free Trial) |
| Apache POI + PDFBox | Open Source | ⭐⭐ (Complex) | Good | ⭐⭐⭐ (Manual Effort) | Free |
| EasyExcel + PDFBox | Open Source | ⭐⭐⭐ (Medium) | Excellent (for large files) | ⭐⭐⭐ (Manual Effort) | Free |

For most projects, Aspose.Cells is the best choice if budget allows due to its simplicity and high-quality output. For open-source projects, Apache POI + PDFBox is the standard, though it requires more work.


Method 1: Aspose.Cells (Recommended for Commercial Projects)

This is the simplest and most robust method. The library handles all the complexities of mapping Excel elements (cells, fonts, colors, images) to PDF.


Step 1: Add Dependency

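Add the Aspose.Cells JAR to your project. If you're using Maven, add the dependency below to your pom.xml. Aspose publishes its Java artifacts through its own Maven repository rather than Maven Central, so a matching <repositories> entry is also needed; the Aspose documentation lists the current URL.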

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-cells</artifactId>
    <version>24.4</version> <!-- Use the latest version -->
</dependency>

(You can download the JAR from the Aspose website if you're not using Maven. A free trial is available for evaluation.)

Step 2: Java Code

The code is remarkably simple.

import com.aspose.cells.PdfSaveOptions;
import com.aspose.cells.Workbook;

public class AsposeExcelToPdf {
    // The Workbook constructor and save(...) declare a checked Exception,
    // so main propagates it rather than hiding a failed conversion.
    public static void main(String[] args) throws Exception {
        // Define input and output file paths
        String excelFilePath = "input.xlsx";
        String pdfFilePath = "output.pdf";

        // 1. Load the Excel workbook
        Workbook workbook = new Workbook(excelFilePath);

        // 2. Save the workbook directly to PDF format.
        // PdfSaveOptions gives more control over the output; for example,
        // PDF/A compliance can be requested through pdfSaveOptions.setCompliance(...).
        PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
        workbook.save(pdfFilePath, pdfSaveOptions);

        System.out.println("Excel to PDF conversion successful: " + pdfFilePath);
    }
}

Pros:

  • Extremely simple, one-line conversion.
  • High fidelity; it preserves the look and feel of the Excel sheet very well.
  • Handles complex layouts, merged cells, and images effectively.

Cons:

  • It's a commercial library and requires a license for production use.

Method 2: Apache POI (for Excel) + PDFBox (for PDF) - Open Source

This approach gives you full control but requires more code. You read the Excel data with POI and then manually draw it onto a PDF document using PDFBox.

Step 1: Add Dependencies

Add both POI and PDFBox to your pom.xml:

<!-- Apache POI for Excel -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.2.5</version> <!-- Use the latest version -->
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.5</version>
</dependency>
<!-- PDFBox for PDF -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>3.0.2</version> <!-- Use the latest version -->
</dependency>

Step 2: Java Code

This example demonstrates a basic conversion. A production-ready version would need to handle more complex scenarios like cell spanning, different fonts, colors, etc.

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PoiExcelToPdf {
    public static void main(String[] args) {
        String excelFilePath = "input.xlsx";
        String pdfFilePath = "output.pdf";

        // XSSFWorkbook reads .xlsx only; use HSSFWorkbook or WorkbookFactory for .xls
        try (InputStream excelFile = new FileInputStream(excelFilePath);
             Workbook workbook = new XSSFWorkbook(excelFile);
             PDDocument pdfDocument = new PDDocument()) {

            // Get the first sheet from the Excel workbook
            Sheet sheet = workbook.getSheetAt(0);

            // DataFormatter renders each cell approximately as Excel displays it
            // (dates, numbers, booleans); passing the evaluator makes formula cells
            // yield their result instead of the formula text
            DataFormatter formatter = new DataFormatter();
            FormulaEvaluator evaluator = workbook.getCreationHelper().createFormulaEvaluator();

            // Create a new page for the PDF
            PDPage page = new PDPage();
            pdfDocument.addPage(page);

            // Start writing content to the PDF page
            try (PDPageContentStream contentStream = new PDPageContentStream(pdfDocument, page)) {
                // PDFBox 3.x: the Standard 14 fonts are created through Standard14Fonts.FontName
                contentStream.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA_BOLD), 12);
                contentStream.beginText();
                contentStream.newLineAtOffset(50, 750); // (x, y) coordinates

                // No pagination here: rows past the bottom of the page are drawn off-page
                for (Row row : sheet) {
                    StringBuilder line = new StringBuilder();
                    for (Cell cell : row) {
                        line.append(formatter.formatCellValue(cell, evaluator)).append("    ");
                    }
                    // showText rejects control characters such as tab and newline, so replace them;
                    // characters outside the font's encoding (e.g. CJK) still need an embedded font
                    contentStream.showText(line.toString().replaceAll("\\p{Cntrl}", " "));
                    contentStream.newLineAtOffset(0, -15); // Move to the next line
                }
                contentStream.endText();
            }

            // Save the PDF document
            pdfDocument.save(pdfFilePath);
            System.out.println("Excel to PDF conversion successful: " + pdfFilePath);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
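
PDFBox's showText(...) throws an IllegalArgumentException for any character the current font cannot encode, and for the built-in Helvetica fonts that includes the tab character. The following is a minimal sketch of one way to prepare cell text before drawing it, assuming plain ASCII output is acceptable. PdfTextSanitizer and its methods are hypothetical names, not part of POI or PDFBox, and the fixed-width padding only lines columns up when a monospaced font such as Courier is used. It needs Java 11 or later for String.repeat.

```java
import java.util.List;

/** Hypothetical helper: prepares cell text for one of the built-in PDF fonts. */
final class PdfTextSanitizer {

    private PdfTextSanitizer() {}

    /**
     * Replaces control characters with a space and every UTF-16 code unit
     * outside printable ASCII with '?'. This is deliberately conservative.
     */
    static String sanitize(String text) {
        if (text == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x20 || c == 0x7F) {
                out.append(' ');   // tab, newline, and other control characters
            } else if (c > 0x7E) {
                out.append('?');   // anything outside printable ASCII
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Pads or truncates a sanitized value to exactly width characters. */
    static String toColumn(String text, int width) {
        String clean = sanitize(text);
        if (clean.length() >= width) {
            return clean.substring(0, width);
        }
        return clean + " ".repeat(width - clean.length());
    }

    /** Joins the cells of one row into a single fixed-width line. */
    static String formatRow(List<String> cells, int width) {
        StringBuilder line = new StringBuilder();
        for (String cell : cells) {
            line.append(toColumn(cell, width));
        }
        return line.toString();
    }
}
```

A row would then be drawn with a single call such as contentStream.showText(PdfTextSanitizer.formatRow(cells, 20)). To keep non-ASCII text instead of replacing it, an embedded TrueType font is required rather than a built-in one.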

Pros:

  • Completely free and open-source.
  • Full control over the final PDF output.
  • POI is the industry standard for Java Excel manipulation.

Cons:

  • Very complex. You are responsible for layout, formatting, pagination, etc.
  • A lot of manual coding is required to match the Excel appearance.
  • Can be inefficient for very large Excel files if not handled carefully.
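
Pagination is the first piece of layout logic most conversions need, because a single page only holds a limited number of rows. The arithmetic can be sketched as below, assuming every row is one line of fixed height; PagePlanner is a hypothetical helper, and the numbers in the usage note assume a US Letter page (792 points high) with a 42-point margin and 15-point line spacing, which matches the y = 750 starting position used in the example above.

```java
/** Hypothetical helper: works out which page and baseline each row lands on. */
final class PagePlanner {

    private PagePlanner() {}

    /** Number of text lines that fit between the top and bottom margins. */
    static int linesPerPage(float pageHeight, float margin, float leading) {
        float usable = pageHeight - 2 * margin;
        if (leading <= 0 || usable < leading) {
            throw new IllegalArgumentException("page must have room for at least one line");
        }
        return (int) (usable / leading); // truncation equals floor for positive values
    }

    /** Number of pages needed for rowCount rows (ceiling division). */
    static int pageCount(int rowCount, int linesPerPage) {
        if (rowCount < 0 || linesPerPage <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and linesPerPage > 0");
        }
        return (rowCount + linesPerPage - 1) / linesPerPage;
    }

    /** Zero-based index of the page that holds the given zero-based row. */
    static int pageOf(int rowIndex, int linesPerPage) {
        return rowIndex / linesPerPage;
    }

    /** Baseline y-coordinate of a row; the first line of each page sits at pageHeight - margin. */
    static float baselineY(int rowIndex, int linesPerPage,
                           float pageHeight, float margin, float leading) {
        int lineOnPage = rowIndex % linesPerPage;
        return pageHeight - margin - lineOnPage * leading;
    }
}
```

With those assumed dimensions, linesPerPage(792f, 42f, 15f) gives 47 lines, so a 100-row sheet needs 3 pages. In the drawing loop, a new PDPage and content stream would be started whenever pageOf(rowIndex, 47) changes.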

Method 3: EasyExcel (by Alibaba) + PDFBox - Open Source (for Large Files)

This is a great alternative if you are dealing with very large Excel files (e.g., millions of rows) because EasyExcel uses a SAX (event-based) model to read files with very low memory consumption.

Step 1: Add Dependencies

Add EasyExcel and PDFBox to your pom.xml:

<!-- EasyExcel for reading large Excel files -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>easyexcel</artifactId>
    <version>3.3.2</version> <!-- Use the latest version -->
</dependency>
<!-- PDFBox for PDF generation -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>3.0.2</version>
</dependency>

Step 2: Java Code

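The PDF logic is similar to the POI+PDFBox example, but the rows are delivered by a listener passed to EasyExcel.read(...). When no model class is given, EasyExcel hands each row to the listener as a Map from column index to cell text.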

import com.alibaba.excel.EasyExcel;
import com.alibaba.excel.context.AnalysisContext;
import com.alibaba.excel.event.AnalysisEventListener;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class EasyExcelToPdf {
    public static void main(String[] args) {
        String excelFilePath = "input_large.xlsx";
        String pdfFilePath = "output_large.pdf";

        // EasyExcel streams the file row by row. Note that collecting every row in a
        // list, as done here for simplicity, still keeps the whole sheet in memory.
        // By default the first row is treated as a header and is not passed to invoke().
        List<Map<Integer, String>> data = new ArrayList<>();
        EasyExcel.read(excelFilePath, new AnalysisEventListener<Map<Integer, String>>() {
            @Override
            public void invoke(Map<Integer, String> row, AnalysisContext context) {
                data.add(row); // key = column index, value = cell text (may be null)
            }

            @Override
            public void doAfterAllAnalysed(AnalysisContext context) {
                // Called once after all rows have been read
            }
        }).sheet().doRead();

        // Now, use PDFBox to write the data to a PDF
        try (PDDocument pdfDocument = new PDDocument()) {
            PDPage page = new PDPage();
            pdfDocument.addPage(page);

            try (PDPageContentStream contentStream = new PDPageContentStream(pdfDocument, page)) {
                // PDFBox 3.x: the Standard 14 fonts are created through Standard14Fonts.FontName
                contentStream.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA_BOLD), 12);
                contentStream.beginText();
                contentStream.newLineAtOffset(50, 750);

                // No pagination here: rows past the bottom of the page are drawn off-page
                for (Map<Integer, String> rowData : data) {
                    StringBuilder line = new StringBuilder();
                    for (String cellValue : rowData.values()) {
                        line.append(cellValue == null ? "" : cellValue).append("    ");
                    }
                    // showText rejects control characters such as tab, so replace them
                    contentStream.showText(line.toString().replaceAll("\\p{Cntrl}", " "));
                    contentStream.newLineAtOffset(0, -15);
                }
                contentStream.endText();
            }

            pdfDocument.save(pdfFilePath);
            System.out.println("Large Excel to PDF conversion successful: " + pdfFilePath);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
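
The listener above adds every row to data, so the whole sheet still sits in memory before PDF generation starts. One way to keep the benefit of event-based reading is to buffer a fixed number of rows and hand each full batch to a consumer that writes it out straight away. The sketch below shows only that buffering step; RowBatcher is a hypothetical helper, not part of EasyExcel, and whether peak memory actually drops also depends on how the PDF side buffers its pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers rows and passes them on in fixed-size batches. */
final class RowBatcher<T> {
    private final int batchSize;
    private final Consumer<List<T>> downstream;
    private List<T> buffer;

    RowBatcher(int batchSize, Consumer<List<T>> downstream) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.downstream = downstream;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Call once per row, for example from a listener's invoke(...) callback. */
    void accept(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Call once at the end, for example from doAfterAllAnalysed(...), to emit the remainder. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        buffer = new ArrayList<>(batchSize); // start a fresh buffer before handing the batch on
        downstream.accept(batch);
    }
}
```

In the listener, invoke(...) would call batcher.accept(row) and doAfterAllAnalysed(...) would call batcher.flush(); the consumer would draw one batch of rows, for instance one PDF page per batch.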

Pros:

  • Excellent performance and low memory usage for very large Excel files.
  • Free and open-source.

Cons:

  • Still requires manual PDF generation logic (like POI+PDFBox).
  • The EasyExcel event listener model can be less intuitive for simple, one-off conversions.

Final Recommendation

  • For most applications: Use Aspose.Cells. The development time saved and the high-quality output are worth the cost for most commercial projects.
  • For open-source projects or tight budgets: Use Apache POI + PDFBox if the Excel files are not extremely large and you have the time to implement the layout logic. Use EasyExcel + PDFBox if you are dealing with massive Excel files that cannot be loaded into memory.