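Of course! Strictly speaking, wdFormatPDF is a constant from Microsoft Word's automation object model (the WdSaveFormat value that tells Word to save a document as PDF), so "python wdformatpdf" most likely means converting a Word document to PDF from Python. The phrase could also be read more loosely, so I'll cover the most common interpretations and give a Python solution for each.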

Interpretation 1: Convert a Word Document (.docx) to a Formatted PDF
This is the most likely meaning. You want to take a Word file and convert it to a PDF, preserving the formatting (fonts, layout, images, etc.).
A pure-Python route exists (python-docx to read the Word document, reportlab to generate the PDF), but it cannot reproduce Word's layout. The simpler and more reliable approach is to use a library that automates a real word processor (Microsoft Word on Windows or macOS, or LibreOffice on Linux) and lets it do the conversion.
Here are the best methods:
Method 1: Using docx2pdf (The Easiest Way)
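This library is a thin wrapper around Microsoft Word: it drives the installed application (through COM on Windows, through scripting on macOS) and asks it to export the PDF. Because Word itself does the conversion, the output matches what Word's own "Save as PDF" produces. It does not use LibreOffice and does not run on Linux.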
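The name wdFormatPDF comes from this layer: in Word's object model it is the WdSaveFormat constant (value 17) passed to a document's SaveAs method. Below is a minimal sketch of calling Word directly through COM, assuming Windows, an installed copy of Word, and the pywin32 package. The function name `word_to_pdf` is hypothetical, and the code shows the general technique, not necessarily how any particular library implements it.

```python
from pathlib import Path

# WdSaveFormat.wdFormatPDF in Word's object model
WD_FORMAT_PDF = 17


def word_to_pdf(docx_path, pdf_path=None):
    """Ask Microsoft Word (via COM) to save a document as PDF. Windows only."""
    # Imported lazily: pywin32 is only available on Windows
    import win32com.client

    # Word expects absolute paths
    src = Path(docx_path).resolve()
    dst = Path(pdf_path).resolve() if pdf_path else src.with_suffix(".pdf")

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(str(src))
        try:
            doc.SaveAs(str(dst), FileFormat=WD_FORMAT_PDF)
        finally:
            doc.Close(False)  # close without saving changes to the .docx
    finally:
        word.Quit()
    return dst
```

Calling `word_to_pdf("my_report.docx")` would write `my_report.pdf` next to the source file. The nested try/finally blocks are there so a failed conversion does not leave a hidden Word process running.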

Installation:
You'll need Microsoft Word installed on your system; docx2pdf works on Windows and macOS only.
pip install docx2pdf
Usage:
The library is very straightforward.
from docx2pdf import convert
# --- Convert a single file ---
# Converts 'my_report.docx' to 'my_report.pdf' in the same directory
convert("my_report.docx")
# --- Convert a file to a different output directory ---
# Converts 'input.docx' and saves the result as 'output.pdf' in the 'pdfs' folder
convert("input.docx", "pdfs/output.pdf")
# --- Convert all .docx files in a directory ---
# Converts every .docx file in the 'word_docs' folder to the 'pdf_output' folder
convert("word_docs/", "pdf_output/")
Pros:
- Extremely easy to use.
- High-fidelity conversion because it uses a real word processor.
- Handles complex layouts, headers, footers, and images well.
Cons:
- Requires a heavy dependency (Microsoft Word) to be installed on the machine, so it is not an option on Linux servers.
- Can be slower as it has to launch the external application.
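If Word is not available (for example on a Linux server), LibreOffice's headless mode is a commonly used alternative. Here is a minimal sketch, assuming the `soffice` binary is on the PATH. The helper names `build_soffice_command` and `libreoffice_to_pdf` are hypothetical, and the 120-second timeout is an arbitrary choice.

```python
import shutil
import subprocess
from pathlib import Path


def build_soffice_command(docx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line for a headless PDF conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def libreoffice_to_pdf(docx_path, out_dir="."):
    """Convert one document with LibreOffice and return the expected PDF path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH; is LibreOffice installed?")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if LibreOffice exits with an error
    subprocess.run(build_soffice_command(docx_path, out_dir), check=True, timeout=120)
    # LibreOffice names the output after the input file's stem
    return out_dir / (Path(docx_path).stem + ".pdf")
```

LibreOffice renders the document with its own layout engine, so the result can differ slightly from what Word would produce, especially when the document uses fonts that are not installed on the server.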
Method 2: Using python-docx and reportlab (The "Pure Python" Way)
This method doesn't require an external office suite. It reads the Word document structure with python-docx and then draws the content onto a PDF canvas using reportlab. This method is more complex and does not perfectly replicate all Word formatting.
Installation:
pip install python-docx reportlab
Usage:
This is a simplified example. A full converter would be very complex.
from docx import Document
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
from reportlab.lib.units import inch
def convert_docx_to_pdf_simple(docx_path, pdf_path):
    # Load the .docx file
    doc = Document(docx_path)

    # Create a PDF canvas
    c = canvas.Canvas(pdf_path, pagesize=letter)
    _, height = letter

    # Start one inch from the left edge and one inch below the top of the page
    text_object = c.beginText(inch, height - inch)
    text_object.setFont("Helvetica", 12)  # Default font; Word fonts are not mapped

    for paragraph in doc.paragraphs:
        # This is a very basic implementation. It doesn't handle:
        # - Different fonts, sizes, or colors
        # - Bold, italic, or underline
        # - Lists
        # - Images
        # - Tables
        # - Page breaks
        # - Text wrapping
        # textLine() writes the text and moves the cursor down one line by itself
        text_object.textLine(paragraph.text)

    # Draw the text onto the canvas
    c.drawText(text_object)
    c.save()

# --- Example Usage ---
convert_docx_to_pdf_simple("my_report.docx", "my_report_simple.pdf")
print("Simple PDF created.")
Pros:
- No external dependencies like Microsoft Word.
- Lightweight and fast.
Cons:
- Crucially: It does not preserve formatting. Fonts, styles, and layouts will be lost or simplified.
- Very complex to implement correctly for a real-world document.
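Some of the gaps listed in the code comments above (text wrapping, page breaks, bold and italic) can be narrowed by using reportlab's higher-level platypus layer instead of drawing on a raw canvas. This is a minimal sketch, not a full converter. The names `run_to_markup` and `convert_docx_to_pdf_flowing` are hypothetical, and formatting inherited from Word styles is still ignored.

```python
from xml.sax.saxutils import escape


def run_to_markup(text, bold=False, italic=False):
    """Wrap escaped text in the <b>/<i> tags that reportlab Paragraphs understand."""
    markup = escape(text)
    if italic:
        markup = f"<i>{markup}</i>"
    if bold:
        markup = f"<b>{markup}</b>"
    return markup


def convert_docx_to_pdf_flowing(docx_path, pdf_path):
    # Imported lazily so run_to_markup can be used without these packages
    from docx import Document
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    body_style = getSampleStyleSheet()["Normal"]
    story = []
    for paragraph in Document(docx_path).paragraphs:
        # run.bold / run.italic are None when inherited from a style;
        # this sketch treats "inherited" as "off"
        markup = "".join(
            run_to_markup(run.text, bold=bool(run.bold), italic=bool(run.italic))
            for run in paragraph.runs
            if run.text
        )
        if markup.strip():
            story.append(Paragraph(markup, body_style))
        else:
            # Empty Word paragraphs become a little vertical space
            story.append(Spacer(1, 12))

    # build() wraps long lines and starts new pages as needed
    SimpleDocTemplate(pdf_path, pagesize=letter).build(story)
```

Tables, images, lists, and Word's own fonts are still not handled, which is why this route remains a fallback and not a replacement for a real word processor.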
Interpretation 2: Format a PDF File (e.g., add text, fill forms)
If you mean you want to modify or format an existing PDF file (like filling out a form or adding a watermark), you need a different set of tools.
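A widely used library for this is PyPDF2. Note that PyPDF2 is no longer developed under that name: the project continues as pypdf, which provides the same PdfReader and PdfWriter classes used below, so new projects may prefer `pip install pypdf` and `from pypdf import PdfReader, PdfWriter`.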
Installation:
pip install PyPDF2
Usage Example: Creating a Watermarked PDF
This example reads an existing PDF, adds a "DRAFT" watermark to every page, and saves a new PDF.
from PyPDF2 import PdfReader, PdfWriter
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter
import io
def add_watermark(input_pdf_path, output_pdf_path, watermark_text="DRAFT"):
    # 1. Create a one-page PDF in memory that holds only the watermark
    packet = io.BytesIO()
    can = canvas.Canvas(packet, pagesize=letter)
    # Set the font and color for the watermark
    can.setFont("Helvetica", 40)
    can.setFillColorRGB(0.9, 0.9, 0.9, alpha=0.5)  # Light grey, semi-transparent
    # Draw the text diagonally across the page
    can.saveState()
    can.translate(100, 100)
    can.rotate(45)
    can.drawString(0, 0, watermark_text)
    can.restoreState()
    can.save()
    packet.seek(0)

    # 2. Open the watermark and the original file
    watermark_page = PdfReader(packet).pages[0]
    original_pdf = PdfReader(input_pdf_path)
    writer = PdfWriter()

    # 3. Merge the watermark onto each page of the original
    for original_page in original_pdf.pages:
        original_page.merge_page(watermark_page)
        # Add the merged page to the writer
        writer.add_page(original_page)

    # 4. Write the result to a new file
    with open(output_pdf_path, "wb") as output_file:
        writer.write(output_file)

# --- Example Usage ---
add_watermark("my_report.pdf", "my_report_watermarked.pdf")
print("Watermarked PDF created.")
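The example above places the text at a fixed offset (100, 100) and a fixed 45-degree angle, which only looks centered on some page sizes. One possible refinement is to compute the placement from the page dimensions. This is a small sketch, and the helper name `diagonal_placement` is hypothetical.

```python
import math


def diagonal_placement(page_width, page_height):
    """Return (center_x, center_y, angle_degrees) for text along the page diagonal."""
    angle = math.degrees(math.atan2(page_height, page_width))
    return page_width / 2, page_height / 2, angle


# US Letter is 612 x 792 points
cx, cy, angle = diagonal_placement(612, 792)

# In add_watermark, these values could replace the fixed numbers:
#   can.translate(cx, cy)
#   can.rotate(angle)
#   can.drawCentredString(0, 0, watermark_text)
```

Using `drawCentredString` centers the text on the translated origin, so the watermark sits in the middle of the page regardless of how long the text is.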
Summary and Recommendation
| Task | Recommended Library | Ease of Use | Formatting Quality | Dependencies |
|---|---|---|---|---|
| Convert .docx to .pdf (Preserve formatting) | docx2pdf | Very easy | ★★★★★ (Perfect) | Microsoft Word |
| Convert .docx to .pdf (Simple, no external app) | python-docx + reportlab | Complex | ★☆☆☆☆ (Poor) | None |
| Modify / Format an existing .pdf | PyPDF2 | | ★★★★☆ (Good) | None |
For your request, docx2pdf is almost certainly the tool you are looking for, provided Microsoft Word is installed on the machine doing the conversion. It's the most direct way to get a high-quality Word-to-PDF conversion from Python.
