Of course! The load_workbook function is a cornerstone of the openpyxl library in Python, which is used for reading and writing Excel (.xlsx) files.

Here's a comprehensive guide covering everything from the basics to more advanced usage.
The Basics: What is load_workbook?
load_workbook() is a function from the openpyxl library that opens an existing Excel file from your disk and loads it into memory, allowing you to read its data, modify it, and then save it back.
Key things to know:
- File Format: It works with modern
.xlsxfiles, not the older.xlsformat. - Read-Only Mode: By default, it loads the file in a read-write mode. For better performance when you only need to read data, you can use
read_only=True. - Data Only Mode: If you only need the cell values and don't care about formatting (like colors, fonts, borders), you can use
data_only=True. This can significantly speed up loading.
Prerequisites: Installation
If you don't have openpyxl installed, you can install it using pip:

pip install openpyxl
Loading a Workbook: The Core Function
Let's start with the simplest example.
Assume you have an Excel file named sample.xlsx with the following content:
| A | B | C | |
|---|---|---|---|
| 1 | Product | Quantity | Price |
| 2 | Apple | 50 | 20 |
| 3 | Orange | 75 | 50 |
| 4 | Banana | 120 | 80 |
Example 1: Basic Loading
from openpyxl import load_workbook
# Specify the path to your Excel file
file_path = 'sample.xlsx'
# Load the workbook
# The 'data_only=True' argument is useful for reading calculated cell values
wb = load_workbook(filename=file_path, data_only=True)
print(f"Successfully loaded workbook: {wb}")
print(f"Workbook sheet names: {wb.sheetnames}")
# Don't forget to close the workbook when you're done
# This is good practice to free up resources.
wb.close()
Output:
Successfully loaded workbook: <openpyxl.workbook.workbook.Workbook object at 0x...>
Workbook sheet names: ['Sheet1']
Working with the Loaded Workbook
Once you've loaded the workbook, you'll typically want to access its sheets and then the cells within those sheets.

Example 2: Accessing Sheets and Reading Data
This example reads all the data from the first sheet.
from openpyxl import load_workbook
file_path = 'sample.xlsx'
wb = load_workbook(filename=file_path, data_only=True)
# --- Accessing Sheets ---
# Method 1: Get the sheet by its name (most common)
sheet = wb['Sheet1']
# Method 2: Get the first sheet (if you know it's the one you want)
# sheet = wb.active
# --- Reading Data ---
# Get the value of a single cell
product_a2 = sheet['A2']
print(f"Value of cell A2: {product_a2.value}")
# You can also use .cell() method, which is often more robust
# Note: .cell() uses 1-based indexing for rows and columns
quantity_a3 = sheet.cell(row=3, column=2)
print(f"Value of cell B3: {quantity_a3.value}")
# Reading all data from the sheet
print("\n--- Reading all data from the sheet ---")
# We'll iterate through rows. 'sheet.iter_rows()' is efficient.
for row in sheet.iter_rows(values_only=True):
# values_only=True gives you only the cell values, not the cell objects
print(row)
wb.close()
Output:
Value of cell A2: Apple
Value of cell B3: 75
--- Reading all data from the sheet ---
('Product', 'Quantity', 'Price')
('Apple', 50, 1.2)
('Orange', 75, 1.5)
('Banana', 120, 0.8)
Advanced Loading Options
Using read_only=True and data_only=True can make your script much faster and more memory-efficient, especially with large files.
Example 3: Using read_only=True
This mode is for reading only. You cannot modify or save the workbook in this mode. It streams the data, which is great for huge files that wouldn't fit in memory otherwise.
from openpyxl import load_workbook
file_path = 'sample.xlsx'
# Load the workbook in read-only mode
wb_read_only = load_workbook(filename=file_path, read_only=True)
sheet = wb_read_only.active
print("Reading from a read-only workbook:")
for row in sheet.iter_rows(values_only=True):
print(row)
# IMPORTANT: You MUST close a read-only workbook.
# It cannot be closed automatically with a 'with' statement.
wb_read_only.close()
Example 4: Using data_only=True
This mode loads the last calculated value of a cell. If a cell contains a formula (e.g., =SUM(A1:A10)), data_only=True will give you the result of that sum. If you omit it, you'll get the formula string itself (=SUM(A1:A10)).
from openpyxl import load_workbook
# Assume 'formula.xlsx' has a cell C5 with the formula =A2*B2
# where A2 is 10 and B2 is 5. The value in C5 should be 50.
file_path = 'formula.xlsx'
# Without data_only, you get the formula string
wb_formula = load_workbook(filename=file_path)
sheet = wb_formula.active
print(f"Cell C5 contains the formula: {sheet['C5'].value}")
wb_formula.close()
# With data_only, you get the calculated value
wb_value = load_workbook(filename=file_path, data_only=True)
sheet = wb_value.active
print(f"Cell C5 contains the calculated value: {sheet['C5'].value}")
wb_value.close()
Output:
Cell C5 contains the formula: =A2*B2
Cell C5 contains the calculated value: 50
Best Practice: The with Statement
Using a with statement is the recommended way to handle files in Python. It automatically ensures that your workbook is closed, even if errors occur during processing.
This works for standard loading but not for read_only=True mode.
from openpyxl import load_workbook
file_path = 'sample.xlsx'
try:
with load_workbook(filename=file_path, data_only=True) as wb:
sheet = wb.active
print(f"Sheet title: {sheet.title}")
# Do your work here...
# For example, print the value of cell C4
print(f"Value of C4: {sheet['C4'].value}")
# The workbook will be automatically closed when the 'with' block exits
print("Workbook processing complete.")
except FileNotFoundError:
print(f"Error: The file '{file_path}' was not found.")
except Exception as e:
print(f"An error occurred: {e}")
Summary of Key Parameters
| Parameter | Description | Example |
|---|---|---|
filename |
(Required) The path to the .xlsx file to load. |
load_workbook('data.xlsx') |
read_only |
If True, opens the file in read-only mode. Faster and uses less memory. Cannot be saved. |
load_workbook('data.xlsx', read_only=True) |
data_only |
If True, loads the calculated value of cells. If False (default), loads the formula string. |
load_workbook('data.xlsx', data_only=True) |
keep_vba |
If True, any VBA (macros) present in the workbook will be preserved. |
load_workbook('macros.xlsx', keep_vba=True) |
keep_links |
If True, external links will be preserved. |
load_workbook('linked.xlsx', keep_links=True) |
