Python如何用xlrd读取Linux下的Excel文件？-杰瑞科技汇

Of course! Here's a comprehensive guide to using the xlrd library in Python on a Linux system.

（图片来源网络，侵删）

Overview

xlrd is a Python library for reading data and formatting information from Excel files (.xls and .xlsx). It's a fast and efficient library, but it's important to note its main limitation:

xlrd can only READ .xls files (Excel 97-2003) and .xlsx files (Excel 2007+) in "read-only" mode. It cannot create or modify Excel files.

For writing to Excel files, you would typically use a library like openpyxl or xlsxwriter.

Installation

First, you need to have Python and pip (Python's package installer) installed on your Linux system. If you don't, you can usually install them with your distribution's package manager (e.g., sudo apt-get install python3 python3-pip for Debian/Ubuntu).

（图片来源网络，侵删）

Then, install xlrd using pip:

# It's best practice to use pip3 for Python 3
pip3 install xlrd

If you need to install it for a specific virtual environment, activate the environment first:

source /path/to/your/venv/bin/activate
pip install xlrd

Basic Usage: Reading an Excel File

Let's assume you have an Excel file named data.xlsx with the following content in the first sheet:

A	B	C
1	Name	Age	City
2	Alice	30	New York
3	Bob	25	London
4	Charlie	35	Paris

Here is the Python code to read this file.

import xlrd
# Define the path to your Excel file
file_path = 'data.xlsx'
try:
    # Open the workbook
    # The 'on_demand=True' parameter can be used to load only specific sheets,
    # which is memory-efficient for large files.
    workbook = xlrd.open_workbook(file_path, on_demand=False)
    # Get the first sheet (by index or name)
    # sheet = workbook.sheet_by_index(0)
    sheet = workbook.sheet_by_name('Sheet1') # Make sure the sheet name is correct
    print(f"Sheet Name: {sheet.name}")
    print(f"Number of Rows: {sheet.nrows}")
    print(f"Number of Columns: {sheet.ncols}")
    print("-" * 20)
    # --- Method 1: Iterate through all rows and cells ---
    print("Iterating through all rows:")
    for row_idx in range(sheet.nrows):
        # Each row is a list of cell values
        row = sheet.row_values(row_idx)
        print(f"Row {row_idx}: {row}")
    print("-" * 20)
    # --- Method 2: Access specific cells by coordinates (row, col) ---
    # Note: Row and column indices are 0-based.
    print("Accessing specific cells:")
    # Get the value from cell A2 (row 1, col 0)
    name = sheet.cell_value(1, 0)
    print(f"Name from A2: {name}")
    # Get the value from cell B3 (row 2, col 1)
    age = sheet.cell_value(2, 1)
    print(f"Age from B3: {age}")
    # --- Method 3: Get a full row or column as a list ---
    print("\nGetting full rows and columns:")
    header_row = sheet.row_values(0)
    print(f"Header Row: {header_row}")
    # Get all values from the 'City' column (column index 2)
    city_column = sheet.col_values(2)
    print(f"City Column: {city_column}")
except xlrd.XLRDError as e:
    print(f"Error opening or reading the Excel file: {e}")
except FileNotFoundError:
    print(f"Error: The file '{file_path}' was not found.")

Explanation of Key Methods:

xlrd.open_workbook(file_path): Opens the Excel file.
workbook.sheet_by_index(0): Gets a sheet by its position (0 is the first sheet).
workbook.sheet_by_name('Sheet1'): Gets a sheet by its name.
sheet.nrows: The total number of rows in the sheet.
sheet.ncols: The total number of columns in the sheet.
sheet.row_values(row_idx): Returns all cell values in a specific row as a list.
sheet.col_values(col_idx): Returns all cell values in a specific column as a list.
sheet.cell_value(row_idx, col_idx): Returns the value of a single cell.

Handling Cell Types

Excel stores different types of data (text, numbers, dates, etc.). xlrd provides a way to check the type of a cell.

xlrd.XL_CELL_TEXT (0)
xlrd.XL_CELL_NUMBER (1)
xlrd.XL_CELL_DATE (3)
xlrd.XL_CELL_BOOLEAN (4)
xlrd.XL_CELL_BLANK (5)

Important: Dates are tricky. xlrd stores them as a floating-point number (the number of days since 1900-01-01). You need to convert it to a Python datetime object.

Here's an example of handling different cell types:

import xlrd
from datetime import datetime, date
file_path = 'data_with_types.xlsx' # Assume this file has mixed data types
workbook = xlrd.open_workbook(file_path)
sheet = workbook.sheet_by_index(0)
# Iterate through rows, skipping the header
for row_idx in range(1, sheet.nrows):
    for col_idx in range(sheet.ncols):
        cell_value = sheet.cell_value(row_idx, col_idx)
        cell_type = sheet.cell_type(row_idx, col_idx)
        if cell_type == xlrd.XL_CELL_TEXT:
            print(f"Cell ({row_idx},{col_idx}): Text -> '{cell_value}'")
        elif cell_type == xlrd.XL_CELL_NUMBER:
            print(f"Cell ({row_idx},{col_idx}): Number -> {cell_value}")
        elif cell_type == xlrd.XL_CELL_DATE:
            # Convert Excel date number to a Python datetime object
            # The `xlrd.xldate_as_datetime` function is very helpful
            date_obj = xlrd.xldate_as_datetime(cell_value, workbook.datemode)
            print(f"Cell ({row_idx},{col_idx}): Date -> {date_obj} (Formatted: {date_obj.strftime('%Y-%m-%d')})")
        elif cell_type == xlrd.XL_CELL_BOOLEAN:
            print(f"Cell ({row_idx},{col_idx}): Boolean -> {bool(cell_value)}")
        elif cell_type == xlrd.XL_CELL_BLANK:
            print(f"Cell ({row_idx},{col_idx}): Blank")
        print("-")

Common Pitfalls and Best Practices

xlrd Cannot Write to .xlsx Files: This is the most common mistake. If you try to use xlrd to create or modify an .xlsx file, it will fail silently or raise an error. Use openpyxl for writing.
Date Handling: As shown above, dates are not handled out-of-the-box. You must use xlrd.xldate_as_datetime() to convert them correctly.
File Not Found: Always wrap your file operations in a try...except FileNotFoundError block to handle cases where the Excel file doesn't exist.
Sheet Name is Case-Sensitive: sheet_by_name('Sheet1') is different from sheet_by_name('sheet1'). Be precise with the name.
Large Files: For very large Excel files, loading the entire workbook into memory can be slow. Use the on_demand=True parameter when opening the workbook. This allows you to load sheets one by one as needed, saving memory.

# Memory-efficient way for large files
workbook = xlrd.open_workbook('huge_file.xlsx', on_demand=True)
# You can't get all sheet names immediately with on_demand=True
# You have to access them by index first
sheet = workbook.sheet_by_index(0)
print(f"Loaded sheet: {sheet.name}")
# ... do work on the sheet ...
# When you're done with a sheet, you can release its memory
workbook.unload_sheet(0)

Alternatives to `xlrd`

While xlrd is great for reading, the modern Python ecosystem offers other powerful libraries.

Library	Can Read `.xls`?	Can Read `.xlsx`?	Can Write?	Notes
`xlrd`	✅	✅	❌	Fast and lightweight for reading. The classic choice.
`openpyxl`	❌	✅	✅	The modern standard. Slower than `xlrd` for reading but excellent for both reading and writing `.xlsx` files.
`pandas`	✅	✅	✅	The best choice for data analysis. Uses `xlrd` or `openpyxl` under the hood. `df = pd.read_excel('file.xlsx')` is all you need.

Example using pandas:

import pandas as pd
# This is the easiest way to read an Excel file into a data structure
# pandas will automatically use xlrd or openpyxl as needed.
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
print("DataFrame from Excel:")
print(df)
# Accessing data is very intuitive
print("\nNames column:")
print(df['Name'])

Summary

Use xlrd when you only need to read data from Excel files and performance is a critical concern.
Use openpyxl when you need to both read and write .xlsx files.
Use pandas when your goal is data analysis, manipulation, or scientific computing. It's the most convenient and powerful tool for this purpose.

Python如何用xlrd读取Linux下的Excel文件？

Overview

Installation

Basic Usage: Reading an Excel File

Explanation of Key Methods:

Handling Cell Types

Common Pitfalls and Best Practices

Alternatives to `xlrd`

Summary

99ANYc3cd6

iapp post教程

Python split如何同时处理多个分隔符？

PHP Python该学哪个版本？

Python如何设置编码为GBK？

Scratch中文教程怎么学？

Java Socket编程实例具体怎么实现？

Java日期怎么转时间戳？

Java中not equals的正确写法是什么？

Python OpenCV如何高效读取视频帧？

Java action 方法如何定义与调用？

PS3自制系统怎么刷？教程步骤详解？

java swing 可视化

Java getsize方法如何使用？

如何在RapidMiner中使用Python？

Perl与Python正则表达式怎么选？

ExtJS4.2教程怎么学？从哪开始学？

Python如何用xlrd读取Linux下的Excel文件？

Overview

Installation

Basic Usage: Reading an Excel File

Explanation of Key Methods:

Handling Cell Types

Common Pitfalls and Best Practices

Alternatives to xlrd

Summary

相关推荐

Java Socket编程实例具体怎么实现？

Alternatives to `xlrd`