杰瑞科技汇

A Detailed Guide to len() on a Python DataFrame

In pandas, calling len() on a DataFrame returns its length, which is the number of rows it contains.

Here’s a complete guide covering the primary ways to get the length of a DataFrame, along with explanations and examples.

The Short Answer: len(df)

The most common and Pythonic way to get the number of rows in a DataFrame is to use the built-in len() function.

import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 28],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
# Get the number of rows
number_of_rows = len(df)
print(f"The DataFrame has {number_of_rows} rows.")
print(f"Using len(df): {len(df)}")

Output:

The DataFrame has 4 rows.
Using len(df): 4

Detailed Explanation and Other Methods

While len(df) is the most common, it's good to know what's happening under the hood and what other options are available.

len(df) (Recommended)

This is the preferred method for its simplicity and readability. It calls the DataFrame's __len__() special method, which returns the length of the row index.

Pros:

  • Readable: len(df) clearly states its intent.
  • Fast: It only reads the length of the row index and does not scan the data.
  • Pythonic: It uses a standard Python built-in function.
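As a quick check of what len() reports, the sketch below compares the built-in call, the special method, and the length of the row index. The DataFrame named sample is hypothetical data made up for illustration, and this is a demonstration of behavior rather than a benchmark.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})

# The built-in len() dispatches to the object's __len__ special method
via_builtin = len(sample)
via_dunder = sample.__len__()

# Length of the row index, for comparison
via_index = len(sample.index)

print(via_builtin, via_dunder, via_index)
```

All three values should agree, which is why len(df) is usually enough when only the row count matters.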

df.shape

The .shape attribute of a DataFrame returns a tuple representing the dimensions of the DataFrame: (number_of_rows, number_of_columns).

To get just the number of rows, you need to access the first element of the tuple (at index 0).

# Get the dimensions (rows, columns)
dimensions = df.shape
print(f"df.shape returns: {dimensions}")
# Get the number of rows from the shape
rows_from_shape = df.shape[0]
print(f"Number of rows from df.shape[0]: {rows_from_shape}")

Output:

df.shape returns: (4, 3)
Number of rows from df.shape[0]: 4

Pros:

  • Useful when you need both dimensions: If you also need the number of columns, this is the most efficient way to get both at once.
  • Unambiguous: There is no confusion about what it returns.

Cons:

  • Slightly more verbose: df.shape[0] is a bit longer than len(df) if you only need the row count.
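When both dimensions are needed, the tuple from .shape can be unpacked in one step. A minimal sketch, assuming a small hypothetical DataFrame named sample:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Age": [25, 30],
    "City": ["New York", "Chicago"],
})

# .shape is a (rows, columns) tuple, so it unpacks into two names
n_rows, n_cols = sample.shape
print(f"{n_rows} rows x {n_cols} columns")
```

Unpacking avoids indexing the tuple twice and keeps the meaning of each value visible at the call site.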

len(df.index)

A DataFrame has two main indexable components: its index (for rows) and its columns. You can explicitly get the length of the row index.

# Get the length of the DataFrame's index
rows_from_index = len(df.index)
print(f"Number of rows from len(df.index): {rows_from_index}")

Output:

Number of rows from len(df.index): 4

Pros:

  • Very explicit: It clearly states that you are measuring the row axis.

Cons:

  • Redundant: It's functionally identical to len(df) but less concise.
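One case where being explicit about the index helps is after filtering, when the index labels are no longer 0..n-1. The sketch below, using hypothetical data, suggests that len(df.index) counts the labels that remain rather than reporting the largest label.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({"Age": [25, 30, 35, 40]})

# Filtering keeps the original labels of the surviving rows
older = sample[sample["Age"] > 30]

labels = older.index.tolist()   # the labels that remain after filtering
row_count = len(older.index)    # how many rows remain

print(labels, row_count)
```

Here the surviving labels are 2 and 3, yet the row count is 2, so the count should not be used as if it were the last valid label.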

Comparison Table

| Method | What it Returns | When to Use |
| --- | --- | --- |
| len(df) | int (number of rows) | Best for most cases. When you only need the row count. It's the most readable and Pythonic. |
| df.shape[0] | int (number of rows) | When you need both the number of rows and columns (df.shape), or if you prefer to be very explicit about the axis. |
| len(df.index) | int (number of rows) | When you want to be extremely explicit that you are measuring the length of the row index. Less common than len(df). |

Important Consideration: Empty DataFrames

What happens if your DataFrame is empty? All three methods handle this gracefully and report 0 rows.

# Create an empty DataFrame
empty_df = pd.DataFrame()
print(f"len(empty_df): {len(empty_df)}")
print(f"empty_df.shape: {empty_df.shape}")
print(f"len(empty_df.index): {len(empty_df.index)}")

Output:

len(empty_df): 0
empty_df.shape: (0, 0)
len(empty_df.index): 0
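A DataFrame can also have columns but no rows, for example after a filter that matches nothing. The sketch below uses a hypothetical frame named no_rows and the .empty attribute, which is not covered in the text above, to show how that case looks.

```python
import pandas as pd

# Hypothetical frame with two columns and no rows
no_rows = pd.DataFrame(columns=["Name", "Age"])

row_count = len(no_rows)    # no rows, so the length is 0
dimensions = no_rows.shape  # the columns still count toward the shape
is_empty = no_rows.empty    # True when any axis has length 0

print(row_count, dimensions, is_empty)
```

A check such as len(df) == 0 is a simple way to guard code that expects at least one row.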

What About the Number of Columns?

If you're looking for the number of columns, you can use df.shape[1] or len(df.columns).

# Get the number of columns
num_cols_shape = df.shape[1]
num_cols_cols = len(df.columns)
print(f"Number of columns (from shape): {num_cols_shape}")
print(f"Number of columns (from columns): {num_cols_cols}")

Output:

Number of columns (from shape): 3
Number of columns (from columns): 3
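The row and column counts can be cross-checked against each other. A minimal sketch, assuming hypothetical data and using the .size attribute (total number of cells), which the text above does not mention:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 28],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
})

n_rows = len(sample)           # rows
n_cols = len(sample.columns)   # columns
n_cells = sample.size          # rows multiplied by columns

print(n_rows, n_cols, n_cells)
```

Note that len(sample) counts rows only; it never reflects the number of columns.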