Python groupby level如何按层级分组聚合？-杰瑞科技汇

Of course! The term "groupby level" in Python almost exclusively refers to grouping by the levels of a MultiIndex (also known as a hierarchical index) using the powerful Pandas library.

（图片来源网络，侵删）

Let's break it down from the basics to advanced usage.

What is a MultiIndex?

A MultiIndex is a way of having multiple index levels on a single axis (either rows or columns). It's like having a spreadsheet with grouped rows or columns.

Imagine you have sales data for different products across several cities and years. A MultiIndex can organize this beautifully:

Sales Data
├── Product: 'A'
│   ├── City: 'New York'
│   │   └── Year: 2025 -> 100
│   │   └── Year: 2025 -> 120
│   └── City: 'London'
│       └── Year: 2025 -> 80
│       └── Year: 2025 -> 90
└── Product: 'B'
    ├── City: 'New York'
    │   └── Year: 2025 -> 150
    │   └── Year: 2025 -> 170
    └── City: 'London'
        └── Year: 2025 -> 110
        └── Year: 2025 -> 125

In Pandas, this would look like:

（图片来源网络，侵删）

import pandas as pd
import numpy as np
# Create a MultiIndex
index = pd.MultiIndex.from_product([
    ['Product A', 'Product B'],
    ['New York', 'London'],
    [2025, 2025]
], names=['Product', 'City', 'Year'])
# Create a DataFrame with the MultiIndex
data = np.random.randint(50, 200, size=8)
df = pd.DataFrame(data, index=index, columns=['Sales'])
print(df)

Output:

                     Sales
Product   City    Year     
Product A New York 2025   152
                 2025   178
          London   2025   101
                 2025   190
Product B New York 2025   119
                 2025   164
          London   2025   188
                 2025   134

The `level` Parameter in `groupby()`

When you call .groupby() on a DataFrame or Series with a MultiIndex, you need to tell Pandas which level(s) of the index to group by. This is what the level parameter is for.

Key 1: Grouping by a Single Level

You can group by a specific level using its name or its integer position (0 for the first level, 1 for the second, etc.).

Example 1: Group by Level Name (`'Product'`)

Let's group by the 'Product' level to get the total sales for each product across all cities and years.

（图片来源网络，侵删）

# Group by the 'Product' level and sum the sales
sales_by_product = df.groupby(level='Product').sum()
print(sales_by_product)

Output:

          Sales
Product         
Product A   621
Product B   605

Example 2: Group by Level Position (`level=1`)

Now, let's group by the second level, which is 'City' (index position 1).

# Group by the second level (index=1) and sum the sales
sales_by_city = df.groupby(level=1).sum()
print(sales_by_city)

Output:

             Sales
City              
London         513
New York       713

Key 2: Grouping by Multiple Levels

You can also group by more than one level at a time. To do this, pass a list of level names or positions to the level parameter.

Example: Group by 'Product' and 'City'

Let's get the total sales for each product in each city.

# Group by 'Product' and 'City' and sum the sales
sales_by_product_city = df.groupby(level=['Product', 'City']).sum()
print(sales_by_product_city)

Output:

                      Sales
Product   City             
Product A London         291
          New York       330
Product B London         322
          New York       283

Notice that the result also has a MultiIndex, reflecting the levels we grouped by.

Key 3: Grouping by All Levels

If you want to group by every level in the index, you can pass the special argument level=df.index.names. This effectively groups each unique combination of index values together, which is equivalent to not grouping at all in terms of the groups formed, but it's useful if you want to apply an aggregation function to each "cell" defined by the MultiIndex.

# Group by all levels in the index
# This will create a group for each unique row
grouped_by_all = df.groupby(level=df.index.names).sum()
print(grouped_by_all)

Output:

                      Sales
Product   City    Year     
Product A London   2025   101
                 2025   190
          New York 2025   152
                 2025   178
Product B London   2025   188
                 2025   134
          New York 2025   119
                 2025   164

This is functionally the same as the original df because sum() on a single number returns the number itself. The real power comes when you use this in combination with .agg().

Key 4: Grouping by Level with `.agg()` for Multiple Aggregations

This is where groupby(level) becomes incredibly powerful. You can perform different aggregation functions on different columns.

Let's add a 'Profit' column to our DataFrame to demonstrate.

df['Profit'] = np.random.randint(10, 50, size=8)
print(df)

Output:

                     Sales  Profit
Product   City    Year             
Product A New York 2025   152      23
                 2025   178      18
          London   2025   101      42
                 2025   190      14
Product B New York 2025   119      49
                 2025   164      11
          London   2025   188      46
                 2025   134      32

Now, let's group by 'City' and calculate the total 'Sales' and the average 'Profit' for each city.

# Group by 'City' and aggregate Sales with sum and Profit with mean
city_summary = df.groupby(level='City').agg({
    'Sales': 'sum',
    'Profit': 'mean'
})
print(city_summary)

Output:

             Sales  Profit
City                      
London         513  33.500
New York       713  25.250

Key 5: Grouping by Level on Columns

Everything we've done so far applies to column levels as well. You just need to use the axis=1 parameter.

Let's create a DataFrame with a MultiIndex for columns.

# Create a DataFrame with a MultiIndex for columns
columns = pd.MultiIndex.from_product([
    ['Q1', 'Q2'],
    ['Revenue', 'Expenses']
], names=['Quarter', 'Metric'])
data_col = {
    'Product A': [100, 50, 120, 60],
    'Product B': [150, 70, 170, 80]
}
df_col = pd.DataFrame(data_col, index=columns)
print(df_col)

Output:

Metric      Expenses  Revenue
Quarter                        
Q1            Product A     50      100
                      Product B     70      150
Q2            Product A     60      120
                      Product B     80      170

Now, let's group by the 'Metric' level of the columns and calculate the total for each metric across all products and quarters.

# Group by the 'Metric' level of the columns (axis=1)
# and sum the values
total_by_metric = df_col.groupby(axis=1, level='Metric').sum()
print(total_by_metric)

Output:

Metric   Expenses  Revenue
Product A      110      220
Product B      150      320

Summary

Task	Code Example
Group by single level (name)	`df.groupby(level='Product').sum()`
Group by single level (position)	`df.groupby(level=1).sum()`
Group by multiple levels	`df.groupby(level=['Product', 'City']).sum()`
Group by all levels	`df.groupby(level=df.index.names).sum()`
Aggregate multiple columns by level	`df.groupby(level='City').agg({'Sales':'

Python groupby level如何按层级分组聚合？

What is a MultiIndex?

The `level` Parameter in `groupby()`

Key 1: Grouping by a Single Level

Example 1: Group by Level Name (`'Product'`)

Example 2: Group by Level Position (`level=1`)

Key 2: Grouping by Multiple Levels

Example: Group by 'Product' and 'City'

Key 3: Grouping by All Levels

Key 4: Grouping by Level with `.agg()` for Multiple Aggregations

Key 5: Grouping by Level on Columns

Summary

99ANYc3cd6

Lex与Yacc如何快速入门？

Spark中如何用Python高效解析与处理JSON数据？

Python列表如何统一转大写？

Java FTP Socket如何实现文件传输？

请问Java Web开发中，Spring框架到底有什么用？

Java Socket编程实例具体怎么实现？

String的getBytes()方法字符集转换如何用？

Thinking in Java英文版，如何高效学习？

delicious retouch教程是什么？

Cool Edit 2.1视频教程，如何快速上手？

Java String的hashCode为何特殊？

Visual FoxPro程序设计教程该怎么学？

Adobe InDesign安装教程在哪找？

Ubuntu 14.04如何安装Java？

reactor python epoll

Photoshop教程从哪开始学？新手必看技巧有哪些？

Python groupby level如何按层级分组聚合？

What is a MultiIndex?

The level Parameter in groupby()

Key 1: Grouping by a Single Level

Example 1: Group by Level Name ('Product')

Example 2: Group by Level Position (level=1)

Key 2: Grouping by Multiple Levels

Example: Group by 'Product' and 'City'

Key 3: Grouping by All Levels

Key 4: Grouping by Level with .agg() for Multiple Aggregations

Key 5: Grouping by Level on Columns

Summary

相关推荐

Java Socket编程实例具体怎么实现？

The `level` Parameter in `groupby()`

Example 1: Group by Level Name (`'Product'`)

Example 2: Group by Level Position (`level=1`)

Key 4: Grouping by Level with `.agg()` for Multiple Aggregations