杰瑞科技汇

How to Efficiently Count Element Occurrences in a Python ndarray

Counting elements in a NumPy ndarray is a very common task. The right method depends on what you want to count.

Here’s a comprehensive guide covering the most frequent use cases, from simple to advanced.

Counting Elements that Meet a Condition

This is the most common scenario. You want to know how many elements in your array satisfy a specific condition (e.g., are greater than a certain value).

The primary tool for this is numpy.count_nonzero().

Why count_nonzero?

When you use a boolean condition (e.g., arr > 5), NumPy evaluates it to a new array of True and False values. In NumPy, True is treated as 1 and False is treated as 0. Therefore, count_nonzero simply counts the number of 1s, which is exactly the number of True values.
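
To make the True-as-1 behaviour concrete, here is a minimal sketch that builds a boolean mask and counts it two ways. The array `values` and the threshold are illustrative assumptions, not data from this article.

```python
import numpy as np

# Hypothetical sample data, used only for illustration
values = np.array([3, 7, 5, 9, 1, 6])

# The comparison yields a boolean array with the same shape as the input
mask = values > 5
print(mask.dtype, mask.shape)

# Both expressions count the True entries in the mask
count_a = np.count_nonzero(mask)
count_b = int(mask.sum())
print(count_a, count_b)  # 3 3
```

Summing the mask works for the same reason, though count_nonzero states the intent more directly.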

Example: Counting numbers greater than a threshold

import numpy as np
# Create a sample 2D array
data = np.array([
    [1, 8, 3, 9],
    [7, 2, 5, 4],
    [9, 0, 6, 1]
])
# Count how many numbers are greater than 5
count = np.count_nonzero(data > 5)
print(f"Array:\n{data}\n")
print(f"Numbers greater than 5: {count}")
# Expected output: 5 (the numbers are 8, 9, 7, 9, 6)

Example: Counting elements equal to a specific value

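You can combine conditions with & (and) or | (or). Remember to use & and | instead of Python's and and or when working with NumPy arrays, and wrap each comparison in parentheses, because & and | bind more tightly than comparison operators.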

import numpy as np
arr = np.array([1, 2, 3, 2, 4, 2, 5])
# Count how many times the number 2 appears
count_of_twos = np.count_nonzero(arr == 2)
print(f"Number of 2s: {count_of_twos}") # Output: 3
# Count how many times 1 or 5 appears
count_of_ones_or_fives = np.count_nonzero((arr == 1) | (arr == 5))
print(f"Number of 1s or 5s: {count_of_ones_or_fives}") # Output: 2
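
The example above only uses the "or" form. As a complementary sketch, the following counts values inside a range with &, and uses ~ to negate a mask. The range bounds are arbitrary choices for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 2, 4, 2, 5])

# Each comparison is parenthesised so that & combines the two masks
in_range = (arr >= 2) & (arr <= 4)
count_in_range = np.count_nonzero(in_range)
print(f"Values between 2 and 4 inclusive: {count_in_range}")  # 5

# ~ flips every entry of a boolean mask
count_outside = np.count_nonzero(~in_range)
print(f"Values outside that range: {count_outside}")  # 2
```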

Counting Occurrences of Each Unique Value

If you want to get a frequency count of every unique element in the array, you should use numpy.unique() with the return_counts=True argument.

Example: Getting a full frequency distribution

import numpy as np
grades = np.array(['A', 'B', 'A', 'C', 'B', 'A', 'F', 'A', 'C'])
# Get unique values and their counts
unique_values, counts = np.unique(grades, return_counts=True)
# The result is two arrays: one with the unique items and one with their counts
print(f"Unique values: {unique_values}")
print(f"Counts:        {counts}")
# Zip them into a dictionary for easy reading.
# .tolist() converts NumPy scalars to plain Python str/int so the dict prints cleanly
grade_distribution = dict(zip(unique_values.tolist(), counts.tolist()))
print(f"\nGrade distribution: {grade_distribution}")
# Expected output: {'A': 4, 'B': 2, 'C': 2, 'F': 1}
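
Once the counts are available, follow-up questions such as "which value is most frequent?" need only a couple more lines. The sketch below is one possible approach using np.argmax; the variable names are illustrative.

```python
import numpy as np

grades = np.array(['A', 'B', 'A', 'C', 'B', 'A', 'F', 'A', 'C'])
unique_values, counts = np.unique(grades, return_counts=True)

# The index of the largest count identifies the most frequent value
top = np.argmax(counts)
most_common = str(unique_values[top])
most_common_count = int(counts[top])
print(f"Most common grade: {most_common} ({most_common_count} times)")

# Relative frequency of each unique value
proportions = counts / counts.sum()
print(dict(zip(unique_values.tolist(), proportions.round(3).tolist())))
```

Note that np.argmax returns the first maximum, so ties resolve to the value that sorts first in unique_values.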

Counting Non-Zero Elements

Sometimes you just want to know how many elements in the array are not zero. This is a special case of counting a condition (arr != 0), and np.count_nonzero handles it directly when you pass the array itself.

import numpy as np
arr = np.array([0, 1, 2, 0, 3, 0, 4, 5, 0])
# Count non-zero elements
non_zero_count = np.count_nonzero(arr)
print(f"Array: {arr}")
print(f"Number of non-zero elements: {non_zero_count}") # Output: 5 (the values 1, 2, 3, 4, 5)
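
The complementary question, how many elements are zero, follows from the same call. This is a small sketch of two equivalent ways to get it, reusing the array from the example above.

```python
import numpy as np

arr = np.array([0, 1, 2, 0, 3, 0, 4, 5, 0])

non_zero_count = np.count_nonzero(arr)

# Option 1: subtract from the total number of elements
zero_count = arr.size - non_zero_count

# Option 2: count the True entries of an explicit condition
zero_count_alt = np.count_nonzero(arr == 0)

print(non_zero_count, zero_count, zero_count_alt)  # 5 4 4
```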

Counting Along an Axis (Rows or Columns)

When you have a multi-dimensional array, you might want to count elements along a specific axis.

  • axis=0: Count down the columns (for each column, count elements that meet the condition).
  • axis=1: Count across the rows (for each row, count elements that meet the condition).
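
As a quick sketch of how the axis argument changes the shape of the result, consider a small hypothetical 2x3 array (the name `m` and its contents are assumptions for illustration):

```python
import numpy as np

# Hypothetical 2x3 array: 2 rows, 3 columns
m = np.array([
    [0, 4, 7],
    [5, 0, 0],
])

# axis=0 collapses the rows, leaving one count per column
per_column = np.count_nonzero(m, axis=0)

# axis=1 collapses the columns, leaving one count per row
per_row = np.count_nonzero(m, axis=1)

# No axis: a single count over the whole array
total = np.count_nonzero(m)

print(per_column.shape, per_row.shape)  # (3,) (2,)
print(per_column, per_row, total)
```

The axis you name is the one that disappears from the result, which is why axis=0 yields one value per column.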

Example: Counting values in each row

import numpy as np
data = np.array([
    [1, 8, 3, 9],  # Row 0
    [7, 2, 5, 4],  # Row 1
    [9, 0, 6, 1]   # Row 2
])
# For each row, count how many numbers are greater than 5
# This will return an array with 3 elements (one for each row)
counts_per_row = np.count_nonzero(data > 5, axis=1)
print(f"Array:\n{data}\n")
print(f"Counts per row (values > 5): {counts_per_row}")
# Expected output: [2 1 2] (Row 0 has 2, Row 1 has 1, Row 2 has 2)

Example: Counting values in each column

import numpy as np
# Same 3x4 array as in the row example
data = np.array([[1, 8, 3, 9], [7, 2, 5, 4], [9, 0, 6, 1]])
# For each column, count how many numbers are greater than 5
# This will return an array with 4 elements (one for each column)
counts_per_col = np.count_nonzero(data > 5, axis=0)
print(f"Array:\n{data}\n")
print(f"Counts per column (values > 5): {counts_per_col}")
# Expected output: [2 1 1 1] (Col 0 has 2: the values 7 and 9; Cols 1, 2 and 3 have 1 each)

Summary Table: Which Method to Use?

Goal | Recommended Method | Example
Count elements that meet a condition (e.g., > 5, == 'A') | np.count_nonzero(condition) | np.count_nonzero(arr > 5)
Count non-zero elements | np.count_nonzero(arr) | np.count_nonzero(arr)
Get a frequency count of all unique values in the array | np.unique(arr, return_counts=True) | vals, counts = np.unique(arr, return_counts=True)
Count elements along a specific axis (rows or columns) | np.count_nonzero(condition, axis=0 or axis=1) | np.count_nonzero(arr > 5, axis=1)