
How do you correctly compute accuracy_score in Python?

Of course! Let's dive deep into accuracy_score from Python's popular scikit-learn library.


What is accuracy_score?

accuracy_score is a function used to evaluate the performance of a classification model. It measures the fraction of predictions the model got right: the number of correct predictions divided by the total number of predictions.

In simple terms, it answers the question: "Out of all the predictions, what percentage did the model get right?"


The Formula

The formula for accuracy is straightforward:

$$ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} = \frac{TP + TN}{TP + TN + FP + FN} $$


Where:

  • TP (True Positives): The model correctly predicted the positive class.
  • TN (True Negatives): The model correctly predicted the negative class.
  • FP (False Positives): The model incorrectly predicted the positive class (Type I Error).
  • FN (False Negatives): The model incorrectly predicted the negative class (Type II Error).
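To see the formula in action, here is a minimal sketch that recovers the same number from a confusion matrix via sklearn.metrics.confusion_matrix (the labels are made up purely for illustration):

from sklearn.metrics import confusion_matrix
# Illustrative binary labels (invented for this sketch)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
# Apply the formula: (TP + TN) / (TP + TN + FP + FN)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn} -> Accuracy={accuracy:.2f}")
# Output: TP=3, TN=3, FP=1, FN=1 -> Accuracy=0.75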

How to Use It (Code Examples)

First, you need to have scikit-learn installed. If you don't, run this in your terminal:

pip install scikit-learn

Example 1: Basic Usage

This is the simplest example where you provide the true labels and the model's predicted labels.

from sklearn.metrics import accuracy_score
# The actual, correct labels
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
# The labels predicted by our model
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
# Calculate accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"True Labels:  {y_true}")
print(f"Predicted Labels: {y_pred}")
print(f"Accuracy: {accuracy:.2f}") # Format to 2 decimal places
# Output: Accuracy: 0.80

Explanation:

  • Total predictions = 10
  • Correct predictions = 8 (indices 0, 1, 2, 4, 5, 7, 8, 9)
  • Incorrect predictions = 2 (indices 3 and 6)
  • Accuracy = 8 / 10 = 0.80 or 80%.
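You can confirm this number by hand; the quick sketch below counts the matching positions directly (accuracy_score does exactly this for you):

# Manual check: count positions where the prediction matches the truth
correct = sum(t == p for t, p in zip(y_true, y_pred))
print(correct / len(y_true))
# Output: 0.8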

Example 2: With a Real Model (e.g., Logistic Regression)

This is a more realistic workflow where you train a model and then evaluate it.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# 1. Generate a sample dataset
X, y = make_classification(
    n_samples=1000,      # 1000 data points
    n_features=20,       # 20 features
    n_informative=10,    # 10 useful features
    n_redundant=5,       # 5 redundant features
    n_classes=2,         # 2 classes (0 and 1)
    random_state=42      # for reproducibility
)
# 2. Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# 3. Initialize and train a model
model = LogisticRegression(max_iter=1000)  # raise the iteration cap so the solver converges cleanly
model.fit(X_train, y_train)
# 4. Make predictions on the test set
y_pred = model.predict(X_test)
# 5. Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy of the Logistic Regression model: {accuracy:.4f}")
# Output might be: Accuracy of the Logistic Regression model: 0.8567
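As a side note, scikit-learn classifiers also expose a score(X, y) method that runs predict() internally and returns the same mean accuracy, so steps 4 and 5 can be collapsed into a single call:

# Equivalent shortcut: score() predicts on X_test and returns mean accuracy
print(f"Accuracy via model.score: {model.score(X_test, y_test):.4f}")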

Important Considerations and Limitations

While accuracy is easy to understand, it can be misleading in certain situations. Here’s when you should be cautious:

Imbalanced Datasets

This is the biggest drawback of accuracy. If one class is much more common than the other, a model can achieve high accuracy simply by always predicting the majority class.

Example: Imagine a medical test for a rare disease that affects only 1% of the population.

import numpy as np
from sklearn.metrics import accuracy_score
# True labels: 99 are healthy (0), 1 is sick (1)
y_true = np.array([0] * 99 + [1] * 1)
# A "dumb" model that always predicts "healthy" (0)
y_pred = np.array([0] * 100)
# Calculate accuracy
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy of the 'always predict healthy' model: {accuracy:.2f}")
# Output: Accuracy of the 'always predict healthy' model: 0.99

Conclusion: The model has 99% accuracy, but it's completely useless because it failed to identify the single sick person. In this case, you should use other metrics like Precision, Recall, F1-Score, or the AUC-ROC curve.
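To make that concrete, here is a short sketch (reusing y_true and y_pred from the example above) in which Recall immediately exposes the useless model; zero_division=0 simply tells scikit-learn to return 0 instead of warning when the model never predicts the positive class:

from sklearn.metrics import precision_score, recall_score, f1_score
# Recall on the positive class: how many sick people did we catch? None.
print(recall_score(y_true, y_pred))                      # Output: 0.0
# Precision and F1 are undefined here (no positive predictions),
# so zero_division=0 reports 0 instead of raising a warning
print(precision_score(y_true, y_pred, zero_division=0))  # Output: 0.0
print(f1_score(y_true, y_pred, zero_division=0))         # Output: 0.0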

Multi-Class Classification

accuracy_score works just as well for multi-class problems: it is still simply the number of correct predictions divided by the total number of predictions.

from sklearn.metrics import accuracy_score
y_true = ['cat', 'dog', 'bird', 'cat', 'dog']
y_pred = ['cat', 'dog', 'cat', 'cat', 'dog']
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.2f}")
# Output: Accuracy: 0.80 (4 out of 5 are correct)
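The same relationship with the confusion matrix holds in the multi-class case: the diagonal counts the correct predictions for each class, so (as a small sketch reusing the labels above) accuracy is the trace divided by the total:

import numpy as np
from sklearn.metrics import confusion_matrix
# Diagonal entries are the per-class correct predictions
cm = confusion_matrix(y_true, y_pred)
print(np.trace(cm) / cm.sum())
# Output: 0.8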

Parameters of accuracy_score

The function has a few useful parameters:

  • y_true: 1D array-like. The ground truth (correct) labels.

  • y_pred: 1D array-like. The predicted labels returned by the classifier.

  • normalize (Default: True):

    • If True (default), returns the fraction of correct predictions (a float between 0.0 and 1.0).
    • If False, returns the number of correct predictions (an integer).
      y_true = [1, 0, 1, 1, 0]
      y_pred = [1, 0, 1, 0, 0]
      print(accuracy_score(y_true, y_pred, normalize=True))   # Output: 0.8
      print(accuracy_score(y_true, y_pred, normalize=False))  # Output: 4

  • sample_weight: 1D array-like. Weights for each sample. This allows you to give more importance to certain predictions when calculating the score.
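
For example, here is a tiny sketch with made-up weights, where the single wrong prediction counts twice as much as each correct one:

y_true = [1, 0, 1]
y_pred = [1, 0, 0]  # the last prediction is wrong
# Unweighted: 2 correct out of 3
print(accuracy_score(y_true, y_pred))                          # Output: 0.6666...
# Weighted: (1 + 1 + 0*2) / (1 + 1 + 2) = 0.5
print(accuracy_score(y_true, y_pred, sample_weight=[1, 1, 2])) # Output: 0.5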


When to Use accuracy_score

  • Your classes are balanced: When each class has a similar number of samples.
  • You need a simple, intuitive metric: It's the easiest way to explain model performance to a non-technical audience.
  • All types of errors are equally important: When misclassifying a "cat" as a "dog" is just as bad as misclassifying a "dog" as a "cat".

When to Avoid accuracy_score

  • Your classes are imbalanced: As shown in the medical example, it can be very deceptive.
  • The cost of false positives and false negatives is different: For example, in spam detection, a false positive (marking a real email as spam) is usually much worse than a false negative (letting a spam email into the inbox). Here, Precision and Recall are more appropriate.