杰瑞科技汇

Python最大似然估计如何实现与应用?

Of course! Let's break down Maximum Likelihood Estimation (MLE) in Python. We'll cover the core concept, a manual implementation from scratch to understand the mechanics, and then show how to use popular libraries like SciPy and Statsmodels for practical applications.

Python最大似然估计如何实现与应用?-图1
(图片来源网络,侵删)

What is Maximum Likelihood Estimation (MLE)?

At its heart, MLE is a method for estimating the parameters of a statistical model.

The core idea is to find the parameter values that make the observed data most probable.

  • Likelihood (L): A function of the parameters, given the data. It tells us how "likely" it is that we would observe our specific data if the model's parameters were a certain value. It's not a probability distribution for the parameters themselves.
  • Goal: Find the parameters (e.g., mean and standard deviation for a normal distribution) that maximize the likelihood function.

Analogy: Imagine you have a coin, but you don't know if it's fair. You flip it 10 times and get 7 heads.

  • Parameter: The probability of heads, p.
  • Data: [H, H, T, H, H, H, T, H, T, T] (7 heads, 3 tails).
  • Likelihood: How likely is it to see this data if p=0.5? How likely is it if p=0.7? How about p=0.9?
  • MLE: The value of p that makes the observed 7 heads most likely is our best estimate. In this case, it would be p = 0.7.

MLE from Scratch: Estimating the Mean and Std. Dev. of a Normal Distribution

Let's implement MLE for a common problem: estimating the mean () and standard deviation () of a normal distribution from a set of data points.

Python最大似然估计如何实现与应用?-图2
(图片来源网络,侵删)

The Math

The probability density function (PDF) for a normal distribution is: $f(x | \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

For a set of n independent data points, the joint probability (the likelihood) is the product of individual probabilities: $L(\mu, \sigma | x_1, ..., xn) = \prod{i=1}^{n} f(x_i | \mu, \sigma^2)$

To make the math easier (products are messy), we work with the log-likelihood, which turns the product into a sum: $\log L(\mu, \sigma | x_1, ..., xn) = \sum{i=1}^{n} \log f(x_i | \mu, \sigma^2)$

Our goal is to find the values of and that maximize this log-likelihood function.

Python最大似然估计如何实现与应用?-图3
(图片来源网络,侵删)

Python Implementation

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
# 1. Generate some sample data
# Let's pretend we don't know the true parameters
true_mu = 5.0
true_sigma = 2.0
np.random.seed(42)
data = np.random.normal(true_mu, true_sigma, 1000)
print(f"Sample Data Mean: {np.mean(data):.4f}")
print(f"Sample Data Std:  {np.std(data, ddof=0):.4f}") # ddof=0 is MLE for std
print("-" * 20)
# 2. Define the negative log-likelihood function
# We use NEGATIVE because SciPy's 'minimize' function finds the minimum,
# and minimizing the negative log-likelihood is the same as maximizing the log-likelihood.
def negative_log_likelihood(params, data):
    """
    Calculates the negative log-likelihood for a normal distribution.
    Args:
        params (tuple): A tuple containing (mu, sigma).
        data (np.array): The observed data.
    Returns:
        float: The negative log-likelihood value.
    """
    mu, sigma = params
    # Add a small epsilon to sigma to avoid division by zero or log(0)
    if sigma <= 0:
        return np.inf
    n = len(data)
    # The log-likelihood formula for a normal distribution
    log_likelihood = -n/2 * np.log(2 * np.pi * sigma**2) - \
                     (1 / (2 * sigma**2)) * np.sum((data - mu)**2)
    return -log_likelihood # Return the NEGATIVE log-likelihood
# 3. Perform the optimization
# Initial guess for the parameters (mu, sigma)
initial_guess = [0.0, 1.0]
# Use SciPy's minimize function
# The 'method' can be 'Nelder-Mead', 'L-BFGS-B', etc.
result = minimize(
    fun=negative_log_likelihood,
    x0=initial_guess,
    args=(data,),
    method='L-BFGS-B' # A good general-purpose method
)
# 4. Extract the results
if result.success:
    mle_mu, mle_sigma = result.x
    print(f"Optimization Successful!")
    print(f"MLE Estimated Mu: {mle_mu:.4f}")
    print(f"MLE Estimated Sigma: {mle_sigma:.4f}")
else:
    print("Optimization failed.")
    print(result.message)

Output:

Sample Data Mean: 4.9979
Sample Data Std:  1.9989
--------------------
Optimization Successful!
MLE Estimated Mu: 4.9979
MLE Estimated Sigma: 1.9989

As you can see, the MLE estimates (9979, 9989) are extremely close to the true parameters we used to generate the data (0, 0). This demonstrates that the method works correctly.


Practical MLE with SciPy and Statsmodels

While implementing from scratch is great for learning, you'll almost always use pre-built functions in practice.

A. Using scipy.stats

The scipy.stats library has many pre-defined distributions with a .fit() method that uses MLE to find the parameters.

import numpy as np
from scipy import stats
# Generate the same sample data
true_mu = 5.0
true_sigma = 2.0
np.random.seed(42)
data = np.random.normal(true_mu, true_sigma, 1000)
# Fit a normal distribution to the data using MLE
# The fit method returns (shape, loc, scale)
# For a normal distribution, loc=mu, scale=sigma
fitted_mu, fitted_sigma = stats.norm.fit(data)
print("--- SciPy.stats Implementation ---")
print(f"Fitted Mu (loc): {fitted_mu:.4f}")
print(f"Fitted Sigma (scale): {fitted_sigma:.4f}")

Output:

--- SciPy.stats Implementation ---
Fitted Mu (loc): 4.9979
Fitted Sigma (scale): 1.9989

This is much simpler and gives the same result.

B. Using statsmodels

statsmodels is a powerful library for statistical modeling. It's excellent for more complex models like Generalized Linear Models (GLMs).

Let's fit a Linear Regression model, which is fundamentally an MLE problem. We assume the errors are normally distributed, and MLE finds the coefficients that minimize the sum of squared errors (which is equivalent to maximizing the likelihood in this case).

import numpy as np
import statsmodels.api as sm
# 1. Generate some data for a linear regression
# y = 2*x + 5 + noise
np.random.seed(42)
X = np.linspace(0, 10, 100)
noise = np.random.normal(0, 3, 100)
y = 2 * X + 5 + noise
# statsmodels needs a constant (intercept) term
X_with_const = sm.add_constant(X)
# 2. Fit the Ordinary Least Squares (OLS) model
# OLS is a special case of MLE where errors are assumed to be i.i.d. normal.
model = sm.OLS(y, X_with_const)
results = model.fit()
# 3. Print the results
print("\n--- Statsmodels OLS (MLE) Implementation ---")
print(results.summary())
# The estimated parameters are in results.params
print(f"\nEstimated Intercept (const): {results.params[0]:.4f}")
print(f"Estimated Slope (x1):       {results.params[1]:.4f}")

Output (abbreviated):


--- Statsmodels OLS (MLE) Implementation ---
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.915
Model:                            OLS   Adj. R-squared:                  0.914
...
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          5.283
分享:
扫描分享到社交APP
上一篇
下一篇