Of course! Here is a comprehensive guide to implementing a Gibbs sampler in Python, including the theory, a step-by-step implementation, and a practical example.

What is a Gibbs Sampler?
The Gibbs sampler is a Markov Chain Monte Carlo (MCMC) algorithm used to generate a sequence of samples from a multivariate probability distribution when direct sampling is difficult. It's particularly useful for joint posterior distributions in Bayesian statistics.
The core idea is to break down a complex, high-dimensional problem into a series of simpler, one-dimensional problems.
How it works:
- Start with an initial guess for all variables.
- For each variable in turn, sample a new value from its conditional distribution, holding all other variables fixed at their current values.
- Repeat this process for many iterations. After an initial "burn-in" period, the sequence of samples will approximate the joint distribution you're interested in.
Key Requirement: You must know the analytical form of the conditional distribution for each variable. If you don't, you can't use a Gibbs sampler directly (though you could use a Metropolis-Hastings step within it, creating a "Metropolis-within-Gibbs" sampler).
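To make that "Metropolis-within-Gibbs" idea concrete, here is a minimal sketch of one sweep, assuming you can sample x exactly from p(x | y) but can only evaluate p(y | x) up to a constant. The helper names sample_x_given_y and log_cond_density_y are hypothetical placeholders for whatever your model provides, not library functions:

```python
import numpy as np

def gibbs_sweep(x, y, sample_x_given_y, log_cond_density_y, step=0.5, rng=None):
    """One sweep over (x, y): an exact Gibbs update for x, a Metropolis update for y.

    sample_x_given_y(y, rng): draws x directly from p(x | y) (closed form known).
    log_cond_density_y(y, x): log p(y | x) up to an additive constant.
    """
    rng = np.random.default_rng() if rng is None else rng

    # 1. Exact conditional update for x.
    x = sample_x_given_y(y, rng)

    # 2. Random-walk Metropolis step for y, targeting p(y | x).
    y_prop = y + step * rng.standard_normal()
    log_accept = log_cond_density_y(y_prop, x) - log_cond_density_y(y, x)
    if np.log(rng.uniform()) < log_accept:
        y = y_prop

    return x, y
```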

A Simple Example: The Bivariate Normal Distribution
Let's implement a Gibbs sampler for a bivariate normal distribution. This is a classic example because the conditional distributions are also normal and easy to derive and sample from.
The Model:
We want to sample from a joint distribution p(x, y). The conditional distributions are:
- p(x | y) is a Normal distribution.
- p(y | x) is a Normal distribution.
Assumptions:
Let's assume the joint distribution p(x, y) is a standard bivariate normal with:
- Mean: μ = [0, 0]
- Covariance matrix: Σ = [[1.0, 0.8], [0.8, 1.0]]
From the properties of the multivariate normal distribution, we can derive the conditionals:

p(x | y) ~ Normal(μ_x + ρ * σ_x/σ_y * (y - μ_y), σ_x² * (1 - ρ²))
p(y | x) ~ Normal(μ_y + ρ * σ_y/σ_x * (x - μ_x), σ_y² * (1 - ρ²))
For our standard case (μ_x=μ_y=0, σ_x=σ_y=1), this simplifies to:
x | y ~ Normal(ρ * y, 1 - ρ²)
y | x ~ Normal(ρ * x, 1 - ρ²)
where ρ (rho) is the correlation coefficient, which is 0.8 in our case. Plugging that in, x | y ~ Normal(0.8 * y, 0.36), and likewise for y | x.
Python Implementation
Here is a complete, well-commented Python implementation of the Gibbs sampler for our bivariate normal example.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def gibbs_sampler(num_samples, rho, initial_x, initial_y):
    """
    Implements a Gibbs sampler for a bivariate normal distribution.

    Args:
        num_samples (int): The total number of samples to generate.
        rho (float): The correlation coefficient between x and y.
        initial_x (float): The initial value for x.
        initial_y (float): The initial value for y.

    Returns:
        tuple: A tuple containing two numpy arrays: (samples_x, samples_y).
    """
    # Pre-calculate the variance for the conditional distributions
    cond_var = 1 - rho**2

    # Initialize arrays to store the samples
    samples_x = np.zeros(num_samples)
    samples_y = np.zeros(num_samples)

    # Set initial values
    current_x = initial_x
    current_y = initial_y

    # --- Gibbs Sampling Loop ---
    for i in range(num_samples):
        # 1. Sample new x from p(x | y)
        #    The conditional mean is rho * current_y, the variance is cond_var
        mean_x = rho * current_y
        current_x = np.random.normal(mean_x, np.sqrt(cond_var))

        # 2. Sample new y from p(y | x), using the newly updated current_x
        #    The conditional mean is rho * current_x, the variance is cond_var
        mean_y = rho * current_x
        current_y = np.random.normal(mean_y, np.sqrt(cond_var))

        # 3. Store the sampled values
        samples_x[i] = current_x
        samples_y[i] = current_y

    return samples_x, samples_y
# --- Main Execution ---
# 1. Define parameters
num_samples = 10000
rho = 0.8 # Correlation coefficient
initial_x = 0.0
initial_y = 0.0
# 2. Run the Gibbs sampler
samples_x, samples_y = gibbs_sampler(num_samples, rho, initial_x, initial_y)
# 3. Analyze and visualize the results
# a) Plot the trace (the value of the chain over time)
# This helps diagnose convergence issues.
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].plot(samples_x, alpha=0.7)
axes[0].set_title('Trace of X')
axes[0].set_xlabel('Iteration')
axes[0].set_ylabel('Value of X')
axes[1].plot(samples_y, alpha=0.7, color='orange')
axes[1].set_title('Trace of Y')
axes[1].set_xlabel('Iteration')
axes[1].set_ylabel('Value of Y')
plt.tight_layout()
plt.show()
# b) Plot the autocorrelation
# This shows how correlated each sample is with the previous ones.
# Lower autocorrelation is generally better.
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].acorr(samples_x - np.mean(samples_x), maxlags=50)
axes[0].set_title('Autocorrelation of X')
axes[0].set_xlabel('Lag')
axes[0].set_ylabel('Autocorrelation')
axes[1].acorr(samples_y - np.mean(samples_y), maxlags=50, color='orange')
axes[1].set_title('Autocorrelation of Y')
axes[1].set_xlabel('Lag')
axes[1].set_ylabel('Autocorrelation')
plt.tight_layout()
plt.show()
# c) Plot the joint distribution of the samples
# This should look like the bivariate normal we were trying to sample from.
# Note: sns.jointplot creates its own figure, so we don't call plt.figure() here.
sns.jointplot(x=samples_x, y=samples_y, kind='kde', fill=True, height=8)
plt.suptitle('Joint Distribution of Samples (KDE Plot)', y=1.02)
plt.show()
# d) Plot the raw samples to see the chain's path
plt.figure(figsize=(8, 8))
plt.plot(samples_x, samples_y, alpha=0.5, lw=0.5)
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Path of the Gibbs Sampler in the X-Y Plane')
plt.grid(True)
plt.show()
# e) Print summary statistics
print("\n--- Summary Statistics ---")
print(f"Mean of X: {np.mean(samples_x):.4f} (Expected: 0.0)")
print(f"Std Dev of X: {np.std(samples_x):.4f} (Expected: 1.0)")
print(f"Mean of Y: {np.mean(samples_y):.4f} (Expected: 0.0)")
print(f"Std Dev of Y: {np.std(samples_y):.4f} (Expected: 1.0)")
print(f"Correlation between X and Y: {np.corrcoef(samples_x, samples_y)[0, 1]:.4f} (Expected: 0.8)")
Explanation of the Code
- gibbs_sampler function:
  - It takes the number of samples, the correlation rho, and initial values for x and y.
  - It pre-calculates the conditional variance 1 - rho**2 for efficiency.
  - It initializes two NumPy arrays to store the generated samples.
  - The main for loop iterates num_samples times.
  - Inside the loop, it first samples a new x from p(x | y) using np.random.normal with the conditional mean (rho * current_y) and standard deviation np.sqrt(cond_var).
  - Then, it samples a new y from p(y | x) using the newly updated current_x.
  - The samples are stored in the arrays, and the function returns them once the loop finishes.
- Main Execution Block:
  - We define our parameters and call the gibbs_sampler function.
  - Trace Plot: Shows the value of x and y at each iteration. A "good" trace looks like a "fuzzy caterpillar," indicating that the chain has stabilized and is exploring the distribution. If it shows a strong upward or downward trend, it may not have converged.
  - Autocorrelation Plot: Shows how correlated each sample is with the previous ones. Lower autocorrelation is generally better, because the draws then behave more like independent samples.
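The example keeps every draw, including the first few, which still reflect the starting values. A common refinement, shown here as a sketch rather than part of the original code, is to discard a burn-in period and optionally thin the chain; it assumes the gibbs_sampler function defined earlier, and the specific burn-in and thinning values are arbitrary choices:

```python
# Illustrative post-processing: drop a burn-in period and thin the chain.
# Assumes gibbs_sampler (and numpy as np) from the script above.
burn_in = 1000   # discard the earliest draws, which depend on the start point
thin = 5         # keep every 5th draw to reduce autocorrelation

raw_x, raw_y = gibbs_sampler(num_samples=20000, rho=0.8,
                             initial_x=10.0, initial_y=-10.0)

kept_x = raw_x[burn_in::thin]
kept_y = raw_y[burn_in::thin]

print(f"Kept {kept_x.size} of {raw_x.size} draws")
print(f"Correlation after burn-in/thinning: {np.corrcoef(kept_x, kept_y)[0, 1]:.4f}")
```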
