杰瑞科技汇

How to Use Python's numpy.random Correctly

This guide covers the numpy.random module, NumPy's toolkit for random number generation in scientific computing.

Introduction to numpy.random

The numpy.random module provides a wide range of functions for generating random numbers. These functions are essential for:

  • Simulations: Modeling real-world phenomena (e.g., stock prices, weather, particle physics).
  • Statistical Analysis: Creating random samples for hypothesis testing, bootstrapping, and Monte Carlo methods.
  • Machine Learning: Initializing weights, augmenting data, and splitting datasets.
  • Testing: Generating random test data.

Key Concept: The Random State

A crucial feature of numpy.random is its use of a pseudo-random number generator (PRNG). This means the numbers aren't truly random but are generated by a deterministic algorithm from an initial value called a seed.

If you use the same seed, you will get the exact same sequence of "random" numbers every time. This is vital for reproducibility in scientific research and testing.

Setting the Seed for Reproducibility

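Set the seed at the start of any script whose results must be reproducible. Note that np.random.seed() sets the state of NumPy's global (legacy) generator, so it affects every later call to the np.random.* functions.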

import numpy as np
# Set the seed to a specific number (e.g., 42)
np.random.seed(42)
# Generate some random numbers
print("First call:", np.random.rand(3)) 
# Output: First call: [0.37454012 0.95071431 0.73199394]
# Reset the seed to the same number
np.random.seed(42)
# Generate the same numbers again
print("Second call (with same seed):", np.random.rand(3))
# Output: Second call (with same seed): [0.37454012 0.95071431 0.73199394]

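Object-Based Approach (Legacy): RandomState Object

For more complex applications, create a RandomState object instead of seeding the global state. Each object carries its own state, so a program can hold several generators that do not interfere with one another or with the global numpy.random state. RandomState is NumPy's legacy interface; new code should prefer the Generator API described later in this guide.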

import numpy as np
# Create a RandomState object
rs1 = np.random.RandomState(42)
rs2 = np.random.RandomState(123)
# Use the objects to generate numbers
print("From rs1:", rs1.rand(3))
# Output: From rs1: [0.37454012 0.95071431 0.73199394]
print("From rs2:", rs2.rand(3))
# Output: From rs2: [0.69646919 0.28613933 0.22685145]
# Re-initialize with the same seed
rs1_again = np.random.RandomState(42)
print("From rs1 again:", rs1_again.rand(3))
# Output: From rs1 again: [0.37454012 0.95071431 0.73199394]

Commonly Used Random Number Distributions

numpy.random offers functions for many probability distributions. Here are the most common ones.

A. Uniform Distribution: Numbers between a range

np.random.rand(d0, d1, ..., dn) Generates random numbers from a uniform distribution over [0, 1). The arguments are the dimensions of the output array.

# Generate a single random float between 0 and 1
print(np.random.rand()) 
# Generate a 1D array of 5 random numbers
print(np.random.rand(5))
# Generate a 2x3 matrix of random numbers
print(np.random.rand(2, 3))

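np.random.uniform(low=0.0, high=1.0, size=None) A more flexible version of rand. You can specify the low (inclusive) and high (exclusive) bounds, and the output shape is passed through size rather than as positional dimensions.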

# Generate 5 random numbers between 10 and 20
print(np.random.uniform(low=10, high=20, size=5))
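
The low and high arguments also accept arrays, which are broadcast against size. As a small sketch, the example below draws 2D points whose columns have different ranges; the latitude and longitude bounds are illustrative assumptions, not something from the text above.

```python
import numpy as np

np.random.seed(0)

# Illustrative bounds: column 0 is a latitude, column 1 a longitude
low = np.array([-90.0, -180.0])
high = np.array([90.0, 180.0])

# low/high are broadcast across the 5 rows of the (5, 2) output
points = np.random.uniform(low=low, high=high, size=(5, 2))

print(points.shape)  # (5, 2)
print(points)
```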

B. Normal (Gaussian) Distribution: The "Bell Curve"

np.random.randn(d0, d1, ..., dn) Generates random numbers from a "standard" normal distribution (mean=0, standard deviation=1).

# Generate a 1D array of 5 numbers from a standard normal distribution
print(np.random.randn(5))
# Generate a 3x2 matrix
print(np.random.randn(3, 2))

np.random.normal(loc=0.0, scale=1.0, size=None) A more flexible version of randn. loc is the mean and scale is the standard deviation (not the variance).

# Generate 5 numbers from a normal distribution with mean=100 and std=15
print(np.random.normal(loc=100, scale=15, size=5))
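
One way to sanity-check loc and scale is to compare sample statistics of a large draw with the requested parameters. The sketch below also builds the same distribution by shifting and scaling standard normal draws; the sample size of 100,000 is an arbitrary choice.

```python
import numpy as np

np.random.seed(0)
n = 100_000

# Direct draw with mean 100 and standard deviation 15
samples = np.random.normal(loc=100, scale=15, size=n)

# Equivalent construction: shift and scale standard normal draws
scaled = 100 + 15 * np.random.randn(n)

# Both sample means should be close to 100, both sample stds close to 15
print("normal():", samples.mean(), samples.std())
print("randn() :", scaled.mean(), scaled.std())
```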

C. Integers

np.random.randint(low, high, size=None) Generates random integers from low (inclusive) to high (exclusive).

# Generate a single random integer from 1 to 9 (1 is included, 10 is excluded)
print(np.random.randint(1, 10))
# Generate a 1D array of 5 random integers from 0 to 99
print(np.random.randint(0, 100, size=5))
# Generate a 2x3 matrix of random integers from 50 to 99
print(np.random.randint(50, 100, size=(2, 3)))
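
Because high is exclusive, a common off-by-one mistake is to pass the largest desired value as high. A short sketch, using a six-sided die as an illustrative case:

```python
import numpy as np

np.random.seed(0)

# To simulate a die with faces 1..6, high must be 7, not 6
rolls = np.random.randint(1, 7, size=10_000)

# Count how often each face appears
faces, counts = np.unique(rolls, return_counts=True)
for face, count in zip(faces, counts):
    print(f"face {face}: {count} rolls")
```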

D. Other Common Distributions

Function | Description | Example
np.random.poisson(lam=1.0, size=None) | Poisson distribution. lam is the rate (lambda). | np.random.poisson(lam=5, size=10)
np.random.binomial(n, p, size=None) | Binomial distribution. n is the number of trials, p is the success probability. | np.random.binomial(n=10, p=0.5, size=20)
np.random.exponential(scale=1.0, size=None) | Exponential distribution. scale is 1/lambda. | np.random.exponential(scale=2.0, size=5)
np.random.beta(a, b, size=None) | Beta distribution. a and b are shape parameters. | np.random.beta(a=2, b=5, size=5)
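
To see how these parameters map onto each distribution, the sample mean of a large draw can be compared with the textbook mean (lam for Poisson, n*p for binomial, scale for exponential, a/(a+b) for beta). This is a rough sketch with an arbitrary sample size:

```python
import numpy as np

np.random.seed(0)
n_samples = 200_000

# Each entry: (sample, theoretical mean)
draws = {
    "poisson": (np.random.poisson(lam=5, size=n_samples), 5.0),
    "binomial": (np.random.binomial(n=10, p=0.5, size=n_samples), 10 * 0.5),
    "exponential": (np.random.exponential(scale=2.0, size=n_samples), 2.0),
    "beta": (np.random.beta(a=2, b=5, size=n_samples), 2 / (2 + 5)),
}

for name, (sample, expected) in draws.items():
    print(f"{name}: sample mean {sample.mean():.3f}, theoretical mean {expected:.3f}")
```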

Permutations and Shuffling

These functions are useful for randomly ordering data.

np.random.shuffle(x) Shuffles a sequence in-place. The original array is modified.

arr = np.arange(10)
print("Original array:", arr)
np.random.shuffle(arr)
print("Shuffled array:", arr)

np.random.permutation(x) Returns a new shuffled array and leaves the original array unchanged. If x is an integer, it first creates an arange(x) and then shuffles it.

arr = np.arange(10)
print("Original array:", arr)
shuffled_arr = np.random.permutation(arr)
print("New shuffled array:", shuffled_arr)
print("Original array (unchanged):", arr)
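
A frequent use of permutation is shuffling two arrays in the same order, for example features and their labels. The sketch below assumes small made-up arrays; the idea is to permute an index array once and apply it to both.

```python
import numpy as np

np.random.seed(0)

# Made-up data: row i of features is [2i, 2i+1], and labels[i] == features[i, 0]
features = np.arange(10).reshape(5, 2)
labels = np.arange(5) * 2

# Shuffle the row indices once, then index both arrays with them
order = np.random.permutation(len(features))
shuffled_features = features[order]
shuffled_labels = labels[order]

# Rows and labels still match after shuffling
print(np.array_equal(shuffled_features[:, 0], shuffled_labels))  # True
```

Calling np.random.shuffle on each array separately would not work here, because the two calls would produce different orderings.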

Random Choice and Sampling

np.random.choice(a, size=None, replace=True, p=None) Chooses random elements from a given 1D array.

  • a: The array or integer range to choose from.
  • size: The number of samples to draw.
  • replace: If True, a sample can be drawn multiple times. If False, sampling is without replacement.
  • p: The probabilities associated with each entry in a. Must sum to 1. If omitted, all entries are equally likely.

# Choose 5 numbers from 0 to 9, with replacement (default)
print(np.random.choice(10, size=5))
# Choose 3 unique numbers from 0 to 9, without replacement
print(np.random.choice(10, size=3, replace=False))
# Choose 3 letters from 'abcdef' with specified probabilities
letters = ['a', 'b', 'c', 'd', 'e', 'f']
probabilities = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5] # 'f' is much more likely
print(np.random.choice(letters, size=3, p=probabilities))
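
Since p must sum to 1, raw weights are usually normalized before being passed in. The following sketch uses illustrative weights and checks the result empirically on a large sample:

```python
import numpy as np

np.random.seed(0)

letters = ['a', 'b', 'c', 'd', 'e', 'f']
# Illustrative raw weights: 'f' is five times as likely as any other letter
weights = np.array([1, 1, 1, 1, 1, 5], dtype=float)
p = weights / weights.sum()  # normalize so the probabilities sum to 1

draws = np.random.choice(letters, size=50_000, p=p)

# The observed share of 'f' should be close to p[-1], i.e. 0.5
freq_f = np.mean(draws == 'f')
print("p:", p)
print("observed frequency of 'f':", freq_f)
```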

The Modern API: numpy.random.Generator (Recommended since NumPy 1.17)

Since version 1.17, NumPy recommends an API centered on the Generator object for new code. It separates the source of random bits (a BitGenerator, PCG64 by default) from the methods that turn those bits into distributions, whereas the legacy functions are tied to the Mersenne Twister (MT19937).

You create a Generator instance using np.random.default_rng().

import numpy as np
# Create a Generator instance
# You can pass a seed for reproducibility
rng = np.random.default_rng(seed=42)
# The Generator object offers the same distributions as methods, but some
# names differ: rand -> random, randn -> standard_normal, randint -> integers
# Uniform distribution
print("Uniform (0-1):", rng.random(size=3))
# Normal distribution
print("Normal (mean=0, std=1):", rng.standard_normal(size=3))
# Integers (high is exclusive by default, so values range from 1 to 9)
print("Integers (1-9):", rng.integers(low=1, high=10, size=5))
# Permutation
arr = np.arange(10)
print("Permutation:", rng.permutation(arr))
# Choice
print("Choice:", rng.choice(['a', 'b', 'c', 'd'], size=10, replace=True, p=[0.1, 0.2, 0.3, 0.4]))
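
Two Generator features are worth a brief sketch: the endpoint flag of integers, which makes high inclusive, and SeedSequence.spawn, which derives independent child seeds from one root seed. The seed value and the number of streams below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

# endpoint=True makes high inclusive, so this simulates a die with faces 1..6
rolls = rng.integers(low=1, high=6, size=1000, endpoint=True)
print("min/max roll:", rolls.min(), rolls.max())

# Derive three independent streams from a single root seed
root = np.random.SeedSequence(42)
child_seeds = root.spawn(3)
streams = [np.random.default_rng(s) for s in child_seeds]

# Each stream produces its own sequence
first_draws = [stream.random() for stream in streams]
print("first draw of each stream:", first_draws)
```

Spawning child seeds is generally preferable to picking ad hoc seeds such as 42, 43, 44 when several workers each need their own generator.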

Summary: Which to Use?

If you want... | Use... | Example
Reproducibility | np.random.seed() or a RandomState object (legacy), or a seeded Generator | np.random.seed(123)
Quick & simple scripts | Global functions (np.random.rand, np.random.randn) | np.random.randint(0, 100, 10)
Best Practice (Modern Code) | The Generator API (np.random.default_rng()) | rng = np.random.default_rng(seed=42); rng.random(5)
Independent random streams | A RandomState object (legacy) or separate Generator instances | rs = np.random.RandomState(123)
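
In practice, the Generator approach often means passing an rng object into functions instead of relying on global state. Below is a minimal sketch of that pattern applied to bootstrapping; bootstrap_means is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def bootstrap_means(data, n_resamples, rng):
    """Return the mean of each bootstrap resample of data (hypothetical helper)."""
    data = np.asarray(data)
    # Each row holds the indices of one resample, drawn with replacement
    idx = rng.integers(0, len(data), size=(n_resamples, len(data)))
    return data[idx].mean(axis=1)

data = np.array([2.0, 4.0, 6.0, 8.0])

# Two generators built from the same seed give identical resamples
run_a = bootstrap_means(data, 1000, np.random.default_rng(42))
run_b = bootstrap_means(data, 1000, np.random.default_rng(42))

print(np.array_equal(run_a, run_b))  # True
print("mean of bootstrap means:", run_a.mean())
```

The caller decides the seed, so the function stays reproducible in tests without touching any global state.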