Of course! Working with audio streams in Python is a common task, especially for applications like voice assistants, real-time audio effects, or data analysis. One of the most widely used libraries for this is PyAudio.

I'll guide you through the process, starting with installation and then moving to practical examples.
The Key Library: PyAudio
PyAudio is a Python binding for PortAudio, a cross-platform I/O library for real-time audio. It allows you to play and record audio with simple Python calls.
Installation
First, you need to install PyAudio. The installation can sometimes be tricky due to its C dependencies.
Easiest Method (Pip):

pip install pyaudio
If pip fails (common on Windows or Linux): You might need to install the underlying PortAudio library first.
- On macOS (using Homebrew):
brew install portaudio
pip install pyaudio
- On Debian/Ubuntu:
sudo apt-get install portaudio19-dev python3-pyaudio
- On Windows: If pip install pyaudio fails, try downloading a pre-compiled .whl file from Christoph Gohlke's Unofficial Windows Binaries for Python Extension Packages. Make sure to download the version that matches your Python version (e.g., cp39 for Python 3.9) and architecture (32-bit or 64-bit).
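Once one of the installation routes above has finished, a quick check like the following can confirm that the package imports and that it found PortAudio. This is a minimal sketch; the helper name `pyaudio_status` is made up for illustration and is not part of PyAudio.

```python
import importlib.util


def pyaudio_status() -> str:
    """Return a one-line description of the local PyAudio installation."""
    if importlib.util.find_spec("pyaudio") is None:
        return "PyAudio is not installed"
    try:
        import pyaudio
    except ImportError as exc:
        # The package is present but its native extension failed to load.
        return f"PyAudio is installed but failed to import: {exc}"
    version = getattr(pyaudio, "__version__", "unknown")
    return f"PyAudio {version}, {pyaudio.get_portaudio_version_text()}"


print(pyaudio_status())
```

If the output reports an import failure, the PortAudio library is usually the missing piece.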
Core Concepts: The Audio Stream
A stream is the central object in PyAudio (an instance of its Stream class). It's like a pipe for digital audio data.
- Opening a Stream: You create a stream by calling open() on a pyaudio.PyAudio instance.
- Reading/Writing: You can read data from the stream (for recording) or write data to it (for playback).
- Closing a Stream: When you're done, you must close the stream to release the audio device.
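Reads and writes move raw bytes, but stream sizes are counted in frames, where one frame holds one sample for every channel. The sketch below shows the arithmetic for converting between the two; the function names are hypothetical helpers, not PyAudio calls.

```python
def chunk_size_in_bytes(frames: int, channels: int, sample_width: int) -> int:
    """Bytes occupied by `frames` frames of interleaved PCM audio."""
    return frames * channels * sample_width


def chunk_duration_ms(frames: int, rate: int) -> float:
    """How long `frames` frames last at `rate` frames per second."""
    return 1000.0 * frames / rate


# 1024 frames of 16-bit (2-byte) stereo audio at 44.1 kHz
print(chunk_size_in_bytes(1024, channels=2, sample_width=2))  # 4096
print(round(chunk_duration_ms(1024, rate=44100), 1))          # 23.2
```

A smaller chunk lowers latency but means more read/write calls per second.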
Example 1: Playing a WAV File
This is the most common starting point. We'll read a WAV file and play it through your default speaker.
import sys
import wave

import pyaudio

# --- Configuration ---
CHUNK = 1024  # Number of frames read from the file per write
# Sample format, channel count, and rate are taken from the WAV file itself.

if len(sys.argv) < 2:
    print("Plays a wave file.\n\nUsage: python playwav.py filename.wav")
    sys.exit(1)

filename = sys.argv[1]

# --- Open the WAV file ---
wf = wave.open(filename, 'rb')

# --- Initialize PyAudio ---
p = pyaudio.PyAudio()

# --- Open a stream ---
# 'output=True' means this stream is for playing audio.
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=wf.getframerate(),
                output=True)

# --- Read data in chunks and play ---
print(f"Playing {filename}...")
data = wf.readframes(CHUNK)
while len(data) > 0:
    stream.write(data)
    data = wf.readframes(CHUNK)

# --- Cleanup ---
print("Playback finished.")
stream.stop_stream()
stream.close()
p.terminate()
wf.close()
To run this:
- Save the code as playwav.py.
- Find a .wav file on your computer.
- Run it from your terminal:
python playwav.py your_audio_file.wav
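If no WAV file is at hand, one can be generated with the standard library alone. The following is a minimal sketch that writes a mono 16-bit sine tone; the function name `write_test_tone` and its default values are assumptions chosen for illustration.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=1.0, freq=440.0, rate=44100, amplitude=0.3):
    """Write a mono 16-bit sine tone to `path` and return the frame count."""
    n_frames = int(seconds * rate)
    peak = int(amplitude * 32767)  # scale to the signed 16-bit range
    samples = (
        int(peak * math.sin(2 * math.pi * freq * n / rate))
        for n in range(n_frames)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        # "<h" packs each sample as a little-endian signed 16-bit integer
        wf.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return n_frames
```

Calling `write_test_tone("tone.wav")` produces a one-second 440 Hz tone that can then be passed to playwav.py.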
Example 2: Recording Audio from a Microphone
This example opens a stream and reads audio data from your microphone, saving it to a new WAV file.
import wave

import pyaudio

# --- Configuration ---
CHUNK = 1024              # Frames per buffer
FORMAT = pyaudio.paInt16  # 16-bit integers
CHANNELS = 1              # Mono recording
RATE = 44100              # Samples per second
RECORD_SECONDS = 5        # Duration of recording
OUTPUT_FILENAME = "output.wav"

# --- Initialize PyAudio ---
p = pyaudio.PyAudio()
# Look up the sample width now, before the PyAudio instance is terminated
sample_width = p.get_sample_size(FORMAT)

# --- Open a stream ---
# 'input=True' means this stream is for recording audio.
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("Recording...")
frames = []

# --- Read data from the microphone in chunks ---
# int() truncates, so the recording can be slightly shorter than RECORD_SECONDS
for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("Recording finished.")

# --- Cleanup ---
stream.stop_stream()
stream.close()
p.terminate()

# --- Save the recorded data to a WAV file ---
with wave.open(OUTPUT_FILENAME, 'wb') as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))

print(f"Saved recording to {OUTPUT_FILENAME}")
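To confirm what was actually written, the file's header can be read back with the wave module. This is a small sketch; `describe_wav` is a hypothetical helper name. With the settings above, the loop runs int(44100 / 1024 * 5) = 215 times, so the file should hold 215 * 1024 = 220160 frames, or roughly 4.99 seconds.

```python
import wave


def describe_wav(path):
    """Return the basic parameters of a WAV file as a dictionary."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "rate_hz": rate,
            "frames": frames,
            "duration_s": frames / rate,
        }
```

For example, `describe_wav("output.wav")` returns the channel count, sample width, rate, frame count, and duration of the recording.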
Example 3: Real-Time Audio Effects (Echo)
This is where streaming becomes powerful. We'll read audio from the microphone, apply a simple echo effect, and play it back through the speakers in real-time.
How it works:
- We open one full-duplex stream that handles both input (microphone) and output (speakers).
- We use a circular buffer (a simple list that wraps around) to store the last delay_in_frames samples of audio.
- For each chunk of audio we read, we mix every sample with the sample from delay_in_frames ago and play the result. We also store the current samples in the buffer for future use.
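The delay line described above can be tried without any audio hardware. The sketch below isolates the mixing step as a pure function (`apply_echo` is a hypothetical name) and feeds it a single impulse, assuming 16-bit samples and a 3-frame delay for readability.

```python
def apply_echo(samples, delay_buffer, index, gain=0.6):
    """Mix each sample with the one stored len(delay_buffer) frames earlier.

    Returns (output_samples, new_index); delay_buffer is updated in place.
    """
    out = []
    size = len(delay_buffer)
    for sample in samples:
        mixed = int(sample + delay_buffer[index] * gain)
        # Clamp to the signed 16-bit range so loud input cannot overflow
        out.append(max(-32768, min(32767, mixed)))
        delay_buffer[index] = sample      # remember the dry input
        index = (index + 1) % size        # wrap around
    return out, index


buffer = [0] * 3
out, index = apply_echo([1000, 0, 0, 0, 0], buffer, 0, gain=0.5)
print(out)  # [1000, 0, 0, 500, 0]
```

The impulse reappears three frames later at half its original level. Because only the dry input is stored, each sound echoes once rather than repeating.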
import pyaudio

# --- Configuration ---
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = 1024
INT16_MIN, INT16_MAX = -32768, 32767

# Echo parameters
ECHO_DELAY_SECONDS = 0.5
ECHO_GAIN = 0.6  # How loud the echo is (0.0 to 1.0)

# Calculate the delay in frames
delay_in_frames = int(RATE * ECHO_DELAY_SECONDS)

# --- Initialize PyAudio ---
p = pyaudio.PyAudio()

# Create a circular buffer for the delay line
delay_buffer = [0] * delay_in_frames
buffer_index = 0

# --- Open the full-duplex stream ---
# 'input=True' and 'output=True' together give one stream for both directions.
# Alternatively, you can open two separate streams.
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                output=True,
                frames_per_buffer=CHUNK,
                input_device_index=None,   # Use default mic
                output_device_index=None)  # Use default speaker

print("Applying echo effect. Press Ctrl+C to stop.")

try:
    while True:
        # Read a chunk of data from the microphone. The per-sample loop below
        # is slow, so don't raise if the input buffer overflows meanwhile.
        input_data = stream.read(CHUNK, exception_on_overflow=False)

        # Convert the byte data to a list of numbers for processing
        # We assume 16-bit data, so 2 bytes per sample
        input_samples = [int.from_bytes(input_data[i:i+2], 'little', signed=True)
                         for i in range(0, len(input_data), 2)]

        output_samples = []
        for sample in input_samples:
            # Get the delayed sample from the buffer
            delayed_sample = delay_buffer[buffer_index]
            # Calculate the output sample (original + delayed/echo)
            output_sample = int(sample + delayed_sample * ECHO_GAIN)
            # Clamp to the 16-bit range so to_bytes() cannot overflow
            output_sample = max(INT16_MIN, min(INT16_MAX, output_sample))
            output_samples.append(output_sample)
            # Update the delay buffer with the current input sample
            delay_buffer[buffer_index] = sample
            # Move to the next position in the buffer (wrapping around)
            buffer_index = (buffer_index + 1) % delay_in_frames

        # Convert the processed samples back to byte data and play it
        output_bytes = b''.join(s.to_bytes(2, 'little', signed=True)
                                for s in output_samples)
        stream.write(output_bytes)
except KeyboardInterrupt:
    print("Stopping...")
finally:
    # --- Cleanup ---
    stream.stop_stream()
    stream.close()
    p.terminate()
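Converting every sample with int.from_bytes and to_bytes is easy to follow but slow for real-time use. One possible alternative, sketched here with the standard library's array module, converts a whole chunk at once. The helper names are hypothetical, and the sketch assumes that the "h" type code is 2 bytes wide (true on common platforms) and that the audio bytes are little-endian.

```python
import sys
from array import array


def bytes_to_samples(data: bytes) -> array:
    """Decode little-endian 16-bit PCM bytes into an array of ints."""
    samples = array("h")       # "h" = signed short
    samples.frombytes(data)    # interprets bytes in native byte order
    if sys.byteorder == "big":
        samples.byteswap()     # data is little-endian, so swap on big-endian hosts
    return samples


def samples_to_bytes(samples) -> bytes:
    """Encode an iterable of ints (already within int16 range) as PCM bytes."""
    out = array("h", samples)  # raises OverflowError if a value is out of range
    if sys.byteorder == "big":
        out.byteswap()
    return out.tobytes()
```

The result of `bytes_to_samples` can be iterated like a list, so it could stand in for the list comprehension in the loop above, with `samples_to_bytes` replacing the final join.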
Summary and Advanced Topics
| Task | Key PyAudio Functions | Concept |
|---|---|---|
| Playback | p.open(output=...), stream.write(data) | Writing raw audio data (bytes) to an output stream. |
| Recording | p.open(input=...), stream.read(CHUNK) | Reading raw audio data (bytes) from an input stream. |
| Full-Duplex | p.open(input=..., output=...) | Simultaneously reading from and writing to a stream. |
| Format | FORMAT, CHANNELS, RATE | These must match between your source (e.g., WAV file) and the stream. |
When to look for alternatives:
- pydub: A higher-level library that relies on ffmpeg for format conversion and can use PyAudio for playback. It's well suited to simple audio manipulation tasks like cutting, concatenating, converting formats, and applying effects without needing to work with raw byte data.
- sounddevice: A modern alternative to PyAudio that is often easier to install and has a cleaner API. It's built on top of PortAudio as well, so the core concepts are the same.
- librosa: The go-to library for audio analysis and music information retrieval. It's not designed for real-time streaming but is excellent for analyzing audio files (e.g., for pitch detection, beat tracking, spectrograms).
