Of course. Let's break down the Python Memory Model in a comprehensive way, from the core concepts to practical implications.

What is the Memory Model?
In the context of programming, a memory model defines how a program's memory is organized and how different parts of the program (like threads) interact with that memory.
Specifically, for a language like Python, the memory model has two critical aspects:
- The Abstract Representation: How Python thinks about memory. This is what you, the developer, interact with most of the time.
- The Concrete Implementation: How the underlying interpreter (CPython) actually manages memory in your computer's RAM and CPU caches. This is crucial for understanding performance and, most importantly, threading.
We will cover both.
Part 1: The Abstract Memory Model (Developer's View)
From a high-level perspective, Python manages memory for you automatically. You don't need to manually allocate or deallocate memory like in C or C++. This is managed by a system of private namespaces, or scopes.

Names, Variables, and Objects
This is the most fundamental concept to understand. In Python, there is no such thing as a "variable" in the C sense (a named memory slot that holds a value).
Instead, think of it this way:
- Objects: These are the actual data in memory. Everything in Python is an object: integers, strings, lists, functions, classes, etc. An object has three things:
  - A type (e.g., int, str, list).
  - A value (e.g., 42, "hello", [1, 2, 3]).
  - A unique identity (in CPython, its address in memory, accessible via id(obj)).
- Names (or Identifiers): These are the labels you use in your code (e.g., x, my_list, calculate).
- Namespaces: A dictionary-like mapping that connects names to objects. You can think of it as a lookup table.
The core principle is: Names are just labels that point to objects in memory.
# Let's see this in action
x = 300
y = x
print(f"x is: {x}, id: {id(x)}")
print(f"y is: {y}, id: {id(y)}")
# x and y point to the SAME integer object in memory
print(f"Are x and y the same object? {x is y}") # 'is' checks for object identity
# Now, let's reassign x
x = 400
print(f"\nAfter reassigning x:")
print(f"x is: {x}, id: {id(x)}") # x now points to a NEW integer object
print(f"y is: {y}, id: {id(y)}") # y is still pointing to the original object
print(f"Are x and y the same object now? {x is y}")
Key Takeaway: When you write y = x, you are not creating a copy of the object. You are simply creating a new name, y, that points to the exact same object that x is already pointing to.
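A related CPython implementation detail: small integers (-5 through 256) are cached and reused, so identity checks on small ints can surprise you. A minimal sketch (this caching is specific to CPython, not a language guarantee; the ints are constructed at runtime to avoid compile-time constant folding, which could otherwise merge equal literals):

```python
# Constructing ints at runtime avoids compile-time constant folding,
# which can merge equal literals within the same code object.
small_a = int("100")
small_b = int("100")
print(small_a is small_b)  # True: CPython reuses ints from -5 to 256

big_a = int("300")
big_b = int("300")
print(big_a is big_b)  # False: distinct objects outside the cached range
```

This is why identity comparisons on numbers (or strings) should use ==, reserving is for cases like `x is None`.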
Mutable vs. Immutable Objects
This distinction is critical and is a direct consequence of the "names point to objects" model.
- Immutable Objects: Objects whose state cannot be changed after creation.
  - Examples: int, float, str, tuple, frozenset.
  - When you perform an operation on an immutable object, Python doesn't change it. Instead, it creates a new object with the new value and rebinds the name to this new object.
s = "hello"
print(f"Before: id(s) = {id(s)}")
# The line below does NOT change the original "hello" string.
# It creates a new string "world" and makes s point to it.
s = "world"
print(f"After: id(s) = {id(s)}")  # The ID has changed!
- Mutable Objects: Objects whose state can be changed after creation.
  - Examples: list, dict, set, custom class instances.
  - Operations on mutable objects modify the object in-place. The name continues to point to the same object, but the contents of that object have changed.
my_list = [1, 2, 3]
print(f"Before: id(my_list) = {id(my_list)}")
# The line below MODIFIES the list object in-place.
# It does not create a new list.
my_list.append(4)
print(f"After: id(my_list) = {id(my_list)}")  # The ID is the same!
print(my_list)  # Output: [1, 2, 3, 4]
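Because names share objects rather than copying them, mutating a mutable object through one name is visible through every other name bound to it. A minimal sketch:

```python
# Two names bound to the same list object.
a = [1, 2, 3]
b = a  # no copy is made; b is just another label for the same object

b.append(4)  # mutates the one shared list in place

# Both names observe the change, because there is only one object.
print(a)       # [1, 2, 3, 4]
print(a is b)  # True

# To get an independent list, copy explicitly.
c = a.copy()   # or list(a), or a[:]
c.append(5)
print(a)       # [1, 2, 3, 4] -- unchanged
print(c)       # [1, 2, 3, 4, 5]
```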
This is why a common pitfall for beginners is with mutable default arguments:
def append_to_list(item, my_list=[]): # The default list is created ONCE when the function is defined
my_list.append(item)
return my_list
print(append_to_list(1)) # Output: [1]
print(append_to_list(2)) # Output: [1, 2] <-- Oh no! The same list is being reused!
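The standard fix is to use None as a sentinel default and create a fresh list inside the function body, so each call that omits the argument gets its own list:

```python
def append_to_list(item, my_list=None):
    # A new list is created on every call that doesn't pass one in,
    # instead of one shared list created at function-definition time.
    if my_list is None:
        my_list = []
    my_list.append(item)
    return my_list

print(append_to_list(1))  # [1]
print(append_to_list(2))  # [2] -- each call gets its own fresh list
```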
Garbage Collection
Since Python manages memory for you, it must also have a way to clean up objects that are no longer in use. This process is called Garbage Collection (GC).
Python primarily uses a reference counting mechanism, supplemented by a generational garbage collector.
-
Reference Counting: Every object has a count of how many names are pointing to it.
- When a name is assigned to an object, its count increases.
- When a name is reassigned or goes out of scope, the count for the old object decreases.
- When an object's reference count drops to zero, it is immediately deallocated, and its memory is freed.
a = []  # A list is created.          ref_count = 1
b = a   # Another name points to it.  ref_count = 2
del a   # 'a' is deleted.             ref_count = 1
# The list is not deleted yet because 'b' still points to it.
del b   # 'b' is deleted.             ref_count = 0
# Now, the list is garbage collected and its memory is freed.
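In CPython you can observe reference counts with sys.getrefcount. A sketch (note that the call itself temporarily adds one reference, so the reported number is one higher than you might expect, and exact counts are an implementation detail):

```python
import sys

data = []                      # one reference: the name 'data'
print(sys.getrefcount(data))   # 2 -- 'data' plus the temporary argument reference

alias = data                   # a second name now points to the same list
print(sys.getrefcount(data))   # 3

del alias                      # drop the second reference
print(sys.getrefcount(data))   # back to 2
```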
-
Generational Garbage Collector: Reference counting is fast but has a flaw: it can't handle reference cycles (e.g., two objects that point to each other, so their reference count never reaches zero).
The generational GC solves this. It divides objects into three "generations" (0, 1, and 2). New objects start in Gen 0. The GC frequently scans Gen 0. If an object survives a few collection cycles, it's "promoted" to an older generation (Gen 1, then Gen 2). Older generations are scanned less frequently. This approach is much more efficient than scanning all objects all the time.
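A reference cycle that only the cyclic collector can reclaim can be demonstrated with the gc module. A sketch (gc.collect() returns the number of unreachable objects it found, which varies by interpreter state):

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

# Build a two-object reference cycle.
a = Node()
b = Node()
a.partner = b
b.partner = a

# Drop our names. Reference counting alone cannot free the pair:
# each object still holds a reference to the other, so neither
# count ever reaches zero.
del a, b

# The generational collector detects the cycle and reclaims it.
unreachable = gc.collect()
print(f"Collected {unreachable} unreachable objects")
```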
Part 2: The Concrete Memory Model (CPython's Implementation & Threading)
This is where the memory model gets serious. When you use multiple threads, you need to know how they interact with memory to avoid race conditions and bugs.
The Global Interpreter Lock (GIL)
This is the single most important concept for understanding Python's concurrency model.
- What it is: The GIL is a mutex (a lock) that protects access to Python objects, preventing multiple native threads from executing Python bytecode at the same time within a single process.
- Why it exists: It simplifies memory management. Because of the GIL, only one thread can execute Python bytecode at any given moment. This means that the reference counting mechanism (and other C API calls) is inherently thread-safe. Without the GIL, you'd need much more complex locking mechanisms to prevent race conditions on every single object, which would cripple performance for single-threaded code.
The GIL is not about protecting your data; it's about protecting Python's internal data structures.
Implications of the GIL
- True Parallelism is Limited for CPU-bound Tasks: For tasks that are heavy on CPU computation (e.g., mathematical calculations, image processing), the GIL is a bottleneck. Even if you have multiple CPU cores, only one thread can run at a time. The Python interpreter will switch between threads, but this doesn't give you a speedup; it can even add a small overhead.
# This will NOT run faster on a multi-core CPU due to the GIL
import threading

def count(n):
    while n > 0:
        n -= 1

t1 = threading.Thread(target=count, args=(50_000_000,))
t2 = threading.Thread(target=count, args=(50_000_000,))
t1.start()
t2.start()
t1.join()
t2.join()
- I/O-Bound Tasks Can Benefit: For tasks that spend most of their time waiting for external resources (e.g., network requests, disk I/O), the GIL is released. While one thread is waiting for a network response, the GIL is freed, and another thread can run. This makes threading very effective for I/O-bound applications.
- How to Achieve Parallelism: If you need to speed up CPU-bound tasks in Python, you must use a process-based approach, which bypasses the GIL.
  - multiprocessing Module: Creates separate processes, each with its own Python interpreter and memory space. Communication between processes is more expensive (e.g., via pipes or queues), but they can run on different CPU cores in true parallel.
  - concurrent.futures.ProcessPoolExecutor: A high-level interface for using the multiprocessing module.
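To make the I/O-bound case concrete, here is a small sketch that uses time.sleep as a stand-in for a blocking network call. The waits overlap because a sleeping thread releases the GIL, so four half-second "calls" finish in roughly half a second instead of two:

```python
import threading
import time

def fake_io_call(seconds):
    # time.sleep releases the GIL while waiting, just like real I/O does.
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io_call, args=(0.5,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Roughly 0.5s total, not 4 * 0.5 = 2.0s.
print(f"Elapsed: {elapsed:.2f}s")
```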
Memory Visibility and the threading Module
Even with the GIL, you can still have race conditions. Consider this classic example:
import threading
counter = 0
lock = threading.Lock() # created here, but not used below -- that's the bug we'll fix
def increment():
global counter
for _ in range(1_000_000):
# This is NOT an atomic operation!
# It involves: 1. Read counter, 2. Add 1, 3. Write counter back
counter += 1
threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads:
t.start()
for t in threads:
t.join()
print(f"Final counter value: {counter}") # Likely not 2,000,000!
Why does this fail? The operation is not "atomic." Here's what can happen:
- Thread A reads counter (value is 0).
- Thread B reads counter (value is 0).
- Thread A calculates 0 + 1 = 1.
- Thread B calculates 0 + 1 = 1.
- Thread A writes 1 back to counter.
- Thread B writes 1 back to counter.
The final value is 1, not 2. This is a race condition.
The Solution: Locks
To fix this, you use a threading.Lock. A lock ensures that only one thread can execute a specific block of code at a time. The GIL is released while a thread is waiting to acquire a lock, allowing other threads to run.
import threading
counter = 0
lock = threading.Lock() # Create a lock
def increment_with_lock():
global counter
for _ in range(1_000_000):
with lock: # Acquire the lock, execute block, release lock automatically
counter += 1
threads = [threading.Thread(target=increment_with_lock) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"Final counter value: {counter}") # Always 2,000,000
Now, the counter += 1 operation is effectively atomic: if Thread A is inside the with lock: block, Thread B is forced to wait until Thread A releases the lock.
Summary Table
| Concept | Description | Key Implications |
|---|---|---|
| Names & Objects | Names are labels pointing to objects in memory. | a = b means a and b point to the same object. Be careful with mutable objects. |
| Immutability | Objects like int, str cannot be changed. Operations create new objects. | Reassigning an immutable variable changes its id. |
| Mutability | Objects like list, dict can be changed in-place. | Modifying a mutable object does not change its id. Watch out for shared mutable state. |
| Garbage Collection | Automatic reclamation of memory for objects with zero references. | Uses reference counting (fast) and a generational GC (to handle cycles). |
| Global Interpreter Lock (GIL) | A mutex that allows only one thread to execute Python bytecode at a time. | CPU-bound tasks don't get a speedup from threading. I/O-bound tasks do. Use multiprocessing for CPU-bound parallelism. |
| Race Conditions | When multiple threads access shared data and try to change it simultaneously, leading to unpredictable results. | Can occur even with the GIL because operations like counter += 1 are not atomic. The solution is to use threading.Lock. |
