
How does Python's portalocker implement file locking?

portalocker is a small Python library for file locking. It provides a cross-platform, easy-to-use interface to the operating system's file locking mechanisms.

It helps prevent race conditions when multiple processes need to read from and write to the same file at the same time.

What is File Locking and Why is it Needed?

Imagine two processes, Process A and Process B, both trying to write to a log file app.log.

  1. Process A opens app.log, reads the current content, and prepares to append a new log entry.
  2. Process B also opens app.log, reads the same current content (because Process A hasn't saved its changes yet), and prepares to append its own log entry.
  3. Process A writes its entry and saves the file.
  4. Process B writes its entry and saves the file, overwriting what Process A just did.

The result? Data loss. Process A's log entry is gone.
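The four steps above can be replayed deterministically in a single script. This is only a sketch: the two "processes" are simulated by two stale snapshots of the file, and the temp-directory path and the helper names (`unsafe_append`, `simulate_lost_update`) are made up for illustration.

```python
import os
import tempfile


def unsafe_append(path, snapshot, entry):
    """Write back a previously read snapshot plus one new entry (no locking)."""
    with open(path, "w") as f:
        f.write(snapshot + entry + "\n")


def simulate_lost_update():
    # Hypothetical stand-in for app.log, created in a temp directory.
    path = os.path.join(tempfile.mkdtemp(), "app.log")
    with open(path, "w") as f:
        f.write("startup\n")

    # Steps 1 and 2: both "processes" read the same current content.
    with open(path) as f:
        snapshot_a = f.read()
    with open(path) as f:
        snapshot_b = f.read()

    # Steps 3 and 4: each writes back its own stale copy plus its entry.
    unsafe_append(path, snapshot_a, "entry from A")
    unsafe_append(path, snapshot_b, "entry from B")

    with open(path) as f:
        return f.read().splitlines()


lines = simulate_lost_update()
print(lines)  # ['startup', 'entry from B']
```

The entry from A never reaches the final file, because B wrote back a snapshot taken before A saved.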

File locking solves this. When a process "locks" a file, it tells the operating system, "I am working on this file. No other process should be able to write to it until I'm done and release the lock." If Process B tries to lock the file while Process A holds the lock, it will either be forced to wait (a "blocking" lock) or will receive an error immediately (a "non-blocking" lock). Note that on Unix-like systems these locks are advisory: they only protect the file if every process that touches it also asks for the lock.

The portalocker Library

portalocker simplifies this by providing a few key functions that work on Windows, macOS, and Linux.

Key Functions

  • portalocker.lock(file, flags): The core function. Acquires a lock on a file object.
  • portalocker.unlock(file): Releases a lock on a file object.
  • portalocker.Lock(filename, mode='a', timeout=None, ...): A convenience class (note the capital L) that opens the file and handles the locking and unlocking for you when used in a with statement.

Lock Flags

The flags argument is crucial and determines the type of lock:

  • portalocker.LOCK_EX: Exclusive Lock. Only one process can hold this lock at a time. This is what you need for writing to a file to prevent data corruption.
  • portalocker.LOCK_SH: Shared Lock. Multiple processes can hold this lock simultaneously. This is useful for reading. No process can acquire an exclusive lock until all shared locks held by other processes are released.
  • portalocker.LOCK_NB: Non-Blocking. If the lock cannot be acquired immediately, portalocker raises portalocker.LockException (recent versions raise its subclass AlreadyLocked) instead of waiting. This flag is combined with LOCK_EX or LOCK_SH.

A common combination is portalocker.LOCK_EX | portalocker.LOCK_NB for an exclusive, non-blocking lock.


Installation

First, you need to install the library. It's available on PyPI.

pip install portalocker

Practical Examples

Let's look at some common scenarios.

Example 1: Simple Exclusive Lock (Blocking)

This is the most common use case: ensuring only one process can write to a file at a time. Opening the file in a with block is the key habit here: leaving the block closes the file, and closing the file releases the lock even if an error occurs.

# writer_process.py
import os
import random
import time

import portalocker

def write_to_file(filename, data):
    try:
        # 'a' for append mode, creates the file if it doesn't exist
        with open(filename, 'a') as f:
            # Blocking exclusive lock: waits until no other process holds a lock
            portalocker.lock(f, portalocker.LOCK_EX)
            print(f"Process {os.getpid()} acquired the lock.")
            # Simulate some work
            time.sleep(random.uniform(0.1, 0.5))
            f.write(data + "\n")
            f.flush()  # Push Python's buffer to the OS while the lock is still held
            print(f"Process {os.getpid()} wrote to the file.")
        # Leaving the 'with' block closes the file, which releases the lock
        print(f"Process {os.getpid()} released the lock.")
    except Exception as e:
        print(f"Process {os.getpid()} failed: {e}")

if __name__ == '__main__':
    filename = 'shared_log.txt'
    write_to_file(filename, f"Log entry from {os.getpid()} at {time.time()}")

To see this in action, run this script from two different terminal windows almost simultaneously. You will see one process acquire the lock, write, and release it before the second one is allowed to proceed.

Example 2: Non-Blocking Lock

Sometimes, you don't want your program to wait. You want to know immediately if the resource is busy and then do something else (like try again later or exit).

# non_blocking_writer.py
import portalocker
import time
import os
def try_write(filename, data):
    try:
        with open(filename, 'a') as f:
            # Try to get an exclusive, non-blocking lock
            portalocker.lock(f, portalocker.LOCK_EX | portalocker.LOCK_NB)
            print(f"Process {os.getpid()} got the lock immediately!")
            f.write(data + "\n")
            f.flush()
    except (portalocker.LockException, BlockingIOError) as e:
        # This exception is raised if the lock cannot be acquired
        print(f"Process {os.getpid()} could NOT get the lock. File is busy. Error: {e}")
        # Here you could implement a retry logic
        # time.sleep(1)
        # try_write(filename, data)
if __name__ == '__main__':
    filename = 'shared_log.txt'
    try_write(filename, f"Urgent log from {os.getpid()} at {time.time()}")

If you run this while another process is holding the blocking lock from Example 1, this script does not wait: the lock attempt raises a LockException at once, and the script prints the "could NOT get the lock" message and exits.

Example 3: Shared vs. Exclusive Lock

This example demonstrates how shared locks allow multiple readers but block a writer.

# reader.py
import portalocker
import time
import os
def read_file(filename):
    try:
        with open(filename, 'r') as f:
            # Acquire a shared (LOCK_SH) lock
            portalocker.lock(f, portalocker.LOCK_SH)
            print(f"Reader {os.getpid()} acquired a SHARED lock.")
            content = f.read()
            print(f"Reader {os.getpid()} read: {content.strip()}")
            # Simulate reading for a while
            time.sleep(2)
    except Exception as e:
        print(f"Reader {os.getpid()} failed: {e}")
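if __name__ == '__main__':
    filename = 'shared_data.txt'
    # Create a dummy file only if it is missing, so that a second reader
    # does not truncate the file while the first one is still reading it
    if not os.path.exists(filename):
        with open(filename, 'w') as f:
            f.write("Initial data\n")
    read_file(filename)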

Now, imagine this scenario:

  1. Start two reader.py processes at the same time. They will both acquire a shared lock and can read the file simultaneously.
  2. While both readers are "reading" (i.e., sleeping), start the writer from Example 1 (writer_process.py), with its filename changed to shared_data.txt so that it targets the same file. The writer will be blocked and will have to wait until both readers have released their shared locks before it can acquire an exclusive lock.

Best Practices and Alternatives

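  1. Always use with: Open the file in a with block. Closing the file releases the lock, so the lock is dropped even if your code raises an exception. If you call portalocker.unlock() yourself, put it in a finally clause. This keeps a long-running process from holding a lock longer than intended and stalling every other process that waits on it.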

  2. Lock Granularity: Lock a small, dedicated "lock file" instead of the data file itself. This can be more efficient and prevents potential issues with some applications that might open the data file without respecting locks.

    # Good practice
    import portalocker

    DATA_FILE = 'data.csv'
    LOCK_FILE = 'data.csv.lock'
    with open(LOCK_FILE, 'w') as lock_f:
        portalocker.lock(lock_f, portalocker.LOCK_EX)
        # Now it's safe to read/write DATA_FILE
        with open(DATA_FILE, 'r+') as data_f:
            pass  # ... do work ...
    # The lock file is automatically closed and unlocked here

  3. Alternatives:

    • fcntl (Unix-only): The standard library module for file locking on Unix-like systems. It's more powerful but not portable to Windows.
    • msvcrt (Windows-only): The standard library module for file locking on Windows. Again, not portable.
    • threading.Lock: This is for locking resources within a single process between different threads. It does not work for inter-process communication. portalocker is the correct tool for multi-process locking.
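For comparison, here is roughly what the Unix-only route looks like with the standard library's fcntl module. This is a sketch that runs only on Unix-like systems; the helper name `try_exclusive_flock` and the temp-directory lock file are made up for the demo, and the two handles stand in for two competing processes.

```python
import fcntl
import os
import tempfile


def try_exclusive_flock(f):
    """Attempt a non-blocking exclusive flock(); return True if acquired."""
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return False
    return True


# Hypothetical lock file in a temp directory.
path = os.path.join(tempfile.mkdtemp(), "data.csv.lock")

with open(path, "w") as first, open(path, "w") as second:
    got_first = try_exclusive_flock(first)
    got_second = try_exclusive_flock(second)  # conflicts with `first`
    fcntl.flock(first, fcntl.LOCK_UN)         # explicit unlock
    got_second_after_release = try_exclusive_flock(second)

print(got_first, got_second, got_second_after_release)
```

The flag names mirror the portalocker ones, which makes it easy to see what portalocker adds: the same calling style, plus a Windows backend and its own exception types.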

Summary

Feature          | Description
What it is       | A simple, cross-platform library for file locking.
Primary Use Case | Preventing race conditions when multiple processes access the same file.
Key Function     | portalocker.lock(file, flags)
Best Practice    | Lock inside a with block: either with open(...) plus portalocker.lock(...), or the portalocker.Lock(...) context manager.
Common Flags     | LOCK_EX (exclusive/write), LOCK_SH (shared/read), LOCK_NB (non-blocking).
Installation     | pip install portalocker