Python executemany如何高效批量插入数据？-杰瑞科技汇

Of course! Let's dive deep into executemany() in Python, specifically when used with the sqlite3 module. This is a fundamental concept for efficient database operations.

（图片来源网络，侵删）

What is `executemany()`?

executemany() is a method provided by database cursor objects in Python's database API (like sqlite3, psycopg2 for PostgreSQL, mysql-connector-python for MySQL, etc.). Its primary purpose is to execute a single SQL statement multiple times, with different parameters for each execution.

This is the standard, high-performance way to insert, update, or delete multiple rows in a database.

The Core Problem: Why Not Use a `for` Loop with `execute()`?

Imagine you want to insert 1,000 users into a database. You could do this:

# The INEFFICIENT way
import sqlite3
conn = sqlite3.connect('mydatabase.db')
cursor = conn.cursor()
users = [
    ('Alice', 30),
    ('Bob', 25),
    ('Charlie', 35),
    # ... 997 more users
]
for user in users:
    cursor.execute("INSERT INTO users (name, age) VALUES (?, ?)", user)
conn.commit()
conn.close()

This approach works, but it's very slow for a few key reasons:

（图片来源网络，侵删）

Network Overhead (Client-Server): If your database is on a different server, each execute() call involves a round-trip over the network. For 1,000 users, that's 1,000 network trips, which is extremely slow.
Server Overhead (Local DB): Even with a local database like SQLite, the database server has to parse, compile, and plan the execution of the SQL statement 1,000 separate times. This creates significant overhead.

The Solution: `executemany()`

executemany() solves these problems by batching the operations. It sends the data to the database server in a single operation.

The syntax is:

cursor.executemany(sql, seq_of_parameters)

sql: The SQL statement string. It must contain placeholders for the data.
seq_of_parameters: A sequence (like a list or tuple) of sequences (like tuples or lists) where each inner sequence contains the parameters for one execution.

Example: Inserting Multiple Users with `executemany()`

Let's use the same example of inserting users, but this time the correct and efficient way.

Setup the Database

First, let's create a database table to work with.

import sqlite3
# Connect to the database (it will be created if it doesn't exist)
conn = sqlite3.connect('users.db')
cursor = conn.cursor()
# Create a table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        age INTEGER
    )
''')
# Commit the schema change
conn.commit()
print("Table 'users' is ready.")

Use `executemany()` to Insert Data

Now, let's insert a list of users efficiently.

# Data to be inserted
users_to_insert = [
    ('Alice', 30),
    ('Bob', 25),
    ('Charlie', 35),
    ('David', 28),
    ('Eve', 40)
]
# The SQL statement with placeholders
sql_insert = "INSERT INTO users (name, age) VALUES (?, ?)"
# Use executemany to insert all users at once
try:
    cursor.executemany(sql_insert, users_to_insert)
    conn.commit()
    print(f"{cursor.rowcount} records inserted successfully.")
except sqlite3.Error as e:
    print(f"An error occurred: {e}")
finally:
    # Close the connection
    conn.close()

Key Takeaways from this example:

Efficiency: The entire executemany operation is typically handled in a single communication with the database.
cursor.rowcount: After the operation, cursor.rowcount tells you how many rows were affected. In this case, it will be 5.
Placeholder Style: We used which is the standard placeholder style for sqlite3. Other database drivers may use different styles, like %s (which is a common convention, though not a Python string formatting directive in this context) or named placeholders (name).

`executemany()` vs. `executemany()` with a Transaction

For maximum performance, especially with a very large number of inserts, you should wrap the executemany() call in a transaction. A transaction groups multiple database operations into a single unit.

By default, sqlite3 runs in "auto-commit" mode for DML statements (like INSERT), meaning each executemany() is its own transaction. We can manually control this for better performance.

import sqlite3
import time
# --- Setup (same as before) ---
conn = sqlite3.connect('users_fast.db')
cursor = conn.cursor()
cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        age INTEGER
    )
''')
conn.commit()
# --- End Setup ---
# Generate a large amount of dummy data
print("Generating 10,000 dummy records...")
large_dataset = [(f'User_{i}', i % 100) for i in range(10000)]
print("Generation complete.")
# --- Method 1: executemany with auto-commit (slower) ---
print("\n--- Method 1: executemany with auto-commit ---")
start_time = time.time()
cursor.executemany("INSERT INTO users (name, age) VALUES (?, ?)", large_dataset)
conn.commit() # This commit is for the single executemany call
end_time = time.time()
print(f"Inserted {cursor.rowcount} records in {end_time - start_time:.4f} seconds.")
# --- Method 2: executemany inside a manual transaction (faster) ---
print("\n--- Method 2: executemany inside a manual transaction ---")
start_time = time.time()
# Start a transaction
conn.execute("BEGIN TRANSACTION")
# Perform the operation
cursor.executemany("INSERT INTO users (name, age) VALUES (?, ?)", large_dataset)
# Commit the transaction
conn.commit()
end_time = time.time()
print(f"Inserted {cursor.rowcount} records in {end_time - start_time:.4f} seconds.")
conn.close()

When you run this, you will see a significant performance boost with Method 2. This is because the database engine can optimize the entire batch of inserts as a single atomic unit, drastically reducing disk I/O and transaction log overhead.

Advanced: Customizing `executemany` with a Generator

What if your data isn't in a neat list of tuples? executemany is flexible. You can pass any iterable, including a generator function.

Let's say you have a function that yields user data:

def generate_users(num_users):
    """A generator function that yields user data."""
    for i in range(num_users):
        yield (f'GeneratedUser_{i}', i % 50)
# Use the generator with executemany
conn = sqlite3.connect('users_generator.db')
cursor = conn.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER)")
# executemany will consume the generator one item at a time
cursor.executemany("INSERT INTO users (name, age) VALUES (?, ?)", generate_users(100))
conn.commit()
print(f"Inserted {cursor.rowcount} records from a generator.")
conn.close()

This pattern is very powerful for processing large datasets from files or other sources without loading everything into memory at once.

Summary

Feature	`execute()` in a loop	`executemany()`
Performance	Slow. High network/server overhead.	Fast. Batches operations into a single call.
Use Case	For running many different SQL statements.	For running the same SQL statement with many different parameters.
Code Style	More verbose.	More concise and readable for bulk operations.
Transactions	Each `execute()` is often a separate transaction.	Can be wrapped in a single transaction for maximum speed.
Flexibility	Can handle any SQL string in each iteration.	Requires the SQL statement to be the same for all executions.

In short, whenever you need to perform the same SQL operation (especially INSERT, UPDATE, DELETE) on multiple sets of data, always prefer executemany(). It is the idiomatic, efficient, and professional way to handle bulk database operations in Python.

Python executemany如何高效批量插入数据？

What is `executemany()`?

The Core Problem: Why Not Use a `for` Loop with `execute()`?

The Solution: `executemany()`

Example: Inserting Multiple Users with `executemany()`

Setup the Database

Use `executemany()` to Insert Data

`executemany()` vs. `executemany()` with a Transaction

Advanced: Customizing `executemany` with a Generator

Summary

99ANYc3cd6

Java byte转int为何需补码处理？

如何提升C语言程序设计能力？

Python SmartQQBot怎么用？

Win8安装虚拟机教程步骤有哪些？

魅蓝note2刷机教程难不难？

Java Socket编程实例具体怎么实现？

java的视频教程下载百度云

联想B480拆机教程详细吗？步骤清晰吗？

Java设置cookie时间怎么设置？

FlowJo10中文教程该怎么学？

java expression

Java session 丢失常见原因是什么？

Python中如何设置Cookie？

java list for循环

会声会影x4教程视频下载

amesim13安装教程

Python executemany如何高效批量插入数据？

What is executemany()?

The Core Problem: Why Not Use a for Loop with execute()?

The Solution: executemany()

Example: Inserting Multiple Users with executemany()

Setup the Database

Use executemany() to Insert Data

executemany() vs. executemany() with a Transaction

Advanced: Customizing executemany with a Generator

Summary

相关推荐

Java Socket编程实例具体怎么实现？

What is `executemany()`?

The Core Problem: Why Not Use a `for` Loop with `execute()`?

The Solution: `executemany()`

Example: Inserting Multiple Users with `executemany()`

Use `executemany()` to Insert Data

`executemany()` vs. `executemany()` with a Transaction

Advanced: Customizing `executemany` with a Generator