杰瑞科技汇

executemany python

Of course! Let's dive deep into executemany in Python, a very useful method for database operations.

executemany python-图1
(图片来源网络,侵删)

What is executemany?

executemany() is a method provided by Python's database cursor objects (from libraries like sqlite3, psycopg2, mysql-connector-python, etc.). Its primary purpose is to execute a single SQL statement multiple times, with different sets of parameters for each execution.

Think of it as a highly efficient way to perform a bulk insert or update operation.


Why Use executemany? (The Key Benefit: Performance)

The main reason to use executemany is for performance optimization.

Consider the alternative: inserting 1000 rows into a database.

executemany python-图2
(图片来源网络,侵删)

The Inefficient Way (Using a Loop):

# Inefficient: Making 1000 separate database calls
for i in range(1000):
    cursor.execute("INSERT INTO products (name, price) VALUES (?, ?)", 
                   (f"Product {i}", i * 10.50))
connection.commit()

In this approach, your Python code communicates with the database 1000 times. Each communication (network round-trip) has overhead, making this process very slow.

The Efficient Way (Using executemany):

# Efficient: Making 1 database call for 1000 rows
data_to_insert = [
    (f"Product {i}", i * 10.50) 
    for i in range(1000)
]
cursor.executemany("INSERT INTO products (name, price) VALUES (?, ?)", 
                   data_to_insert)
connection.commit()

Here, you package all 1000 rows of data into a list and send it to the database in a single call. The database can then process all the inserts in one optimized transaction. This can be orders of magnitude faster than the loop approach.

executemany python-图3
(图片来源网络,侵删)

How to Use executemany: A Step-by-Step Example

Let's use Python's built-in sqlite3 module for a clear, dependency-free example.

Step 1: Setup and Connection

First, we'll create a connection to a SQLite database and a table to work with.

import sqlite3
# Connect to the database (it will be created if it doesn't exist)
connection = sqlite3.connect('company.db')
cursor = connection.cursor()
# Create a table for employees
cursor.execute('''
    CREATE TABLE IF NOT EXISTS employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department TEXT NOT NULL,
        salary REAL
    )
''')
connection.commit()

Step 2: Prepare the Data

executemany requires a list (or any iterable) of tuples or sequences. Each tuple represents one row of data to be inserted. The order of the values in the tuple must match the order of the placeholders ( or %s) in your SQL statement.

# A list of employee records to insert
new_employees = [
    ('Alice', 'Engineering', 95000.00),
    ('Bob', 'Marketing', 75000.00),
    ('Charlie', 'Sales', 80000.00),
    ('Diana', 'Engineering', 110000.00),
    ('Eve', 'HR', 72000.00)
]

Step 3: Execute the executemany Statement

Now, we call executemany on our cursor.

# The SQL statement with a placeholder for each value
sql_insert_query = "INSERT INTO employees (name, department, salary) VALUES (?, ?, ?)"
# Execute the query for all employee records in the list
cursor.executemany(sql_insert_query, new_employees)
# Commit the transaction to save the changes to the database
connection.commit()
print(f"{cursor.rowcount} records inserted successfully into employees table.")

Explanation:

  • sql_insert_query: The SQL command with as placeholders. This is the same command that will be executed repeatedly.
  • new_employees: The list of data. executemany will iterate through this list.
  • For each tuple in new_employees (e.g., ('Alice', 'Engineering', 95000.00)), it will substitute the values into the placeholders and execute the command.
  • cursor.rowcount tells you how many rows were affected by the last operation. For executemany, this is the total number of rows processed.

Step 4: Verify and Close

Let's check that the data was inserted correctly and then close the connection.

# Verify the data
cursor.execute("SELECT * FROM employees")
rows = cursor.fetchall()
print("\n--- All Employees ---")
for row in rows:
    print(row)
# Close the connection
connection.close()

Important Considerations and Best Practices

  1. SQL Injection:

    • Always use placeholders ( for sqlite3, %s for psycopg2/mysql-connector).
    • NEVER use Python's string formatting (f-strings, .format(), operator) to insert variables directly into your SQL string. This is a massive security risk that can lead to SQL injection attacks.
    # ❌ DANGEROUS - DO NOT DO THIS
    name = "Alice"
    # cursor.execute(f"INSERT INTO employees (name) VALUES ('{name}')") 
    # ✅ SAFE - The correct way
    cursor.execute("INSERT INTO employees (name) VALUES (?)", (name,))
  2. Committing the Transaction:

    • Don't forget to call connection.commit() after your executemany call. If you don't, the changes will not be saved to the database. They will be rolled back when the connection is closed.
  3. Handling Errors:

    • If one of the insert operations fails (e.g., due to a duplicate key constraint violation), the behavior can vary by database driver.
    • Some drivers might stop and raise an error after the first failure. Others might continue and then report how many were successful.
    • For more complex error handling, you might need to wrap your executemany call in a try...except block and manage transactions manually (connection.begin(), connection.rollback()).
  4. Alternatives:

    • executemany vs. executemany with executemanymany: Some database drivers (like psycopg2) have an executemanymany method for even more advanced bulk operations, often with copy protocols that are even faster.
    • Bulk Insert SQL: Many databases (like PostgreSQL, MySQL) have their own highly optimized SQL commands for bulk loading (e.g., COPY in PostgreSQL, LOAD DATA INFILE in MySQL). These are often the fastest possible method but are database-specific.

executemany vs. executemany

This is a common point of confusion.

Feature cursor.execute(sql, params) cursor.executemany(sql, seq_of_params)
Purpose Executes a single SQL statement. Executes a single SQL statement multiple times.
Parameters Takes one set of parameters (a tuple or dict). Takes a sequence (list, tuple) of parameter sets.
Use Case Inserting/updating one row, running a query with one set of filters. Inserting/updating many rows in a single, efficient call.
cursor.rowcount Number of rows affected by that one execution. Total number of rows affected by all executions.

Analogy:

  • execute: Firing a single, precise arrow at a target.
  • executemany: Loading a quiver with many arrows and firing them all at once at the same target.

Summary

Method When to Use Performance
cursor.execute() in a loop For a very small number of operations, or when each operation is unique. Poor. High overhead from multiple network trips.
cursor.executemany() The standard, recommended way for bulk inserts/updates of similar data. Excellent. Minimizes network trips and is highly optimized.
Database-specific bulk commands (e.g., COPY) When you need the absolute maximum performance for very large datasets. The absolute best. Often bypasses much of the standard SQL overhead.

For almost all general-purpose bulk data operations in Python, executemany is the perfect tool for the job.

分享:
扫描分享到社交APP
上一篇
下一篇