Of course! The cv2.warpAffine function is a cornerstone of image processing in Python using the OpenCV library. It's used to perform a 2D affine transformation on an image.
Let's break it down, from the high-level concept to practical examples.
What is an Affine Transformation?
An affine transformation is a geometric transformation that maps straight lines to straight lines and keeps parallel lines parallel. It does not necessarily preserve angles or lengths. It can be composed of four basic operations:
- Translation (Shifting): Moving an image without rotating or scaling it.
- Rotation: Turning an image around a specific point.
- Scaling (Resizing): Making an image larger or smaller.
- Shearing: Slanting an image so that rectangles become parallelograms.
Any combination of these operations (like rotating and then scaling) is also an affine transformation.
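The key takeaway is that parallel lines remain parallel after an affine transformation. This makes it a good fit for tasks like deskewing scanned documents or aligning images. Correcting true perspective distortion, where parallel lines converge, is not an affine operation and needs a perspective transform (cv2.warpPerspective) instead.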
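As a quick numerical check of the parallel-lines property, the sketch below applies an arbitrary 2x3 matrix to two parallel segments using plain NumPy. The matrix values and the helper name apply_affine are made up for illustration, and nothing here calls OpenCV.

```python
import numpy as np

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of (x, y) points."""
    pts = np.asarray(pts, dtype=np.float64)
    return pts @ M[:, :2].T + M[:, 2]

# Arbitrary matrix mixing scale, shear and translation (illustrative values)
M = np.array([[1.2, 0.5, 10.0],
              [-0.3, 0.9, 20.0]])

# Two parallel segments: both have direction (4, 2)
seg_a = np.array([[0.0, 0.0], [4.0, 2.0]])
seg_b = np.array([[1.0, 5.0], [5.0, 7.0]])

# Direction of each segment after the transformation
dir_a = np.diff(apply_affine(M, seg_a), axis=0)[0]
dir_b = np.diff(apply_affine(M, seg_b), axis=0)[0]

# 2D cross product of the two directions; zero means they are still parallel
cross = dir_a[0] * dir_b[1] - dir_a[1] * dir_b[0]
print("direction a:", dir_a, "direction b:", dir_b, "cross:", cross)
```

The segments change length and orientation, but the cross product of their directions stays at zero (up to floating-point error), which is the parallelism guarantee in numbers.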
The cv2.warpAffine Function
The function signature is:
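dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])
Note that the fourth positional argument is an optional preallocated output array (dst), so it is safest to pass flags, borderMode, and borderValue as keyword arguments.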
Let's look at the most important parameters:
| Parameter | Type | Description |
|---|---|---|
| src | numpy.ndarray | The source (input) image. |
| M | numpy.ndarray | The 2x3 transformation matrix (float32 or float64). This is the most crucial part. It defines the rotation, scaling, shearing, and translation. |
| dsize | (int, int) | The size of the output image as a tuple (width, height). Note the order: it is the reverse of img.shape[:2], which is (height, width). It is always required because warpAffine does not compute the output size for you; after rotation or scaling you must choose a canvas large enough to hold the result. |
| flags | int | (Optional) Interpolation method. Common values are cv2.INTER_LINEAR (default, good balance of quality and speed), cv2.INTER_NEAREST (fastest, lowest quality), and cv2.INTER_CUBIC (slower, higher quality). |
| borderMode | int | (Optional) Pixel extrapolation method: how to fill output pixels that map outside the source image. The default is cv2.BORDER_CONSTANT. |
| borderValue | tuple | (Optional) The value used with cv2.BORDER_CONSTANT, e.g., (0,0,0) for black, which is also the default. |
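To make the roles of M, dsize, and the border value concrete, here is a deliberately simplified NumPy sketch of what an affine warp does conceptually. For every pixel of the output canvas, it maps back through the inverse of M, copies the nearest source pixel, and fills with a constant where the lookup falls outside the source. The function name warp_affine_nearest is hypothetical, and this is not OpenCV's actual implementation, which is optimized and supports several interpolation and border modes.

```python
import numpy as np

def warp_affine_nearest(src, M, dsize, border_value=0):
    """Toy affine warp: nearest-neighbor sampling with a constant border."""
    out_w, out_h = dsize                      # dsize is (width, height)
    # Extend M to 3x3 so it can be inverted (output -> source mapping)
    M3 = np.vstack([M, [0.0, 0.0, 1.0]])
    M_inv = np.linalg.inv(M3)

    # Coordinates of every output pixel, mapped back into the source image
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    src_x = M_inv[0, 0] * xs + M_inv[0, 1] * ys + M_inv[0, 2]
    src_y = M_inv[1, 0] * xs + M_inv[1, 1] * ys + M_inv[1, 2]
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)

    # Only copy pixels whose source location lies inside the image
    h, w = src.shape[:2]
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    dst = np.full((out_h, out_w) + src.shape[2:], border_value, dtype=src.dtype)
    dst[inside] = src[src_y[inside], src_x[inside]]
    return dst

# Tiny grayscale test image with a single bright pixel at (x=1, y=0)
img = np.zeros((3, 4), dtype=np.uint8)
img[0, 1] = 255

# Shift right by 2 and down by 1, keeping the same canvas size
M = np.array([[1, 0, 2], [0, 1, 1]], dtype=np.float64)
shifted = warp_affine_nearest(img, M, (4, 3))
print(shifted)
```

The bright pixel ends up at (x=3, y=1), and every output pixel with no source counterpart takes the border value. Working from output pixels backwards is why the canvas size has to be known up front.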
The Transformation Matrix M
The M matrix is a 2x3 matrix that looks like this:
M = [ m00  m01  m02 ]
    [ m10  m11  m12 ]
This matrix combines linear transformations (scaling, rotation, shearing) and translation.
| Component | Transformation |
|---|---|
| [ m00, m01 ] | X-axis transformation (scaling, rotation, shearing) |
| [ m10, m11 ] | Y-axis transformation (scaling, rotation, shearing) |
| [ m02, m12 ] | Translation (shift in X and Y) |
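Written out, each output coordinate is a weighted sum of the input coordinates plus a shift: x' = m00*x + m01*y + m02 and y' = m10*x + m11*y + m12. The small sketch below applies that formula to a single point for three simple matrices. The helper name map_point and the matrix values are illustrative only.

```python
import numpy as np

def map_point(M, x, y):
    """Return (x', y') for one point under a 2x3 affine matrix M."""
    x_new = M[0, 0] * x + M[0, 1] * y + M[0, 2]
    y_new = M[1, 0] * x + M[1, 1] * y + M[1, 2]
    return float(x_new), float(y_new)

translate = np.float32([[1, 0, 50], [0, 1, 30]])   # shift by (50, 30)
scale = np.float32([[2, 0, 0], [0, 0.5, 0]])       # twice as wide, half as tall
shear = np.float32([[1, 0.5, 0], [0, 1, 0]])       # x shifts in proportion to y

print(map_point(translate, 10, 20))  # (60.0, 50.0)
print(map_point(scale, 10, 20))      # (20.0, 10.0)
print(map_point(shear, 10, 20))      # (20.0, 20.0)
```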
You rarely need to work out the entries of this matrix from scratch. OpenCV has helper functions for the harder cases, and the simple cases follow fixed patterns.
How to Create the Transformation Matrix M
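OpenCV provides cv2.getRotationMatrix2D for rotation and cv2.getAffineTransform for a mapping defined by three point pairs. Translation and scaling matrices are simple enough to write directly with NumPy.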
Translation (Shifting)
You build the translation matrix directly as a np.float32 array.
import cv2
import numpy as np
# Read an image
img = cv2.imread('image.jpg')
# Define the shift amount (x, y)
shift_x = 50 # Shift right by 50 pixels
shift_y = 30 # Shift down by 30 pixels
# Create the translation matrix M
# M = [ [1, 0, tx], [0, 1, ty] ]
M = np.float32([[1, 0, shift_x], [0, 1, shift_y]])
# Get the image dimensions
h, w = img.shape[:2]
# Apply the translation
# The output image size is the same as the input
translated_img = cv2.warpAffine(img, M, (w, h))
# Display the result
cv2.imshow('Original', img)
cv2.imshow('Translated', translated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Rotation
For rotation, you need to specify:
- center: The point around which to rotate (e.g., the image center).
- angle: The rotation angle in degrees. Positive values rotate counter-clockwise.
- scale: An isotropic scaling factor (use 1.0 to keep the original size).
# Read an image
img = cv2.imread('image.jpg')
# Get image dimensions
h, w = img.shape[:2]
# Define the rotation center (e.g., the center of the image)
center = (w // 2, h // 2)
# Define the rotation angle and scale
angle = 45 # degrees
scale = 1.0
# Get the rotation matrix M
# cv2.getRotationMatrix2D(center, angle, scale)
M = cv2.getRotationMatrix2D(center, angle, scale)
# The bounding box grows after rotation,
# so we calculate the new canvas size
new_w = int(w * abs(np.cos(np.radians(angle))) + h * abs(np.sin(np.radians(angle))))
new_h = int(h * abs(np.cos(np.radians(angle))) + w * abs(np.sin(np.radians(angle))))
# Shift the result so it stays centered in the larger canvas;
# without this step the corners are still cut off
M[0, 2] += new_w / 2 - center[0]
M[1, 2] += new_h / 2 - center[1]
# Apply the rotation
rotated_img = cv2.warpAffine(img, M, (new_w, new_h))
# Display the result
cv2.imshow('Original', img)
cv2.imshow('Rotated', rotated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
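For reference, the OpenCV documentation defines the matrix returned by cv2.getRotationMatrix2D in terms of alpha = scale * cos(angle) and beta = scale * sin(angle). The NumPy sketch below rebuilds that formula without calling OpenCV and checks two properties: the center stays fixed, and a positive angle turns counter-clockwise on screen. The function name rotation_matrix_2d is hypothetical.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """Rebuild the documented getRotationMatrix2D formula with NumPy."""
    cx, cy = center
    theta = np.radians(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha,  beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

center = (100.0, 50.0)
M = rotation_matrix_2d(center, 90, 1.0)

# The rotation center maps to itself
mapped_center = M[:, :2] @ np.array(center) + M[:, 2]

# A point 10 px to the right of the center ends up 10 px above it
# (counter-clockwise on screen, because the image y axis points down)
mapped_right = M[:, :2] @ np.array([110.0, 50.0]) + M[:, 2]

print(np.round(M, 3))
print(mapped_center, mapped_right)
```

The translation column is what keeps the chosen center in place. A plain rotation matrix with zero translation would rotate around the top-left corner (0, 0) instead.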
Scaling (Resizing)
You can use cv2.resize() for simple resizing, but warpAffine is useful when you want to combine scaling with other transformations. To just scale, you create a matrix.
# Read an image
img = cv2.imread('image.jpg')
# Get image dimensions
h, w = img.shape[:2]
# Define the scaling factors (fx for width, fy for height)
fx = 1.5 # Scale width by 1.5x
fy = 0.7 # Scale height by 0.7x
# Create the scaling matrix M
# M = [ [fx, 0, 0], [0, fy, 0] ]
M = np.float32([[fx, 0, 0], [0, fy, 0]])
# The output size is the original size multiplied by the scale factors
new_w = int(w * fx)
new_h = int(h * fy)
# Apply the scaling
scaled_img = cv2.warpAffine(img, M, (new_w, new_h))
# Display the result
cv2.imshow('Original', img)
cv2.imshow('Scaled', scaled_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Practical Example: Rotating an Image Around its Center
This is a very common task. The key is to calculate the new image size to avoid cropping.
import cv2
import numpy as np
# Load the image
img = cv2.imread('your_image.jpg')
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise SystemExit("Error: Could not load image.")
# Get image dimensions
(h, w) = img.shape[:2]
# 1. Define the rotation center (center of the image)
center = (w // 2, h // 2)
# 2. Get the rotation matrix
angle = -45 # Negative angle for clockwise rotation
scale = 1.0
M = cv2.getRotationMatrix2D(center, angle, scale)
# 3. Calculate the new image size to fit the rotated image
# Using trigonometry to find the new bounding box
cos = abs(M[0, 0])
sin = abs(M[0, 1])
new_w = int((h * sin) + (w * cos))
new_h = int((h * cos) + (w * sin))
# 4. Adjust the rotation matrix to account for the translation
# This ensures the rotated image is centered in the new canvas
M[0, 2] += (new_w / 2) - center[0]
M[1, 2] += (new_h / 2) - center[1]
# 5. Apply the affine transformation
rotated_img = cv2.warpAffine(img, M, (new_w, new_h))
# Display
cv2.imshow('Original Image', img)
cv2.imshow('Rotated Image', rotated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
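The cos/sin formula in step 3 is specific to rotation. A more general way to size the canvas, sketched here on the assumption that M may be any 2x3 affine matrix, is to transform the four image corners and take their bounding box. The helper name fit_canvas is hypothetical, and the sketch treats the image as a continuous w-by-h rectangle, so results can differ by a pixel from other conventions.

```python
import numpy as np

def fit_canvas(M, w, h):
    """Return (M_shifted, (new_w, new_h)) so the warped image fits entirely."""
    corners = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=np.float64)
    warped = corners @ M[:, :2].T + M[:, 2]

    # Bounding box of the transformed corners
    min_xy = warped.min(axis=0)
    max_xy = warped.max(axis=0)

    # Shift the transform so the top-left of the box lands on (0, 0)
    M_shifted = M.astype(np.float64).copy()
    M_shifted[:, 2] -= min_xy
    new_w, new_h = np.ceil(max_xy - min_xy).astype(int)
    return M_shifted, (int(new_w), int(new_h))

# 90-degree rotation about the origin, applied to a 400x300 image
M = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0]])
M_fit, dsize = fit_canvas(M, 400, 300)
print(M_fit)
print(dsize)  # (300, 400)
```

The returned matrix and size can then be passed to cv2.warpAffine in place of the hand-adjusted M and (new_w, new_h).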
Handling Borders
What happens to output pixels that have no corresponding pixel in the source image, such as the empty corners around a rotated image? The borderMode and borderValue parameters control this.
- cv2.BORDER_CONSTANT: Fills the border with a constant color (specified by borderValue). This is the default and the most common.
- cv2.BORDER_REPLICATE: Replicates the edge pixel value (e.g., aaaaaa|abcdefgh|hhhhhhh).
- cv2.BORDER_REFLECT: Reflects the image like a mirror (e.g., fedcba|abcdefgh|hgfedcb).
# Example with a black border
img = cv2.imread('image.jpg')
M = cv2.getRotationMatrix2D((img.shape[1]//2, img.shape[0]//2), 45, 1.0)
# Calculate new size
h, w = img.shape[:2]
cos = abs(M[0, 0])
sin = abs(M[0, 1])
new_w = int((h * sin) + (w * cos))
new_h = int((h * cos) + (w * sin))
M[0, 2] += (new_w / 2) - (w / 2)
M[1, 2] += (new_h / 2) - (h / 2)
# Apply with a black border
rotated_with_border = cv2.warpAffine(img, M, (new_w, new_h), borderMode=cv2.BORDER_CONSTANT, borderValue=(0,0,0))
cv2.imshow('With Black Border', rotated_with_border)
cv2.waitKey(0)
cv2.destroyAllWindows()
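The border patterns are easier to see on a single row of numbers. The sketch below uses np.pad, whose "constant", "edge", and "symmetric" modes produce the same patterns that the OpenCV documentation lists for BORDER_CONSTANT, BORDER_REPLICATE, and BORDER_REFLECT. This is only a NumPy analogy for building intuition, not a call into OpenCV.

```python
import numpy as np

row = np.arange(1, 9)   # stands in for one row of pixels: 1 2 3 4 5 6 7 8

constant = np.pad(row, 3, mode="constant", constant_values=0)
replicate = np.pad(row, 3, mode="edge")
reflect = np.pad(row, 3, mode="symmetric")

print("constant: ", constant)   # 0 0 0 | 1 ... 8 | 0 0 0
print("replicate:", replicate)  # 1 1 1 | 1 ... 8 | 8 8 8
print("reflect:  ", reflect)    # 3 2 1 | 1 ... 8 | 8 7 6
```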
Summary
| Use Case | How to Get M | Key dsize Consideration |
|---|---|---|
| Translation | np.float32([[1, 0, tx], [0, 1, ty]]) | Same as input (w, h) |
| Rotation | cv2.getRotationMatrix2D(center, angle, scale) | Must be calculated to avoid cropping |
| Scaling | np.float32([[fx, 0, 0], [0, fy, 0]]) | (int(w*fx), int(h*fy)) |
| Combined | Extend each 2x3 matrix to 3x3 with a [0, 0, 1] row, multiply them (matrix product, not element-wise addition), and keep the top two rows | Must be calculated for the final result |
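Combining transformations means matrix multiplication in homogeneous coordinates. A minimal sketch, assuming each step is given as a 2x3 matrix (the helper names to_3x3 and compose are hypothetical):

```python
import numpy as np

def to_3x3(M):
    """Append the row [0, 0, 1] so 2x3 affine matrices can be multiplied."""
    return np.vstack([M, [0.0, 0.0, 1.0]])

def compose(*matrices):
    """Compose 2x3 affine matrices; the first argument is applied first."""
    total = np.eye(3)
    for M in matrices:
        total = to_3x3(M) @ total      # later steps multiply on the left
    return total[:2, :]                # back to the 2x3 form warpAffine expects

scale = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
translate = np.array([[1.0, 0.0, 10.0], [0.0, 1.0, 5.0]])

scale_then_translate = compose(scale, translate)
translate_then_scale = compose(translate, scale)
print(scale_then_translate)   # translation stays (10, 5)
print(translate_then_scale)   # translation is scaled to (20, 10)
```

Order matters: scaling after translating also scales the shift, which is why the two results differ.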
cv2.warpAffine is an incredibly powerful and flexible tool. Once you understand how the transformation matrix M works and how to generate it, you can perform complex geometric manipulations on your images.
