杰瑞科技汇

How Do You Implement Message Digests in Python?

Python's standard library includes the hashlib module, which provides robust and easy-to-use tools for creating message digests, more commonly known as hash digests or simply hashes.

A message digest is a fixed-size value (a "fingerprint" or "digital fingerprint"), usually shown as a hexadecimal string, that is computed from a piece of data such as a file or a string. Even a tiny change in the original data produces a completely different digest.
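As a small illustration of that last point, the sketch below hashes two inputs that differ in a single character and counts how many hexadecimal positions of the two digests differ. The second input string and the variable names are made up for this example.

```python
import hashlib

# Two inputs that differ only in their final character.
digest_a = hashlib.sha256(b"hello world").hexdigest()
digest_b = hashlib.sha256(b"hello worle").hexdigest()

# Count the hex positions at which the two digests disagree.
differing = sum(1 for x, y in zip(digest_a, digest_b) if x != y)

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
```

Both digests have the same length, yet they share no obvious resemblance, which is what makes a digest useful as a fingerprint.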

Key Concepts

  1. Algorithm: Python's hashlib module supports various hashing algorithms. The most common ones are:

    • md5: (Message-Digest algorithm 5) - Deprecated for security purposes. Fast, but vulnerable to collision attacks. Still useful for non-security tasks like checksumming files.
    • sha1: (Secure Hash Algorithm 1) - Deprecated for security purposes. More secure than MD5 but also considered broken for cryptographic uses.
    • sha256: (Secure Hash Algorithm with 256-bit digest) - The current standard for most general-purpose and security-sensitive applications. It's part of the SHA-2 family.
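    • sha512: (Secure Hash Algorithm with 512-bit digest) - Also part of the SHA-2 family. Produces a longer hash than SHA256 and so a larger security margin. Its relative speed depends on the platform: on 64-bit CPUs it is often comparable to SHA256 for large inputs, while hardware with dedicated SHA256 instructions tends to favor SHA256.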
  2. Hexadecimal Digest: The hash value is a sequence of bytes. To make it easy to store and transmit (e.g., in a database or a text file), it's almost always converted to a hexadecimal string representation.
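To make the difference between the raw bytes and the hexadecimal form concrete, here is a minimal sketch using the digest() and hexdigest() methods of a hashlib hash object. The variable names are chosen for this example only.

```python
import hashlib

h = hashlib.sha256(b"hello world")

raw = h.digest()          # bytes object: 32 bytes for SHA-256
hex_text = h.hexdigest()  # str: two hexadecimal characters per byte

print(len(raw), len(hex_text))

# The two forms carry the same information and convert losslessly.
print(bytes.fromhex(hex_text) == raw)
print(raw.hex() == hex_text)

# Names of the algorithms that hashlib guarantees on every platform.
print(sorted(hashlib.algorithms_guaranteed))
```

The hexadecimal form is twice as long as the raw bytes, but it is plain ASCII, which is why it is the usual choice for text files and database columns.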


How to Use hashlib: The Core Workflow

The process is straightforward:

  1. Import the hashlib module.
  2. Create a hash object by calling the desired algorithm's constructor (e.g., hashlib.sha256()).
  3. Update the hash object with your data. You can do this one chunk at a time, which is very memory-efficient for large files.
  4. Get the final hexadecimal digest by calling the .hexdigest() method.
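The four steps can be sketched as follows. The example also checks that feeding data in several update() calls gives the same digest as hashing the concatenated data in one go. The chunk values are arbitrary and chosen only for illustration.

```python
import hashlib

chunks = [b"hello", b" ", b"world"]

# Steps 2 and 3: create the hash object, then feed it one chunk at a time.
incremental = hashlib.sha256()
for chunk in chunks:
    incremental.update(chunk)

# The same data hashed in a single call, for comparison.
one_shot = hashlib.sha256(b"".join(chunks))

# Step 4: read the hexadecimal digests and compare them.
same = incremental.hexdigest() == one_shot.hexdigest()
print(incremental.hexdigest())
print(same)
```

Because repeated update() calls are equivalent to hashing the concatenation, a large input never has to be held in memory all at once.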

Example 1: Hashing a Simple String

This is the most basic example. We'll hash the string "hello world" using the popular sha256 algorithm.

import hashlib
# 1. The data we want to hash
data_string = "hello world"
# 2. Create a hash object for the sha256 algorithm
# hashlib operates on bytes, so the string must be encoded first (UTF-8 by default)
hash_object = hashlib.sha256(data_string.encode())
# 3. Get the hexadecimal digest of the hash
hex_dig = hash_object.hexdigest()
# Print the results
print(f"Original Data: {data_string}")
print(f"SHA256 Digest: {hex_dig}")

Output:

Original Data: hello world
SHA256 Digest: b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

Example 2: Hashing a File (Memory-Efficient Method)

When hashing large files, you shouldn't read the entire file into memory at once. The update() method is perfect for this, as it processes the data in chunks.

Let's assume you have a file named my_document.txt with the following content:

This is a test file for hashing.
It contains multiple lines.

Here is the Python code to hash it:

import hashlib
def hash_file(filename):
    """This function returns the SHA-256 hash of a file."""
    # Create a sha256 hash object
    sha256_hash = hashlib.sha256()
    # Open the file in binary read mode
    with open(filename, "rb") as f:
        # Read the file in 4 KiB blocks and feed each block to the hash object
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
    # Return the hexadecimal digest of the hash
    return sha256_hash.hexdigest()
# --- Usage ---
file_to_hash = "my_document.txt"
file_hash = hash_file(file_to_hash)
print(f"The SHA-256 digest of '{file_to_hash}' is:")
print(file_hash)

Output:

The SHA-256 digest of 'my_document.txt' is:
(a 64-character hexadecimal string; the exact value depends on the file's bytes, including its line endings)
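A common use of a file digest is checking a file against a published checksum. The sketch below is one possible way to do that, not the only one: the helper names sha256_of_file and matches_checksum, the 64 KiB chunk size, and the temporary file are all assumptions made for this example.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())

# Demo on a temporary file so the example is self-contained.
content = b"This is a test file for hashing.\nIt contains multiple lines.\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    tmp_path = tmp.name

try:
    expected = hashlib.sha256(content).hexdigest()
    ok = matches_checksum(tmp_path, expected)
    bad = matches_checksum(tmp_path, "0" * 64)
    print(ok, bad)
finally:
    os.remove(tmp_path)
```

On Python 3.11 and later, hashlib.file_digest() can replace the manual read loop. The loop above is kept so the sketch also runs on older versions (3.8 or newer, because of the := operator).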

Example 3: Comparing Different Algorithms

Let's see how the same data produces different digests with different algorithms.

import hashlib
data = "Python is fun!"
data_bytes = data.encode()
# Create hash objects for different algorithms
md5_hash = hashlib.md5(data_bytes)
sha1_hash = hashlib.sha1(data_bytes)
sha256_hash = hashlib.sha256(data_bytes)
sha512_hash = hashlib.sha512(data_bytes)
# Get the hexadecimal digests
md5_digest = md5_hash.hexdigest()
sha1_digest = sha1_hash.hexdigest()
sha256_digest = sha256_hash.hexdigest()
sha512_digest = sha512_hash.hexdigest()
print(f"Data: '{data}'\n")
print(f"MD5:    {md5_digest}")
print(f"SHA1:   {sha1_digest}")
print(f"SHA256: {sha256_digest}")
print(f"SHA512: {sha512_digest}")

Output:

Data: 'Python is fun!'

MD5:    (32 hexadecimal characters)
SHA1:   (40 hexadecimal characters)
SHA256: (64 hexadecimal characters)
SHA512: (128 hexadecimal characters)

Notice how both the length and the value of the digest change with the algorithm: MD5 yields 32 hexadecimal characters, SHA1 40, SHA256 64, and SHA512 128. A digest can therefore only be compared with another digest produced by the same algorithm.
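The digest lengths can be checked programmatically. This sketch uses hashlib.new() to build each hash object by name and reads its digest_size attribute. The lengths dictionary is introduced here only for illustration.

```python
import hashlib

data = b"Python is fun!"
lengths = {}

for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data)
    # digest_size is in bytes; the hex form uses two characters per byte.
    lengths[name] = (h.digest_size, len(h.hexdigest()))
    print(f"{name:<7} {h.digest_size:>3} bytes  {len(h.hexdigest()):>4} hex chars")
```

Running it shows 16, 20, 32, and 64 bytes respectively, which is 32, 40, 64, and 128 hexadecimal characters.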


Important Security Consideration: Salting for Passwords

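Never use a fast, unsalted hash like SHA256 directly for password storage. If two users choose the same password, they get the same hash, and precomputed "rainbow table" lookups can reverse common passwords. General-purpose hashes are also designed to be fast, which makes brute-force guessing cheap for an attacker.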

The solution is to use a salt: a random, unique value that is added to each password before hashing. This ensures that even identical passwords will have different hashes.

While you can implement salting manually, it's highly recommended to use a dedicated library like passlib or bcrypt for password hashing, as they handle salting, key derivation, and security best practices automatically.
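If adding a third-party dependency is not an option, the standard library also offers a key-derivation function, hashlib.pbkdf2_hmac(). The sketch below uses it in place of the libraries named above. The derive_key helper and the iteration count are assumptions for this example, so check current guidance before choosing a value for a real system.

```python
import hashlib
import os

# Illustrative iteration count (an assumption, not a recommendation).
ITERATIONS = 200_000

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

# The same password with two different random salts.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
key_a = derive_key("my_secret_password", salt_a)
key_b = derive_key("my_secret_password", salt_b)

print(key_a.hex())
print(key_b.hex())
print(key_a != key_b)  # different salts give different stored values
```

The many iterations make each guess deliberately expensive, which is the property a single SHA256 call lacks.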

The following conceptual example of manual salting illustrates the principle only; it is not suitable for production use:

import hashlib
import os # To generate a secure random salt
def hash_password(password: str) -> tuple[str, str]:
    """Hashes a password with a random salt."""
    # 1. Generate a random salt (16 bytes is a good size)
    salt = os.urandom(16)
    # 2. Hash the password and the salt together
    # It's crucial to encode the password and salt to bytes
    password_bytes = password.encode()
    salted_password = password_bytes + salt
    # A single SHA-256 pass is used here for illustration only; real systems need a slow key-derivation function
    hashed_password = hashlib.sha256(salted_password).hexdigest()
    # 3. Return the hash and the salt. YOU MUST STORE BOTH!
    return hashed_password, salt.hex()
# --- Usage ---
user_password = "my_secret_password"
stored_hash, stored_salt = hash_password(user_password)
print(f"Original Password: {user_password}")
print(f"Generated Salt: {stored_salt}")
print(f"Final Stored Hash: {stored_hash}")

When a user tries to log in, you would retrieve the stored salt, hash the password they provided with that same salt, and then compare the result with the stored hash.
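The login check described above might look like the following sketch. It repeats the conceptual salted SHA-256 scheme so that it runs on its own. The verify_password helper is hypothetical, and hmac.compare_digest() is used so that the comparison takes the same time wherever the first mismatch occurs.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> str:
    """Conceptual salted hash: SHA-256 over password bytes followed by the salt."""
    return hashlib.sha256(password.encode() + salt).hexdigest()

def verify_password(candidate: str, stored_salt_hex: str, stored_hash: str) -> bool:
    """Re-hash the candidate with the stored salt and compare the results."""
    salt = bytes.fromhex(stored_salt_hex)
    candidate_hash = hash_password(candidate, salt)
    return hmac.compare_digest(candidate_hash, stored_hash)

# Registration: store the salt (as hex) together with the hash.
salt = os.urandom(16)
stored_salt = salt.hex()
stored_hash = hash_password("my_secret_password", salt)

# Login attempts.
good = verify_password("my_secret_password", stored_salt, stored_hash)
bad = verify_password("wrong_password", stored_salt, stored_hash)
print(good, bad)
```

The salt is not a secret. It only needs to be unique per user and stored next to the hash so the same computation can be repeated at login.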
