How to Use decode and encode Correctly in Python - 杰瑞科技汇

This article breaks down Python's encode() and decode() methods, which are fundamental to working with text and bytes in Python.

The Core Problem: Text vs. Bytes

In Python, there are two main types for representing data:

str (String): A sequence of Unicode characters. This is how humans think about text. Python 3 strings are Unicode by default. Examples: "hello", "你好".
bytes: A sequence of raw 8-bit bytes (integers from 0 to 255). This is how computers actually store and transmit data. Examples: b'hello', b'\xe4\xbd\xa0\xe5\xa5\xbd'.

The Golden Rule:

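At the lowest level, only bytes objects can be written to a file or sent over a network. A file opened in text mode accepts str only because Python encodes it for you behind the scenes.
Text operations that depend on characters (like searching for a substring of text or counting characters) should be performed on str objects.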

encode() and decode() are the bridges between these two worlds.
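The difference between the two types can be seen in a small sketch. The variable names here are only illustrative and do not come from the article.

```python
text = "你好"
data = text.encode("utf-8")

# A str is measured in characters, a bytes object in 8-bit units.
char_count = len(text)
byte_count = len(data)

# Indexing a str gives a one-character str; indexing bytes gives an int (0-255).
first_char = text[0]
first_byte = data[0]

# Python 3 never converts between the two implicitly: mixing them is a TypeError.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(char_count, byte_count, first_char, first_byte, mixed_ok)
```

Because the two types never mix silently, every crossing between text and bytes has to go through an explicit encode() or decode() call.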

`encode()`: From String (`str`) to Bytes (`bytes`)

You use encode() when you have a string and you need to convert it into a sequence of bytes to save it to a file or send it over a network.

How it works:

my_string.encode(encoding)

my_string: The str object you want to convert.
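encoding: (Optional, but highly recommended) The character encoding to use (e.g., 'utf-8', 'ascii', 'latin-1'). In Python 3, str.encode() defaults to 'utf-8' when you omit it. Other APIs, such as open() in text mode, may fall back to a platform-dependent default, so stating the encoding explicitly keeps your intent clear and your code consistent. Always specify it!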
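As a rough illustration of how the choice of encoding changes the result, the sketch below encodes one string with the three encodings named above. The sample word "café" is an assumption chosen for the demonstration, not an example from the article.

```python
text = "café"

# UTF-8 stores the accented character as two bytes.
utf8_bytes = text.encode("utf-8")

# Latin-1 stores the same character as a single byte.
latin1_bytes = text.encode("latin-1")

# ASCII has no representation for it at all, so encoding fails.
try:
    ascii_bytes = text.encode("ascii")
except UnicodeEncodeError:
    ascii_bytes = None

print(utf8_bytes, latin1_bytes, ascii_bytes)
```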

Example: Encoding "hello"

my_text = "hello world"
# Encode the string into bytes using UTF-8 encoding
my_bytes = my_text.encode('utf-8')
print(f"Original type: {type(my_text)}")
print(f"Original string: {my_text}")
print(f"\nEncoded type: {type(my_bytes)}")
print(f"Encoded bytes: {my_bytes}")

Output:

Original type: <class 'str'>
Original string: hello world

Encoded type: <class 'bytes'>
Encoded bytes: b'hello world'

For simple ASCII characters, the byte representation looks very similar.

Example: Encoding "你好" (Non-ASCII Characters)

This is where encoding becomes critical.

my_text = "你好"
# Encode using UTF-8
utf8_bytes = my_text.encode('utf-8')
# Encode using GBK (another common encoding for Chinese)
gbk_bytes = my_text.encode('gbk')
print(f"Original string: {my_text}")
print(f"UTF-8 encoded bytes: {utf8_bytes}")
print(f"GBK encoded bytes:  {gbk_bytes}")

Output:

Original string: 你好
UTF-8 encoded bytes: b'\xe4\xbd\xa0\xe5\xa5\xbd'
GBK encoded bytes:  b'\xc4\xe3\xba\xc3'

Notice how the same text results in completely different byte sequences depending on the encoding. This is why specifying the correct encoding is so important!
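To see where the difference comes from, one option is to encode each character separately and count the bytes. This is only a sketch. The `sizes` dictionary and the hex formatting are illustrative choices.

```python
text = "你好"
sizes = {}

for codec in ("utf-8", "gbk"):
    # Number of bytes each character occupies under this codec.
    sizes[codec] = [len(ch.encode(codec)) for ch in text]
    # Show the full byte sequence as space-separated hex.
    hex_view = " ".join(f"{b:02x}" for b in text.encode(codec))
    print(codec, hex_view, sizes[codec])
```

Under UTF-8 each of these characters takes three bytes, while GBK uses two, which is why the two byte strings differ in both length and content.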

`decode()`: From Bytes (`bytes`) to String (`str`)

You use decode() when you receive a sequence of bytes (from a file, a network request, etc.) and you want to convert it into a human-readable string.

How it works:

my_bytes.decode(encoding)

my_bytes: The bytes object you want to convert.
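encoding: (Optional, but highly recommended) The character encoding that was used to create the bytes. It defaults to 'utf-8' if omitted. You must use the same encoding that was used for encode(). Otherwise you will get either a UnicodeDecodeError or, if the bytes happen to be valid in the wrong encoding, garbled text (called "mojibake").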
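Mojibake is easy to reproduce. The sketch below decodes UTF-8 bytes with 'latin-1', chosen here only because it maps every possible byte to some character and therefore never raises an error.

```python
original = "你好"
utf8_bytes = original.encode("utf-8")

# Wrong codec: no exception is raised, but the result is nonsense text.
garbled = utf8_bytes.decode("latin-1")

# Because latin-1 is lossless byte-for-byte, the mistake can be undone:
# re-encode with the wrong codec, then decode with the right one.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled), repaired)
```

This repair only works when the wrong codec preserved every byte. With most other codecs, information may already be lost by the time you notice the garbling.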

Example: Decoding the "hello" bytes

my_bytes = b'hello world'
# Decode the bytes back into a string
my_text = my_bytes.decode('utf-8')
print(f"Original type: {type(my_bytes)}")
print(f"Original bytes: {my_bytes}")
print(f"\nDecoded type: {type(my_text)}")
print(f"Decoded string: {my_text}")

Output:

Original type: <class 'bytes'>
Original bytes: b'hello world'

Decoded type: <class 'str'>
Decoded string: hello world

Example: Decoding "你好" bytes

This shows what happens when you use the wrong encoding.

# These bytes were created using UTF-8 encoding
utf8_bytes = b'\xe4\xbd\xa0\xe5\xa5\xbd'
# Correctly decode with UTF-8
correct_text = utf8_bytes.decode('utf-8')
# Incorrectly decode with ASCII (will cause an error)
try:
    incorrect_text = utf8_bytes.decode('ascii')
except UnicodeDecodeError as e:
    print(f"Error decoding with ASCII: {e}")
print(f"Original bytes: {utf8_bytes}")
print(f"Correctly decoded (UTF-8): {correct_text}")

Output:

Error decoding with ASCII: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
Original bytes: b'\xe4\xbd\xa0\xe5\xa5\xbd'
Correctly decoded (UTF-8): 你好

The ASCII codec failed because the byte 0xe4 is not a valid ASCII character. This is why you must know the encoding of the bytes you are trying to decode.
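When the true encoding is unknown and losing some characters is acceptable, decode() also takes an `errors` argument. The sketch below shows two of the standard handlers. The sample bytes are made up for illustration.

```python
data = b"hi \xe4\xbd\xa0"  # ASCII text followed by three non-ASCII bytes

# 'replace' substitutes U+FFFD (the replacement character) for each bad byte.
replaced = data.decode("ascii", errors="replace")

# 'ignore' silently drops bytes that cannot be decoded.
ignored = data.decode("ascii", errors="ignore")

print(repr(replaced), repr(ignored))
```

Both handlers discard information, so they are better suited to logging or display than to data that must survive a round trip.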

Practical Workflow: Reading from a File

This is a very common use case: you read raw bytes from a file and then decode them into a string.

# 1. Open the file in binary mode ('rb') to read bytes
# Let's assume 'my_chinese_file.txt' contains the text "你好世界"
try:
    with open('my_chinese_file.txt', 'rb') as f:
        file_bytes = f.read()
        print(f"Read from file (bytes): {file_bytes}")
    # 2. Decode the bytes into a string
    # We must know the encoding the file was saved with. Let's assume it's UTF-8.
    file_content = file_bytes.decode('utf-8')
    print(f"Decoded content (string): {file_content}")
    print(f"Type of content: {type(file_content)}")
except FileNotFoundError:
    print("File not found. Creating a dummy one for demonstration.")
    # Create a dummy file to make the example runnable
    with open('my_chinese_file.txt', 'w', encoding='utf-8') as f:
        f.write("你好世界")
    # The code above will now work on the next run.

Output (assuming the file was created with UTF-8):

Read from file (bytes): b'\xe4\xbd\xa0\xe5\xa5\xbd\xe4\xb8\x96\xe7\x95\x8c'
Decoded content (string): 你好世界
Type of content: <class 'str'>
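In practice you can often let open() do the decoding by opening the file in text mode with an explicit `encoding`. The sketch below compares both approaches. It writes to a temporary directory (an assumption made so the example runs anywhere) rather than to the working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_chinese_file.txt")

    # Text mode + encoding: Python encodes the str on write...
    with open(path, "w", encoding="utf-8") as f:
        f.write("你好世界")

    # ...and decodes the bytes on read, so you get a str directly.
    with open(path, "r", encoding="utf-8") as f:
        text_content = f.read()

    # Binary mode returns the raw bytes, which you decode yourself.
    with open(path, "rb") as f:
        raw_bytes = f.read()

print(text_content)
print(raw_bytes)
print(raw_bytes.decode("utf-8") == text_content)
```

Binary mode plus a manual decode() is mainly useful when the encoding is not known up front, for example when it has to be read from a header inside the data.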

Summary Table

Method	Purpose	Input Type	Output Type	Common Use Case
`encode()`	Convert text to bytes for storage/transmission.	`str`	`bytes`	Preparing a string to be written to a file or sent over a network.
`decode()`	Convert bytes to text for processing.	`bytes`	`str`	Reading data from a file or a network request to make it readable.

Key Takeaways

str is for text, bytes is for raw binary data.
encode() turns str -> bytes.
decode() turns bytes -> str.
ALWAYS specify the encoding (like 'utf-8') when using these methods. UTF-8 is the modern standard and a safe default for most cases.
If you get a UnicodeDecodeError, you are likely trying to decode bytes with the wrong encoding.
If you get a UnicodeEncodeError, you are trying to encode a character that is not supported by your chosen encoding (e.g., trying to encode Chinese characters with 'ascii').
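The last point can be sketched as follows, together with two of the standard `errors` handlers that encode() accepts when failing outright is not desirable. The variable names are illustrative.

```python
text = "你好"

# Default behaviour ('strict'): unsupported characters raise an exception.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes a question mark for each unsupported character.
replaced = text.encode("ascii", errors="replace")

# 'backslashreplace' keeps the information as Python-style escape sequences.
escaped = text.encode("ascii", errors="backslashreplace")

print(strict_failed, replaced, escaped)
```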
