python byte string-杰瑞科技汇

Of course! Let's dive deep into Python's byte strings. This is a fundamental topic for anyone working with data that isn't plain text, such as network protocols, file I/O, or cryptography.

What is a Byte String?

In Python, a byte string is a sequence of bytes. It's an immutable sequence of integers in the range 0 <= x < 256. The key takeaway is that bytes are raw, 8-bit values. They don't inherently have an encoding like text does.

Think of it like this:

str (String): A sequence of Unicode characters (e.g., 'H', 'e', 'l', 'l', 'o'). To store it or send it over a network, it must be encoded into bytes.
bytes (Byte String): A sequence of raw bytes (e.g., 72, 101, 108, 108, 111). It's the machine-readable representation of data.

Creating Byte Strings

There are several ways to create a bytes object.

a) From a Literal

You can create a byte string using a literal by prefixing a string with b. The characters you can use are limited to ASCII characters.

# Using a literal
data = b'hello world'
print(data)
# Output: b'hello world'
# Each element is an integer representing a byte
print(data[0])
# Output: 104  (which is 'h' in ASCII)
print(list(data))
# Output: [104, 101, 108, 108, 111]

Important: You cannot use non-ASCII characters directly in a byte literal.

# This will cause a SyntaxError
# data = b'café'

b) From an Iterable of Integers

You can create a byte string from any iterable of integers where each integer is between 0 and 255.

# From a list of integers
byte_data = bytes([65, 66, 67, 255])
print(byte_data)
# Output: b'ABC\xff'  # \xff is the hex representation for 255
# From a range
byte_range = bytes(range(65, 68)) # 65, 66, 67
print(byte_range)
# Output: b'ABC'

c) From an Existing String (Encoding)

This is the most common way you'll create byte strings: by encoding a str object. You must specify an encoding (like UTF-8, ASCII, etc.).

text_string = "Hello, 世界!" # A Unicode string
# Encode the string into bytes using UTF-8 encoding
utf8_bytes = text_string.encode('utf-8')
print(utf8_bytes)
# Output: b'Hello, \xe4\xb8\x96\xe7\x95\x8c!'
# Encode using a different encoding, like Latin-1 (ISO-8859-1)
latin1_bytes = text_string.encode('latin-1')
print(latin1_bytes)
# Output: b'Hello, \xa4\x96\x8c!'

Converting Back to Strings (Decoding)

To process a byte string as text, you must decode it into a str object. Again, you must know the correct encoding.

# The byte string from the previous example
utf8_bytes = b'Hello, \xe4\xb8\x96\xe7\x95\x8c!'
# Decode the bytes back into a string
text_string = utf8_bytes.decode('utf-8')
print(text_string)
# Output: Hello, 世界!
# Decoding with the wrong encoding leads to errors or "mojibake"
try:
    # This will fail because the bytes are not valid Latin-1
    wrong_decode = utf8_bytes.decode('latin-1')
except UnicodeDecodeError as e:
    print(f"Error: {e}")
    # Output: Error: 'utf-8' codec can't decode byte 0xe4 in position 7: invalid continuation byte

Key Operations and Methods

bytes objects are very similar to str objects. They support many of the same operations.

Slicing and Indexing

data = b'hello world'
print(data[1:4])
# Output: b'ell'
print(data[0])
# Output: 104

Concatenation and Repetition

b1 = b'hello '
b2 = b'world'
print(b1 + b2)
# Output: b'hello world'
print(b1 * 2)
# Output: b'hello hello '

Common Methods

data = b'  Hello World  '
# Find a byte (by integer or ASCII character)
print(data.find(b'W')) # Find the byte for 'W'
# Output: 7
# Startswith and Endswith
print(data.startswith(b'Hello'))
# Output: False (because of the leading spaces)
print(data.endswith(b'World'))
# Output: False (because of the trailing spaces)
# Strip, Lower, Upper
print(data.strip())
# Output: b'Hello World'
print(data.lower())
# Output: b'  hello world  '
# Count occurrences
print(data.count(b'l'))
# Output: 3

Mutable Counterpart: `bytearray`

Sometimes you need a mutable sequence of bytes. For this, Python provides the bytearray type. It has almost all the methods of bytes but also allows in-place modifications.

# Create a bytearray
ba = bytearray(b'hello')
print(ba)
# Output: bytearray(b'hello')
# Change an item in place
ba[0] = 74 # J in ASCII
print(ba)
# Output: bytearray(b'jello')
# Append a new byte
ba.append(33) # ! in ASCII
print(ba)
# Output: bytearray(b'jello!')
# You can also slice and assign
ba[1:3] = b'ea'
print(ba)
# Output: bytearray(b'jeallo!')

Practical Use Cases

Understanding when to use bytes is crucial.

Network Programming

When you send data over a network (e.g., using the socket library), you are sending raw bytes.

# This is a conceptual example
import socket
host = 'example.com'
port = 80
# Create a socket
# (In a real app, you'd use 'with' for proper cleanup)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# The HTTP request must be sent as bytes
# Note the '\r\n' line endings are part of the HTTP protocol
http_request = (
    f"GET / HTTP/1.1\r\n"
    f"Host: {host}\r\n"
    f"Connection: close\r\n"
    f"\r\n"
).encode('utf-8')
s.connect((host, port))
s.sendall(http_request)
# The response you receive will also be in bytes
response = s.recv(4096)
print(response) # This is a bytes object
s.close()

File I/O (Binary Mode)

When reading or writing non-text files (images, videos, executables, etc.), you must open the file in binary mode ('rb' for read binary, 'wb' for write binary).

# Writing to a binary file
data_to_write = b'\x89PNG\r\n\x1a\n' # The first few bytes of a PNG file
with open('image.png', 'wb') as f:
    f.write(data_to_write)
# Reading from a binary file
with open('image.png', 'rb') as f:
    read_data = f.read()
    print(read_data)
    # Output: b'\x89PNG\r\n\x1a\n'

Cryptography and Hashing

Cryptographic libraries work with raw binary data.

import hashlib
message = "A secret message".encode('utf-8') # Must be bytes
sha256_hash = hashlib.sha256(message).hexdigest()
print(f"SHA-256 Hash: {sha256_hash}")
# Output: SHA-256 Hash: 3a6eb0790f39ac87c94f3856b2dd2c5d110e6811602261a9a923d3bb231658b

Summary Table: `str` vs. `bytes`

Feature	`str` (String)	`bytes` (Byte String)
Purpose	Human-readable text	Raw binary data (machine-readable)
Content	Unicode characters	Integers from 0 to 255
Encoding	Has an encoding (e.g., UTF-8)	No inherent encoding. It is the encoded form.
Creation	`'hello'`, `"world"`	`b'hello'`, `bytes([72, 101])`, `'hello'.encode('utf-8')`
Conversion	`.encode(encoding)`	`.decode(encoding)`
Mutability	Immutable	Immutable (but `bytearray` is a mutable alternative)
Use Cases	Displaying text, user input	Network, files, cryptography, images

python byte string

What is a Byte String?

Creating Byte Strings

a) From a Literal

b) From an Iterable of Integers

c) From an Existing String (Encoding)

Converting Back to Strings (Decoding)

Key Operations and Methods

Slicing and Indexing

Concatenation and Repetition

Common Methods

Mutable Counterpart: `bytearray`

Practical Use Cases

Network Programming

File I/O (Binary Mode)

Cryptography and Hashing

Summary Table: `str` vs. `bytes`

99ANYc3cd6

python str unicode

Sublime Mac下Python环境怎么配置？

python unicode str

CorelDRAW官方教程怎么学？

Java可变与不可变对象，如何选择？

Java Socket编程实例具体怎么实现？

OpenStack视频教程适合零基础入门吗？

java join array

信息安全等级测评师培训教程怎么学？

JSP中Java的import该怎么用？

Python datetime时间戳如何转换与计算？

Linux下如何安装mysql-python？

java 正则表达式身份证号

primer5引物设计步骤有哪些？

Python sort()中lambda如何自定义排序规则？

Python collections模块有哪些核心功能？

python byte string

What is a Byte String?

Creating Byte Strings

a) From a Literal

b) From an Iterable of Integers

c) From an Existing String (Encoding)

Converting Back to Strings (Decoding)

Key Operations and Methods

Slicing and Indexing

Concatenation and Repetition

Common Methods

Mutable Counterpart: bytearray

Practical Use Cases

Network Programming

File I/O (Binary Mode)

Cryptography and Hashing

Summary Table: str vs. bytes

相关推荐

Java Socket编程实例具体怎么实现？

Mutable Counterpart: `bytearray`

Summary Table: `str` vs. `bytes`