杰瑞科技汇

How do you set headers on a Python opener?

The term "Python opener header" can be interpreted in a few ways, but it most commonly refers to setting custom HTTP headers when making web requests.


I'll cover three common ways to do this in Python, two from the standard library and one third-party package:

  1. urllib.request (Built-in, no installation needed)
  2. requests (The de-facto standard, highly recommended)
  3. http.client (Low-level, more complex)

urllib.request (Standard Library)

This is the built-in, no-frills way to make HTTP requests. You add headers to a Request object before opening it.

Key Concept:

You create a Request object, pass your headers to it, and then pass that object to urlopen().
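
Before sending anything, it can help to inspect what a Request object will carry. The following is a minimal sketch that makes no network call; the URL and header values are the same illustrative ones used throughout this article. One detail worth knowing is that Request stores header names in capitalized form (for example 'User-agent'), so lookups with get_header() or has_header() need that spelling.

```python
import urllib.request

# Headers can be passed to the constructor...
req = urllib.request.Request(
    'https://httpbin.org/get',
    headers={'User-Agent': 'MyCoolPythonScript/1.0'},
)
# ...or added afterwards, one at a time
req.add_header('X-Custom-Header', 'This is some custom data')

# Header names are stored via str.capitalize(), e.g. 'User-agent'
print(req.header_items())
print(req.get_header('User-agent'))    # found
print(req.get_header('User-Agent'))    # None: the stored key is 'User-agent'
```

The capitalization only affects how you look headers up on the Request object; HTTP header names are case-insensitive on the wire.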

Example:

Let's add a User-Agent header (to identify our script to the server) and a custom X-Custom-Header.

import urllib.error
import urllib.request
import json
# The URL you want to send a request to
url = 'https://httpbin.org/get'  # This is a great testing service
# Create a dictionary of headers
# It's good practice to set a User-Agent
headers = {
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
    'Accept': 'application/json' # We want to receive JSON data
}
# Create a Request object with the URL and headers
req = urllib.request.Request(url, headers=headers)
try:
    # Open the request and read the response
    with urllib.request.urlopen(req) as response:
        # response.read() returns bytes, so we decode it to a string
        response_body = response.read().decode('utf-8')
        # Since we asked for JSON, let's parse it
        data = json.loads(response_body)
        # Print the headers that the server received
        print("Headers sent to the server:")
        print(json.dumps(data['headers'], indent=2))
except urllib.error.URLError as e:
    print(f"An error occurred: {e.reason}")
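
If "opener" is meant literally, urllib.request also lets you attach default headers to an opener object instead of to each Request. The sketch below assumes that reading of the question; the helper name build_opener_with_headers is hypothetical, while build_opener() and the addheaders attribute belong to the standard library. Nothing is sent over the network here.

```python
import urllib.request

def build_opener_with_headers(headers):
    """Hypothetical helper: return an opener that sends `headers` by default."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples. Assigning to it replaces
    # the default entry, which only sets a Python-urllib User-agent.
    opener.addheaders = list(headers.items())
    return opener

opener = build_opener_with_headers({
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
})
print(opener.addheaders)
```

Calling opener.open(url) would then send these headers with every request, and urllib.request.install_opener(opener) makes plain urlopen() use the same opener. A header set on an individual Request takes precedence over an opener default of the same name.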

requests Library (Highly Recommended)

The requests library is the community favorite because it wraps connection handling, response decoding, and JSON parsing in a compact API. The syntax is much cleaner and more intuitive.

Key Concept:

You pass a dictionary of headers directly to the headers parameter in any requests function (get, post, etc.).

Installation:

First, you need to install it:

pip install requests

Example:

Let's do the same thing as the urllib example: add a User-Agent and a custom X-Custom-Header.

import requests
import json
# The URL you want to send a request to
url = 'https://httpbin.org/get'
# Create a dictionary of headers
headers = {
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
    'Accept': 'application/json'
}
try:
    # Make the GET request, passing the headers dictionary
    # A timeout keeps the script from hanging if the server never answers
    response = requests.get(url, headers=headers, timeout=10)
    # Raise an exception if the request was unsuccessful (e.g., 404, 500)
    response.raise_for_status()
    # The response.json() method automatically parses the JSON response
    data = response.json()
    # Print the headers that the server received
    print("Headers sent to the server:")
    print(json.dumps(data['headers'], indent=2))
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

As you can see, the requests version is more readable and handles many things (like JSON parsing) automatically.
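
When the same headers should go out with many requests, a requests.Session can hold them as defaults. The sketch below is one way to see how session-level and per-request headers combine; it uses prepare_request() so that nothing is actually sent, and the header values are the same illustrative ones as above.

```python
import requests

session = requests.Session()
# Headers stored on the session are sent with every request made through it
session.headers.update({
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
    'Accept': 'application/json',
})

# A per-request header overrides the session default of the same name
request = requests.Request(
    'GET',
    'https://httpbin.org/get',
    headers={'Accept': 'text/html'},
)

# prepare_request() merges session and request headers without sending anything
prepared = session.prepare_request(request)
print(dict(prepared.headers))
```

In everyday use you would skip the manual preparation and call session.get(url) directly; the merge shown here happens behind the scenes.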


http.client (Low-Level)

This is the underlying library that urllib uses. It's more complex and verbose, but gives you fine-grained control. It's generally not needed unless you have very specific requirements.

Key Concept:

You create a connection object, then use .putrequest() to start the request, .putheader() to add each header, and then .endheaders() to finalize the header block before sending the body.

Example:

This example demonstrates how to add headers manually using the http.client module.

import http.client
import json
# The URL and path
host = 'httpbin.org'
path = '/get'
# Create a connection object (use HTTPSConnection for secure sites).
# By default it verifies the server's SSL certificate, which is what you want.
conn = http.client.HTTPSConnection(host)
# Create a dictionary of headers
headers = {
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
    'Accept': 'application/json'
}
try:
    # Start the request. We are sending a GET request.
    conn.putrequest('GET', path)
    # Add each header to the request
    for key, value in headers.items():
        conn.putheader(key, value)
    # This is required to signal that headers are done
    conn.endheaders()
    # Get the response from the server
    response = conn.getresponse()
    print(f"Response status: {response.status} {response.reason}")
    # Read the response data
    response_data = response.read().decode('utf-8')
    # Parse the JSON
    data = json.loads(response_data)
    # Print the headers that the server received
    print("\nHeaders sent to the server:")
    print(json.dumps(data['headers'], indent=2))
except (http.client.HTTPException, OSError) as e:
    print(f"An HTTP error occurred: {e}")
finally:
    # Always close the connection
    conn.close()
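
The putrequest() / putheader() / endheaders() sequence can usually be collapsed into a single request() call that accepts a headers dictionary. The sketch below shows that shortcut. So that it runs without internet access, it talks to a throwaway local server that echoes the received headers back as JSON; the EchoHeaders handler and the fetch_echoed_headers helper are hypothetical names made up for this illustration, not part of http.client.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHeaders(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the request headers as JSON."""

    def do_GET(self):
        body = json.dumps(dict(self.headers.items())).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

def fetch_echoed_headers(headers):
    # Port 0 asks the operating system for any free port
    server = HTTPServer(('127.0.0.1', 0), EchoHeaders)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection('127.0.0.1', server.server_address[1])
    try:
        # request() sends the request line and all headers in one call
        conn.request('GET', '/get', headers=headers)
        response = conn.getresponse()
        return json.loads(response.read().decode('utf-8'))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()

received = fetch_echoed_headers({
    'User-Agent': 'MyCoolPythonScript/1.0',
    'X-Custom-Header': 'This is some custom data',
    'Accept': 'application/json',
})
print(json.dumps(received, indent=2))
```

Against a real HTTPS site you would swap HTTPConnection for HTTPSConnection and drop the local server. The echoed output also includes headers such as Host that http.client adds on its own.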

Summary: Which one should you use?

urllib.request
  Pros:
    - Built-in (no installation)
    - Good for simple scripts
  Cons:
    - Verbose syntax
    - Less intuitive
  Best for: Quick scripts, environments where you can't install packages, or for learning the fundamentals.

requests
  Pros:
    - Highly recommended
    - Simple, clean, "Pythonic" API
    - Automatic JSON handling
    - Better session management
  Cons:
    - Requires an external installation (pip install requests)
  Best for: Almost all use cases. Web scraping, APIs, any general-purpose HTTP interaction.

http.client
  Pros:
    - Fine-grained control
    - No external dependencies
    - Powerful for complex protocols
  Cons:
    - Very verbose and low-level
    - Error-prone manual work
  Best for: Advanced use cases, custom protocols, or when you need to bypass abstractions in requests/urllib.