杰瑞科技汇

How do you get the request URL in Python?

The term "requesturl" isn't a standard Python library or attribute. It most likely refers to response.url, the .url attribute of the Response object returned by the requests library.

The requests library is the de facto standard for making HTTP requests in Python. When you make a request, the requests library returns a Response object, and this object has an attribute called .url.

This attribute holds the URL of the final response, after any redirects have been followed, which makes it the simplest way to see where a redirect chain ended.

Let's break it down.


The Core Concept: requests.Response.url

When you use the requests library, you send a request to a URL. The server might respond with a "redirect" (e.g., HTTP 301 or 302) pointing to a new URL. By default, requests follows these redirects automatically (for every method except HEAD, where allow_redirects defaults to False).

The .url attribute of the Response object shows you the URL of the page you ultimately landed on, not the one you initially asked for.

Why is this useful?

  • Following Redirects: You can verify if a URL redirected you to a different page (e.g., from http to https, or from a short link to its final destination).
  • Debugging: If a request fails, checking the final URL can help you understand why (e.g., you were redirected to a login page or an error page).
  • Scraping: Ensures you are parsing the content from the correct final page.
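To make the first bullet concrete, here is a minimal sketch that reports which parts of a URL changed between the address you asked for and the one you landed on. The helper name describe_url_change is hypothetical (it is not part of requests), and it uses only the standard library, so it works on plain strings such as an initial URL and response.url.

```python
from urllib.parse import urlsplit


def describe_url_change(initial_url, final_url):
    """Return the names of the URL components that differ between two URLs."""
    a, b = urlsplit(initial_url), urlsplit(final_url)
    changes = []
    if a.scheme != b.scheme:
        changes.append("scheme")
    # .hostname is already lowercased by urlsplit
    if (a.hostname or "") != (b.hostname or ""):
        changes.append("host")
    # Treat an empty path and "/" as the same location
    if (a.path or "/") != (b.path or "/"):
        changes.append("path")
    if a.query != b.query:
        changes.append("query")
    return changes


print(describe_url_change("http://python.org", "https://www.python.org/"))
print(describe_url_change("https://example.com", "https://example.com/"))
```

The second call prints an empty list: a missing trailing slash alone is not treated as a change, which matters because requests adds the slash itself when it prepares a request.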

Step-by-Step Example

First, you need to install the requests library if you haven't already:

pip install requests

Now, let's see it in action. We'll use a URL that is known to redirect.

import requests
# The initial URL we want to request.
# This URL redirects to the official Python website.
initial_url = "http://python.org"
print(f"Initial URL: {initial_url}")
try:
    # Make a GET request. allow_redirects=True is the default for GET.
    # A timeout stops the call from hanging forever if the server never answers.
    response = requests.get(initial_url, timeout=10)
    # Raise requests.exceptions.HTTPError if the final status code is 4xx or 5xx
    response.raise_for_status()
    # --- The key part ---
    # Get the final URL from the response object
    final_url = response.url
    print(f"Final URL after redirects: {final_url}")
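    # response.history is non-empty only when at least one redirect was followed.
    # Comparing the two URL strings is less reliable: requests normalizes
    # "http://example.com" to "http://example.com/", so they can differ without a redirect.
    if response.history:
        print("\nRedirect occurred!")
        print(f"The initial URL '{initial_url}' redirected to '{final_url}'.")
    else:
        print("\nNo redirect occurred.")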
    # You can also access other useful information from the response
    print(f"\nStatus Code: {response.status_code}")
    print(f"Response Headers (Location): {response.headers.get('Location', 'No redirect header found')}")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

Expected output (the exact URLs depend on the server's current redirect rules):

Initial URL: http://python.org
Final URL after redirects: https://www.python.org/

Redirect occurred!
The initial URL 'http://python.org' redirected to 'https://www.python.org/'.

Status Code: 200
Response Headers (Location): No redirect header found

Notice that the Location header isn't present in the final response. It belongs to the intermediate redirect responses, which requests has already followed and stored in response.history. The .url attribute is the simplest way to see the final destination.

Other Important URL-Related Attributes in requests

The Response object has other attributes that are also very useful for working with URLs:

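Attribute | Description | Example
response.url | The final URL of the response, after all redirects. | 'https://www.python.org/'
response.history | A list of Response objects from the intermediate redirects, oldest first. | [<Response [301]>, <Response [302]>]
response.request.url | The URL of the request that produced this response. After redirects this is the last request sent, so it normally matches response.url. The originally requested URL is response.history[0].url. | 'https://www.python.org/'
response.request.headers | The headers sent with the request that produced this response. | {'User-Agent': '...', ...}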

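Example using history and request.url:

import requests

url = "http://github.com"
response = requests.get(url, timeout=10)

# response.request is the request that produced the final response, so after
# redirects its URL matches response.url. The URL originally requested is on
# the first entry of response.history.
original_url = response.history[0].url if response.history else response.url
print(f"Original Request URL: {original_url}")
print(f"Final Response URL:   {response.url}")
print(f"Final Request URL:    {response.request.url}")
print("\n--- Redirect History ---")
if response.history:
    for i, resp in enumerate(response.history):
        print(f"{i+1}. Redirected from: {resp.url} (Status: {resp.status_code})")
else:
    print("No redirects occurred.")

Expected output (the exact chain depends on the server's current redirect rules; note the trailing slash that requests adds when it prepares the request):

Original Request URL: http://github.com/
Final Response URL:   https://github.com/
Final Request URL:    https://github.com/

--- Redirect History ---
1. Redirected from: http://github.com/ (Status: 301)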
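
The same information can be wrapped in two small helpers. This is a sketch rather than anything requests provides: redirect_chain and original_url are hypothetical names, and the SimpleNamespace objects are stand-ins that mimic only the three attributes used here (status_code, url, history), so the example runs without network access.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, ending with the final response."""
    hops = [*response.history, response]
    return [(hop.status_code, hop.url) for hop in hops]


def original_url(response):
    """URL of the first request sent, before any redirect was followed."""
    return response.history[0].url if response.history else response.url


# Stand-in objects (assumed values) shaped like a one-hop redirect
first_hop = SimpleNamespace(status_code=301, url="http://github.com/", history=[])
final = SimpleNamespace(status_code=200, url="https://github.com/", history=[first_hop])

print(original_url(final))
for status, url in redirect_chain(final):
    print(status, url)
```

With a real request, you would pass the object returned by requests.get(url, timeout=10) instead of the stand-ins.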

How to Construct a URL (The Opposite Direction)

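Sometimes you have a base URL and a relative path and need to build a full URL. Use urljoin from Python's built-in urllib.parse module rather than string concatenation: it resolves the relative reference against the base, including slashes and ".." segments. Note that urljoin does not percent-encode anything, so query strings containing special characters still need to be encoded separately (for example with urllib.parse.urlencode).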

from urllib.parse import urljoin
base_url = "https://www.example.com/path/to/page/"
relative_path = "subfolder/resource.html"
query_params = "?key=value&another=123"
# Construct the full URL safely
full_url = urljoin(base_url, relative_path + query_params)
print(f"Base URL:     {base_url}")
print(f"Path to join: {relative_path + query_params}")
print(f"Full URL:     {full_url}")
# Example with a path that goes "up" the directory
another_path = "../other-resource.html"
full_url_2 = urljoin(base_url, another_path)
print(f"\nAnother Full URL: {full_url_2}")

Expected Output:

Base URL:     https://www.example.com/path/to/page/
Path to join: subfolder/resource.html?key=value&another=123
Full URL:     https://www.example.com/path/to/page/subfolder/resource.html?key=value&another=123

Another Full URL: https://www.example.com/path/to/other-resource.html
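
When the query parameters come from variables, a safer variant is to let urllib.parse.urlencode build the query string instead of writing it by hand. The sketch below assumes the relative path carries no query string of its own; build_url is a hypothetical helper name, not a standard-library function.

```python
from urllib.parse import urlencode, urljoin


def build_url(base_url, relative_path, params=None):
    """Resolve relative_path against base_url and append encoded query parameters."""
    url = urljoin(base_url, relative_path)
    if params:
        # urlencode percent-encodes keys and values (spaces become "+")
        url = f"{url}?{urlencode(params)}"
    return url


print(build_url("https://www.example.com/path/to/page/",
                "subfolder/resource.html",
                {"key": "value", "another": 123}))
# A value with a space and an ampersand is encoded rather than breaking the query
print(build_url("https://www.example.com/", "search", {"q": "a b&c"}))
```

If the URL is only going to be passed to requests, the params argument of requests.get does the same encoding for you, and response.url then shows the full URL that was sent.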

Summary

Task | Tool/Method | Example
Get the final URL after a request | response.url | final_url = requests.get(url).url
See the redirect history | response.history | for resp in response.history: ...
Get the original URL requested | response.history[0].url (when a redirect occurred) | original_url = response.history[0].url if response.history else response.url
Construct a URL from parts | urllib.parse.urljoin | full_url = urljoin(base, path)