Of course! When you ask about the "status" in Python's urllib library, you're almost certainly referring to the HTTP status code. This is a three-digit number returned by a web server that indicates the result of an HTTP request (e.g., 200 for OK, 404 for Not Found).

Here's a complete guide on how to get and use HTTP status codes with urllib, broken down by the library's modules.
The Core Concept: The status Attribute
When you make a request with urllib, the response object you get back has a status attribute. This attribute holds the integer HTTP status code.
Using urllib.request (The Modern, Recommended Approach)
This is the standard way to make HTTP requests in Python 3. You use urllib.request.urlopen() to get a response object, and then you check its .status attribute.
Basic Example: Getting the Status Code
This example demonstrates the most common use case: fetching a URL and checking if the request was successful.

    import urllib.request
    import urllib.error

    # The URL we want to check
    url = "https://www.example.com"

    try:
        # urlopen returns a response object
        with urllib.request.urlopen(url) as response:
            # The status code is an attribute of the response object
            status_code = response.status
            print(f"URL: {url}")
            print(f"Status Code: {status_code}")
            print(f"Reason: {response.reason}")  # e.g., 'OK'
            print(f"Headers: {response.headers}")
    except urllib.error.HTTPError as e:
        # This block catches HTTP errors (like 404, 500, etc.)
        print("Error: The server could not fulfill the request.")
        print(f"Error Code: {e.code}")
        print(f"Error Reason: {e.reason}")
        # You can also get headers from the error response
        print(f"Error Headers: {e.headers}")
    except urllib.error.URLError as e:
        # This block catches other URL-related errors (like no internet, DNS failure)
        print("Error: Failed to reach a server.")
        print(f"Reason: {e.reason}")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")

Output for https://www.example.com:

    URL: https://www.example.com
    Status Code: 200
    Reason: OK
    Headers: Content-Type: text/html; charset=UTF-8
    Server: ECS (dcb/7F83)
    ...

Output for a non-existent URL (e.g., https://www.example.com/nonexistent-page):

    Error: The server could not fulfill the request.
    Error Code: 404
    Error Reason: Not Found
    Error Headers: ...

Key Components Explained:
- urllib.request.urlopen(url): Opens the URL and returns a file-like object (http.client.HTTPResponse).
- response.status: The integer status code (e.g., 200).
- response.reason: The human-readable reason phrase for the status code (e.g., 'OK', 'Not Found').
- response.headers: A dictionary-like object containing the response headers.
- try...except block: This is crucial for robust code. If the server returns an error status code (like 404 or 500), urlopen raises an HTTPError. If there's a network problem (like no connection), it raises a URLError.
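
The components described above can be combined into one small helper. The following is a minimal sketch, not a definitive implementation: the function name get_status and its opener parameter are hypothetical, and opener exists only so the function can be exercised without a network connection (it defaults to urllib.request.urlopen).

```python
import urllib.error
import urllib.request


def get_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return (status_code, reason), or (None, reason) if no HTTP response arrived.

    `get_status` and `opener` are hypothetical names used for illustration.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # Success path: the response object carries the status code.
            return response.status, response.reason
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return e.code, e.reason
    except urllib.error.URLError as e:
        # No HTTP response at all (DNS failure, refused connection, ...).
        return None, str(e.reason)
```

With a working connection, get_status("https://www.example.com") would be expected to return a pair such as (200, 'OK'), while a missing page would yield 404 through the HTTPError branch.
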
Using urllib.error
You don't use urllib.error directly to make requests, but you use it to handle the exceptions that urllib.request can raise.
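- urllib.error.HTTPError: Raised when the server returns an HTTP error code (typically 4xx or 5xx). As shown above, this exception object has its own .code, .reason, and .headers attributes, which are very useful. It is a subclass of URLError, so its except clause must come before the one for URLError.
- urllib.error.URLError: Raised for more general, non-HTTP-related errors where no HTTP response is obtained at all, such as the network being down, a failed DNS lookup, or a refused connection.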
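
To make the relationship between the two exception classes concrete, here is a small offline sketch. It builds an HTTPError by hand instead of making a request, so the URL, header, and body values are made up purely for illustration.

```python
import email.message
import io
import urllib.error

# HTTPError derives from URLError, which is why it has to be caught first.
is_subclass = issubclass(urllib.error.HTTPError, urllib.error.URLError)
print(is_subclass)  # True

# Hand-built error object; all values below are illustrative assumptions.
headers = email.message.Message()
headers["Content-Type"] = "text/plain"
error = urllib.error.HTTPError(
    "https://www.example.com/nonexistent-page",  # url
    404,                                          # code
    "Not Found",                                  # msg, exposed as .reason
    headers,                                      # hdrs, exposed as .headers
    io.BytesIO(b"page not found"),                # fp, the error response body
)

# An HTTPError can be inspected much like a normal response object.
body = error.read().decode("utf-8")
print(error.code)                     # 404
print(error.reason)                   # Not Found
print(error.headers["Content-Type"])  # text/plain
print(body)                           # page not found
```

Because the error object is file-like, the body a server sends along with a 4xx or 5xx status can be read from it in the same way as from a successful response.
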
Using urllib.parse (For Building URLs)
While not directly related to the status of a response, urllib.parse is often used alongside urllib.request to construct valid URLs, especially when adding query parameters.

    from urllib.parse import urlencode
    from urllib.request import urlopen

    base_url = "https://httpbin.org/get"
    params = {'key1': 'value1', 'key2': 'value2'}

    # Encode the parameters into a query string
    # 'key1=value1&key2=value2'
    query_string = urlencode(params)

    # Combine the base URL and the query string
    full_url = f"{base_url}?{query_string}"
    print(f"Requesting URL: {full_url}")

    with urlopen(full_url) as response:
        print(f"Status Code: {response.status}")
        # The response body will show the URL and params the server received
        print(response.read().decode('utf-8'))

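
The value of urlencode is clearer when the parameters contain characters that are not safe in a URL. The sketch below makes no request; the parameter names and values are made up for illustration. It encodes a query string and then parses it back to confirm that nothing was lost.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "https://httpbin.org/get"
# Hypothetical parameters containing spaces and an ampersand.
params = {"q": "status codes & urllib", "page": 2}

# urlencode escapes unsafe characters and converts non-string values to text.
query_string = urlencode(params)
full_url = f"{base_url}?{query_string}"
print(full_url)
# https://httpbin.org/get?q=status+codes+%26+urllib&page=2

# Parsing the URL again recovers the original values (always as lists of strings).
parsed = parse_qs(urlsplit(full_url).query)
print(parsed)
# {'q': ['status codes & urllib'], 'page': ['2']}
```

Building the query string by hand with string concatenation would leave the literal & in place, and the server would then see it as a parameter separator.
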
Common HTTP Status Codes
Here's a quick reference for the status codes you'll encounter most often:
| Code | Name | Meaning |
|---|---|---|
| 200 | OK | The request was successful. This is what you want to see. |
| 301 | Moved Permanently | The resource has a new, permanent URL. Browsers, and urlopen by default, follow this redirect automatically. |
| 302 | Found | The resource is temporarily available at a different URL. Followed automatically in the same way as 301. |
| 401 | Unauthorized | Authentication is required. You need to log in. |
| 403 | Forbidden | You do not have permission to access the resource, even if you are authenticated. |
| 404 | Not Found | The resource you requested does not exist on the server. |
| 500 | Internal Server Error | A generic server error. Something went wrong on the server side. |
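
The names in this table do not have to be hard-coded: the standard library's http.HTTPStatus enum carries the reason phrase for each registered code. The following is a minimal sketch, where describe is a hypothetical helper name and the category labels are this example's own wording for the 1xx to 5xx ranges.

```python
from http import HTTPStatus

# Labels for the first digit of a status code (wording chosen for this sketch).
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return 'code phrase (category)'; `describe` is a hypothetical helper."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not a registered HTTP status.
        phrase = "Unknown"
    category = CATEGORIES.get(code // 100, "unknown")
    return f"{code} {phrase} ({category})"


for code in (200, 301, 302, 401, 403, 404, 500):
    print(describe(code))
```

The same helper works on both response.status and e.code, since each is a plain integer.
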
Summary and Best Practices
- Use urllib.request for making requests.
- Always use a try...except block to handle HTTPError and URLError.
- Check response.status inside the try block for successful requests (typically status 200).
- Check e.code inside the except urllib.error.HTTPError as e: block for error codes.
- For complex applications, consider using the more powerful and user-friendly requests library, which simplifies many of these tasks. However, urllib is great for simple scripts and is part of Python's standard library.
