Of course! Let's dive into reading data with Python's httplib module.

First, an important point: `httplib` was renamed to `http.client` in Python 3. The API is largely the same, but in Python 3 the response body is returned as `bytes` rather than `str`. The examples below use `http.client`; on Python 2, substitute `httplib` and adapt Python 3-only syntax such as f-strings.

The core idea of reading data from an HTTP response involves a two-step process:

- Send a Request: You create a connection to a server and send an HTTP request (e.g., a GET request for a webpage).
- Read the Response: The server sends back an HTTP response. This response has two main parts:
  - Status and headers: The status code plus metadata about the response (e.g., content type, content length).
  - Body: The actual data you requested (e.g., the HTML of a page, JSON data, an image file).

The body arrives as a stream of bytes, which you read from the response object.

Key Methods for Reading the Response Body
Once you have a response object from a call like conn.getresponse(), you can use several methods to read its body:

| Method | Description | When to Use |
|---|---|---|
| `response.read()` | Reads the entire response body into memory and returns it as a single `bytes` object. | Simple, small responses like JSON or a short HTML page. Be careful with large files! |
| `response.read(size)` | Reads and returns up to the next `size` bytes from the response body. | For streaming large files (images, videos, large logs) to avoid loading everything into RAM at once. |
| `response.readline()` | Reads one line from the response body (up to and including the newline character `\n`). | Useful when the body is structured as lines of text, like in some streaming APIs or log files. |
| `response.readlines()` | Reads all remaining lines from the response body and returns them as a list of `bytes` objects. | Similar to `read()`, but for line-based data. Use with caution for large bodies. |

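
The differences between these methods are easiest to see side by side. The following is a minimal sketch, not a definitive recipe: it assumes a throwaway local server so it runs without network access, and the names `LinesHandler`, `fetch`, and the three-line `BODY` are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"alpha\nbeta\ngamma\n"  # hypothetical three-line payload


class LinesHandler(BaseHTTPRequestHandler):
    """Illustrative handler that always returns BODY."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(port):
    """Open a fresh connection and return (conn, response)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/")
    return conn, conn.getresponse()


server = HTTPServer(("127.0.0.1", 0), LinesHandler)  # port 0: use any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

conn, resp = fetch(port)
whole = resp.read()             # entire body in one bytes object
conn.close()

conn, resp = fetch(port)
first_chunk = resp.read(4)      # at most 4 bytes
rest = resp.read()              # whatever is left after the first chunk
conn.close()

conn, resp = fetch(port)
first_line = resp.readline()    # up to and including the first newline
other_lines = resp.readlines()  # remaining lines as a list of bytes
conn.close()

server.shutdown()
server.server_close()

print(whole, first_chunk, rest, first_line, other_lines)
```

Each call consumes part of the stream, so a later call only sees what earlier calls left unread.
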
Example 1: Simple GET Request (Reading HTML)
This is the most common scenario. We'll fetch the HTML from httpbin.org, a fantastic service for testing HTTP requests.

    import http.client

    # The host and the path of the resource you want to access
    host = 'httpbin.org'
    path = '/html'

    conn = None
    try:
        # --- Step 1: Create a connection and send a request ---
        # For HTTPS (recommended for real applications), use instead:
        #   import ssl
        #   context = ssl.create_default_context()
        #   conn = http.client.HTTPSConnection(host, context=context)
        # For this specific example, httpbin.org supports plain HTTP too
        conn = http.client.HTTPConnection(host, timeout=10)

        # Send a GET request for the specified path
        conn.request("GET", path)

        # --- Step 2: Get the response from the server ---
        response = conn.getresponse()

        # Check if the request was successful (status code 200)
        if response.status == 200:
            print(f"Status: {response.status} {response.reason}")
            print("Headers:")
            for header, value in response.getheaders():
                print(f" {header}: {value}")
            print("-" * 20)

            # --- Step 3: Read the response body ---
            # response.read() returns the entire body as bytes
            body_bytes = response.read()

            # Since this is HTML, decode it to a string. The encoding is usually
            # given by the charset parameter of the Content-Type header; fall
            # back to UTF-8 if the header or the parameter is missing.
            encoding = response.headers.get_content_charset() or 'utf-8'
            body_str = body_bytes.decode(encoding)

            print("First 200 characters of the body:")
            print(body_str[:200])
        else:
            print(f"Error: {response.status} {response.reason}")
    except http.client.HTTPException as e:
        print(f"HTTP Error: {e}")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # --- Step 4: Close the connection ---
        # HTTPConnection is not a context manager, so close it explicitly
        if conn is not None:
            conn.close()

Sample output (abridged; the exact headers and body will vary):

    Status: 200 OK
    Headers:
     Date: Sat, 27 Sep 2025 10:30:00 GMT
     Content-Type: text/html; charset=utf-8
     Content-Length: 1365
     Connection: keep-alive
     Server: gunicorn/19.9.0
     Access-Control-Allow-Origin: *
     Access-Control-Allow-Credentials: true
    --------------------
    First 200 characters of the body:
    <!DOCTYPE html>
    <html>
    <head><title>httpbin.org</title>
    <link href="http://twitter.github.io/bootstrap/assets/css/bootstrap.css" rel="stylesheet"
    type="text/css">
    ...

Example 2: Reading a Large File (Streaming with read(size))
Reading a large file (e.g., a 100 MB image) with `response.read()` holds the entire body in RAM at once, which can exhaust memory for very large downloads. Instead, read it in chunks.

    import http.client
    import os
    import ssl

    host = 'httpbin.org'
    path = '/image/jpeg'  # This endpoint returns a sample JPEG image
    CHUNK_SIZE = 4096     # Read 4 KB at a time

    conn = None
    try:
        # Using HTTPS for a secure connection
        context = ssl.create_default_context()
        conn = http.client.HTTPSConnection(host, context=context, timeout=10)
        conn.request("GET", path)
        response = conn.getresponse()

        if response.status == 200:
            print(f"Downloading image from {host}{path}...")

            # Get the filename from the Content-Disposition header if available
            content_disposition = response.getheader('Content-Disposition')
            filename = 'downloaded_image.jpg'
            if content_disposition and 'filename=' in content_disposition:
                raw_name = content_disposition.split('filename=')[-1].split(';')[0].strip().strip('"')
                # Keep only the final path component so the server cannot pick the directory
                filename = os.path.basename(raw_name) or filename
            print(f"Saving to file: {filename}")

            # Open a file in binary write mode
            with open(filename, 'wb') as f:
                # Read the response in chunks
                while True:
                    chunk = response.read(CHUNK_SIZE)
                    if not chunk:  # An empty chunk means the end of the stream
                        break
                    f.write(chunk)  # Write the chunk to the file
            print("Download complete!")
        else:
            print(f"Error: {response.status} {response.reason}")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Close the connection whether or not the download succeeded
        if conn is not None:
            conn.close()

This example is much more memory-efficient because it never holds the entire file in memory at once. It reads a small piece, writes it to disk, and repeats until the file is complete.
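
The read-write loop can also be factored into a small reusable helper. Here is a minimal sketch, assuming two hypothetical functions (`iter_chunks` and `save_stream`) and an in-memory `io.BytesIO` object standing in for the HTTP response so the example runs offline. Anything with a `read(size)` method, including an `http.client` response, could be passed instead.

```python
import io


def iter_chunks(stream, chunk_size=4096):
    """Yield successive chunks from any object with a read(size) method."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes signals the end of the stream
            break
        yield chunk


def save_stream(stream, fileobj, chunk_size=4096):
    """Copy stream to fileobj chunk by chunk; return the number of bytes written."""
    total = 0
    for chunk in iter_chunks(stream, chunk_size):
        fileobj.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a response body: 10,000 bytes of dummy data.
fake_response = io.BytesIO(b"x" * 10_000)
destination = io.BytesIO()
written = save_stream(fake_response, destination, chunk_size=4096)
print(written)  # 10000
```

At most one chunk is held in memory at a time, regardless of the total size. The standard library's `shutil.copyfileobj(response, f)` performs an equivalent loop if you do not need the byte count.
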
Important Considerations
- Character Encodings: As seen in Example 1, `response.read()` returns `bytes`. You must decode it into a string (`str`) using the correct character encoding (e.g., `utf-8`, `iso-8859-1`). Check the `Content-Type` header for the `charset` parameter.
- Connection Management: Always close your connection when you're done. `http.client` connection objects do not support the `with` statement directly, so call `conn.close()` in a `finally` block or wrap the connection in `contextlib.closing()`; either approach closes the connection even if errors occur.
- HTTPS: For any real-world application, you should be using HTTPS. `ssl.create_default_context()` is the modern, secure way to handle SSL/TLS verification.
- Modern Alternatives: While `http.client` is powerful and built-in, for most applications the higher-level `requests` library is strongly recommended. It abstracts away many of these low-level details (connection management, encoding, chunked transfer) and provides a much more user-friendly API.

Example using the requests library (for comparison):

    import requests

    url = 'https://httpbin.org/html'
    image_url = 'https://httpbin.org/image/jpeg'

    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Raises an exception for bad status codes (4xx or 5xx)

        # `response.text` automatically handles decoding to a string
        print(f"Status: {response.status_code}")
        print("First 200 characters of the body:")
        print(response.text[:200])

        # For binary data, `response.content` holds the whole body as bytes.
        # For large files, pass stream=True and write the body in chunks instead.
        with requests.get(image_url, stream=True, timeout=10) as img_response:
            img_response.raise_for_status()
            with open('requests_image.jpg', 'wb') as f:
                for chunk in img_response.iter_content(chunk_size=8192):
                    f.write(chunk)
        print("Downloaded image using requests library.")
    except requests.exceptions.RequestException as e:
        print(f"An error occurred with the requests library: {e}")
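
If you stay with `http.client`, the connection-management and encoding considerations above can be combined in one small helper. This is a sketch under stated assumptions: `fetch_text` is a hypothetical name, `contextlib.closing` is used to call `conn.close()` on exit, and a throwaway local server (`HelloHandler`) stands in for a real host so the example runs offline.

```python
import contextlib
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Illustrative handler that returns a Latin-1 encoded greeting."""

    def do_GET(self):
        body = "caf\u00e9".encode("iso-8859-1")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=iso-8859-1")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_text(host, port, path="/", default_encoding="utf-8"):
    """GET path and return (status, decoded body); the connection is always closed."""
    with contextlib.closing(http.client.HTTPConnection(host, port, timeout=5)) as conn:
        conn.request("GET", path)
        response = conn.getresponse()
        # response.headers can parse the charset parameter of Content-Type.
        encoding = response.headers.get_content_charset() or default_encoding
        return response.status, response.read().decode(encoding)


server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: use any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, text = fetch_text("127.0.0.1", server.server_address[1])
finally:
    server.shutdown()
    server.server_close()

print(status, text)
```

Decoding with the charset declared by the server, rather than a hard-coded one, avoids mangled characters when a response is not UTF-8.
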
