杰瑞科技汇

python urllencode

Of course! In Python, the standard way to perform URL encoding is by using the urllib.parse module. The most common function for this is quote().

Here's a complete guide covering the basics, advanced use cases, and best practices.

The Basic Function: urllib.parse.quote()

This is the core function you'll use. It takes a string and replaces special characters with a followed by two hexadecimal characters. This ensures the string is safe to include in a URL.

Key Characters Encoded:

  • Space (`) becomes%20`
  • becomes %2F
  • becomes %3F
  • & becomes %26
  • becomes %23
  • becomes %2B

Example 1: Encoding a Simple String

Let's encode a string with spaces and special characters.

import urllib.parse
# A string with spaces and a special character
original_string = "hello world & python!"
# Encode the string
encoded_string = urllib.parse.quote(original_string)
print(f"Original: {original_string}")
print(f"Encoded:  {encoded_string}")
# Expected Output:
# Original: hello world & python!
# Encoded:  hello%20world%20%26%20python%21

As you can see:

  • ` (space) became%20`
  • & became %26
  • became %21

Encoding for Query Parameters: urllib.parse.urlencode()

When you want to build a URL with query parameters (the part after the ), urlencode() is much more convenient. It takes a dictionary of key-value pairs and correctly encodes both the keys and the values.

Example 2: Encoding a Dictionary of Parameters

This is the most common use case for web scraping or making API requests.

import urllib.parse
# A dictionary of query parameters
params = {
    'search': 'python tutorial',
    'category': 'web development',
    'page': 1
}
# Encode the dictionary into a query string
query_string = urllib.parse.urlencode(params)
print(f"Query String: {query_string}")
# Expected Output:
# Query String: search=python%20tutorial&category=web%20development&page=1

You can then easily append this to a base URL:

base_url = "https://www.example.com/search?"
full_url = base_url + query_string
print(f"Full URL: {full_url}")
# Expected Output:
# Full URL: https://www.example.com/search?search=python%20tutorial&category=web%20development&page=1

Advanced Usage of urlencode()

urlencode() has some useful parameters:

  • quote_via: You can specify which quoting function to use. The default is quote, but you can use quote_plus which encodes spaces as instead of %20. This is often preferred for application/x-www-form-urlencoded content (like form data).

    import urllib.parse
    params = {'q': 'hello world', 'sort': 'date'}
    # Default (uses quote, space becomes %20)
    print(urllib.parse.urlencode(params))
    # Output: q=hello%20world&sort=date
    # Using quote_plus (space becomes +)
    print(urllib.parse.urlencode(params, quote_via=urllib.parse.quote_plus))
    # Output: q=hello+world&sort=date
  • safe: You can provide a string of characters that should not be encoded.

    import urllib.parse
    # We want to keep the '/' character unencoded
    path_segment = "/api/v1/users/john doe/"
    encoded_path = urllib.parse.quote(path_segment, safe='/')
    print(f"Original: {path_segment}")
    print(f"Encoded:  {encoded_path}")
    # Expected Output:
    # Original: /api/v1/users/john doe/
    # Encoded:  /api/v1/users/john%20doe/

    Notice the space (%20) was encoded, but the slashes () were preserved.


Decoding URLs: urllib.parse.unquote()

Of course, you'll also need to decode URLs to get the original string back. The function for that is unquote().

Example 3: Decoding an Encoded String

import urllib.parse
# An encoded string (e.g., from a URL)
encoded_url = "https://www.example.com/search?q=python%20tutorial%26tips"
# Decode the string
decoded_url = urllib.parse.unquote(encoded_url)
print(f"Encoded URL:   {encoded_url}")
print(f"Decoded URL:   {decoded_url}")
# Expected Output:
# Encoded URL:   https://www.example.com/search?q=python%20tutorial%26tips
# Decoded URL:   https://www.example.com/search?q=python tutorial%26tips
# Let's decode the query parameter part separately
query_param = "python%20tutorial%26tips"
decoded_param = urllib.parse.unquote(query_param)
print(f"Decoded Param: {decoded_param}")
# Expected Output:
# Decoded Param: python tutorial&tips

Complete Example: Building a Full URL

Here's a practical example of building a complete, valid URL with encoded parameters.

import urllib.parse
# 1. Define the base URL and parameters
base_url = "https://api.example.com/data"
search_params = {
    'query': 'python & "web scraping"',
    'limit': 50,
    'filter': 'public'
}
# 2. Encode the parameters into a query string
# The '&' in the query value is automatically encoded to '%26'
encoded_query = urllib.parse.urlencode(search_params)
# 3. Combine the base URL and the encoded query string
# It's good practice to check if the base_url already has a query string
# to avoid duplicate '?' characters.
full_url = f"{base_url}?{encoded_query}"
print("--- URL Construction Example ---")
print(f"Base URL:    {base_url}")
print(f"Parameters:  {search_params}")
print(f"Full URL:    {full_url}")
# 4. Now, let's decode it back to verify
print("\n--- Decoding Example ---")
decoded_query = urllib.parse.unquote(full_url.split('?')[1])
print(f"Decoded Query: {decoded_query}")

Summary of Functions

Function Purpose Example
urllib.parse.quote(string) Encodes a string for use in a URL path or component. quote("a/b") -> "a%2Fb"
urllib.parse.quote_plus(string) Encodes a string, converting spaces to . Best for form data. quote_plus("a b") -> "a+b"
urllib.parse.urlencode(dict) Encodes a dictionary of parameters into a query string (key=value&...). urlencode({'a':1, 'b':'c d'}) -> "a=1&b=c+d"
urllib.parse.unquote(string) Decodes a URL-encoded string back to its original form. unquote("a%20b") -> "a b"

Best Practices

  1. Use urlencode for Query Strings: When building URLs with parameters, always prefer urlencode() over manually calling quote() on each part. It's cleaner, less error-prone, and handles the & and separators correctly.
  2. Encode Only What You Need: The quote() function's safe parameter is very useful. For example, when encoding a path segment, you should keep unencoded.
  3. Decode Before Using: If you receive a URL from an external source (like a web page or API response), always decode it before trying to parse or display it to ensure you're working with the correct characters.
  4. Python 3: All these functions are in the standard urllib.parse module. There is no urllib2 in Python 3; its functionality was merged into urllib.
分享:
扫描分享到社交APP
上一篇
下一篇