杰瑞科技汇

How to Get Started Quickly with the Python Selenium API

This guide to the Python Selenium API takes you from the basics to more advanced concepts.

What is Selenium?

Selenium is a powerful, open-source automation framework used primarily for automating web browsers. With its Python bindings, you can write scripts to control a web browser, interact with web pages, and perform actions just like a human user would. It's widely used for:

  • Web Scraping: Extracting data from websites that are difficult to scrape with simple requests libraries (e.g., sites that load content dynamically with JavaScript).
  • Automated Testing: The primary use case. Automating functional, regression, and end-to-end testing of web applications.
  • Browser Automation: Automating repetitive tasks like filling out forms, logging into sites, or generating reports.

Setup and Installation

Before you can use the API, you need to set up your environment.

Install the Selenium Library: Open your terminal or command prompt and run:

pip install selenium

Install a WebDriver: Selenium needs a special program called a WebDriver to act as a bridge between your script and the web browser. You need one for each browser you want to automate.

  • Chrome: Download ChromeDriver. Make sure the version of ChromeDriver matches your Google Chrome browser version.
  • Firefox: Download GeckoDriver.
  • Edge: Download EdgeDriver.

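Pro Tip: Use a tool like webdriver-manager to automate driver management. It automatically downloads and manages the correct driver for you. Selenium 4.6 and later also ship with Selenium Manager, which resolves a matching driver automatically when none is found on your PATH, so a manual download is often unnecessary.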

pip install webdriver-manager

Core Concepts and API Structure

The Selenium API revolves around a few key classes and objects.

WebDriver

This is the heart of Selenium. It represents an instance of a browser that you can control.

WebElement

Represents a single HTML element on a page (e.g., a button, a link, an input field). You get these objects by finding elements on the page and then perform actions on them (click, send keys, etc.).

By

A class containing locator strategies used to find elements on the page (e.g., By.ID, By.CSS_SELECTOR, By.XPATH).

WebDriverWait & expected_conditions

Used for implementing Explicit Waits, which are crucial for handling dynamic content. They pause the script until a given condition is met (e.g., an element is visible or clickable) or until a timeout expires, in which case a TimeoutException is raised.
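
To make the idea concrete, here is a minimal pure-Python sketch of the polling loop behind an explicit wait. It is not Selenium's implementation: wait_until, its parameters, and WaitTimeout are hypothetical names, and the 0.5-second default simply mirrors WebDriverWait's default poll frequency.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        # Sleep briefly between checks instead of blocking for the full timeout
        time.sleep(poll_frequency)
```

The key difference from time.sleep() is that the loop returns as soon as the condition holds, so a fast page costs almost nothing while a slow page still gets the full timeout.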


A Complete Walkthrough: From Start to Finish

Let's build a simple script that automates a Google search.

Step 1: Import Libraries and Set Up the WebDriver

We'll use webdriver-manager to download and locate ChromeDriver automatically.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
# Set up the WebDriver
# ChromeDriverManager will automatically download and manage the driver
service = ChromeService(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
print("WebDriver setup complete.")

Step 2: Basic Navigation

# Open a URL
driver.get("https://www.google.com")
# Get the page title and print it
print(f"Page Title: {driver.title}")
# Maximize the browser window
driver.maximize_window()

Step 3: Finding and Interacting with Elements (The Core API)

This is the most important part. We'll find the search box, type a query, and press "Enter".

# --- Finding Elements ---
# There are two main methods:
# find_element() -> Returns a single WebElement (raises NoSuchElementException if not found)
# find_elements() -> Returns a list of WebElements (returns an empty list if not found)
# Find the search box by its NAME attribute (often the easiest)
search_box = driver.find_element(By.NAME, "q")
# --- Interacting with Elements ---
# .send_keys() types text into an input field
search_box.send_keys("Selenium Python API")
# .send_keys() can also send special keys like Enter, Tab, etc.
search_box.send_keys(Keys.RETURN)
print("Search performed.")
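
Because find_elements() returns an empty list instead of raising, it is a convenient way to probe for elements that may or may not be present. A minimal sketch, where find_optional is a hypothetical helper and the locator is the usual (By, value) tuple:

```python
def find_optional(driver, locator):
    """Return the first element matching `locator`, or None if there is none.

    `locator` is a (strategy, value) tuple such as (By.ID, "promo-banner").
    """
    matches = driver.find_elements(*locator)
    return matches[0] if matches else None
```

For example, banner = find_optional(driver, (By.ID, "promo-banner")) followed by a check for None handles an optional banner without a try/except block. The id "promo-banner" is made up for illustration.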

Step 4: Waiting for Results and Extracting Data

Modern websites load content dynamically. Using a simple time.sleep() is bad practice. Explicit Waits are the correct solution.

# --- Using Explicit Waits ---
# Wait up to 10 seconds for the first search result heading to be clickable.
# The selector "div.g a h3" depends on Google's current markup and may need updating.
try:
    first_result = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "div.g a h3"))
    )
    print("First search result found!")
    # .text gets the visible text of the element
    print(f"Result text: {first_result.text}")
    # The <h3> has no href of its own, so read it from the enclosing <a>
    link = first_result.find_element(By.XPATH, "./ancestor::a[1]")
    # .get_attribute() can get any attribute, like the href
    print(f"Result link: {link.get_attribute('href')}")
except Exception as e:
    print(f"An error occurred: {e}")
    print("Could not find the search result.")

Step 5: Cleaning Up

This is a critical step to ensure the browser process is closed.

# --- Closing the Browser ---
# .close() closes the current tab or window
# .quit() closes the entire browser session and ends the driver process
time.sleep(3) # Keep browser open for 3 seconds to see the result
driver.quit()
print("Browser closed. Script finished.")
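
One way to guarantee that quit() runs even when the script fails halfway is to wrap the session in a context manager. A minimal sketch, where browser_session is a hypothetical helper and make_driver is any zero-argument callable that returns a WebDriver:

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver, hand it to the `with` block, and always quit it."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the block raises
        driver.quit()
```

With the setup from Step 1, this could be used as: with browser_session(lambda: webdriver.Chrome(service=service)) as driver: followed by the navigation and search steps inside the block.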

Key API Methods and Properties

Here's a quick reference for the most commonly used parts of the API.

WebDriver Methods

Method / Property                     Description
get(url)                              Navigates the browser to a given URL.
find_element(By, value)               Finds the first element matching the locator.
find_elements(By, value)              Finds all elements matching the locator (returns a list).
current_url                           Property to get the URL of the current page.
page_source                           Property to get the page's HTML source code.
back(), forward(), refresh()          Navigates browser history or refreshes the page.
maximize_window(), minimize_window()  Resizes the browser window.
close()                               Closes the current tab or window.
quit()                                Closes the browser and terminates the WebDriver session.

WebElement Methods

Method / Property     Description
click()               Simulates a mouse click on the element.
send_keys(text)       Simulates typing text into an input field.
clear()               Clears the text from an input field.
text                  Property to get the visible text of the element.
get_attribute(name)   Gets the value of a given attribute (e.g., 'href', 'id', 'class').
is_displayed()        Returns True if the element is visible to the user.
is_enabled()          Returns True if the element is enabled.
is_selected()         Returns True if the element (e.g., checkbox, radio button) is selected.
screenshot(filename)  Takes a screenshot of the element and saves it to a file.

Locators (By Class)

Locator Strategy  Description                                                  Example
By.ID             Finds an element by its id attribute.                        driver.find_element(By.ID, "username")
By.NAME           Finds an element by its name attribute.                      driver.find_element(By.NAME, "q")
By.CLASS_NAME     Finds an element by a single class name.                     driver.find_element(By.CLASS_NAME, "search-btn")
By.TAG_NAME       Finds an element by its HTML tag name.                       driver.find_element(By.TAG_NAME, "h1")
By.CSS_SELECTOR   Finds an element using a CSS selector. (Most Powerful)       driver.find_element(By.CSS_SELECTOR, "div#content > p.intro")
By.XPATH          Finds an element using an XPath expression. (Very Powerful)  driver.find_element(By.XPATH, "//div[@id='header']//a[text()='Home']")

Handling Common Scenarios

Handling Dropdowns (<select>)

Use the Select class from selenium.webdriver.support.ui.

from selenium.webdriver.support.ui import Select
# ... find the select element
select_element = driver.find_element(By.ID, "country-select")
select = Select(select_element)
# You can select by visible text, value, or index
select.select_by_visible_text("Canada")
# select.select_by_value("ca")
# select.select_by_index(0) # Selects the first option

Handling Alerts and Pop-ups

# Trigger an alert (e.g., by clicking a button)
driver.find_element(By.ID, "alert-button").click()
# Switch to the alert
alert = driver.switch_to.alert
# Get the alert text
print(alert.text)
# Accept the alert (clicks OK)
# alert.accept()
# Dismiss the alert (clicks Cancel)
# alert.dismiss()
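
The steps above can be bundled into one small function. This is a minimal sketch with a hypothetical helper name, read_and_close_alert. It assumes an alert is already open; if none is, driver.switch_to.alert raises NoAlertPresentException.

```python
def read_and_close_alert(driver, accept=True):
    """Return the text of the open alert, then accept or dismiss it."""
    alert = driver.switch_to.alert
    message = alert.text  # read the text before the alert is closed
    if accept:
        alert.accept()   # same as clicking OK
    else:
        alert.dismiss()  # same as clicking Cancel
    return message
```

Reading the text first matters because the alert object is no longer usable once it has been accepted or dismissed.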

Handling iFrames

iFrames are separate HTML documents embedded within a page. You must switch to the frame before you can interact with its elements.

# Find the iframe by its id, name, or WebElement
iframe = driver.find_element(By.ID, "my-iframe")
# Switch to the iframe
driver.switch_to.frame(iframe)
# Now you can find elements inside the iframe
button_in_frame = driver.find_element(By.ID, "frame-button")
button_in_frame.click()
# Switch back to the main page (important!)
driver.switch_to.default_content()
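
Forgetting to switch back is a common source of confusing "element not found" errors. One way to make the switch-back automatic is a context manager. A minimal sketch, where inside_frame is a hypothetical helper name:

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame):
    """Switch into `frame` for the `with` block, then return to the top-level page."""
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Runs even if the code inside the frame raises
        driver.switch_to.default_content()
```

Used as with inside_frame(driver, iframe): followed by the element lookups, the driver is back on the main page as soon as the block ends. For nested iframes, driver.switch_to.parent_frame() steps out one level instead of returning to the top.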

Handling Dynamic Content with Waits

  • Implicit Wait: Tells the WebDriver to poll the DOM for a certain amount of time when trying to find an element. It's set once and applies to all find_element calls.

    driver.implicitly_wait(10) # Wait up to 10 seconds for any element to be found
    # Avoid combining this with explicit waits; the Selenium documentation warns that mixing the two can cause unpredictable wait times
  • Explicit Wait: The recommended approach. Waits for a specific condition to be met before proceeding. It's more reliable and precise.

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    # Wait up to 10 seconds until the element is clickable
    element = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.ID, "submit-button"))
    )
    element.click()
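
WebDriverWait.until() accepts any callable that takes the driver and returns a truthy value on success, so conditions beyond the built-in expected_conditions can be written by hand. A minimal sketch, where at_least_n_elements is a hypothetical helper:

```python
def at_least_n_elements(locator, n):
    """Build a wait condition that passes once `n` or more elements match.

    The condition returns the matching elements when satisfied, otherwise
    False so that the wait keeps polling.
    """
    def condition(driver):
        elements = driver.find_elements(*locator)
        return elements if len(elements) >= n else False
    return condition
```

For example, rows = WebDriverWait(driver, 10).until(at_least_n_elements((By.CSS_SELECTOR, "table#results tr"), 5)) waits until at least five rows have loaded and then returns them. The selector is made up for illustration.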

Best Practices

  1. Use Explicit Waits: Avoid time.sleep(). It makes your tests slow and unreliable. Use WebDriverWait for dynamic content.
  2. Use Unique Selectors: Prefer ID and Name when available. If not, use robust CSS_SELECTOR or XPATH. Avoid brittle selectors like nth-child().
  3. Structure Your Code: Use the Page Object Model (POM) design pattern to organize your locators and methods into separate classes, making your code cleaner and more maintainable.
  4. Clean Up: Always use a try...finally block or ensure your driver.quit() is called to prevent zombie browser processes.
  5. Headless Mode: For automated testing on a server, run the browser in "headless" mode (no GUI).
    from selenium.webdriver.chrome.options import Options
    chrome_options = Options()
    chrome_options.add_argument("--headless")  # recent Chrome versions also accept "--headless=new"
    driver = webdriver.Chrome(service=service, options=chrome_options)
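
The Page Object Model from best practice 3 can be sketched as follows. Everything here is illustrative: LoginPage, its locators, and the /login path describe a hypothetical page, not a real site. The locator strings "id" and "css selector" are the values of By.ID and By.CSS_SELECTOR, so (By.ID, "username") would work the same way.

```python
class LoginPage:
    """Page object for a hypothetical login page."""

    # Locators live in one place, so a markup change means one edit
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def open(self, base_url):
        """Navigate to the login page and return self for chaining."""
        self.driver.get(f"{base_url}/login")
        return self

    def log_in(self, username, password):
        """Fill in both fields and submit the form."""
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()
```

A test then reads as LoginPage(driver).open(base_url).log_in("alice", "s3cret"), with no locators in the test itself.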