杰瑞科技汇

How to Correctly Use and Parse java.net.URL in Java

java.net.URL is a fundamental Java class for working with resources identified by Uniform Resource Locators (URLs).

What is java.net.URL?

In simple terms, a URL object in Java is a pointer or a representation of a resource on the internet (or a local network). It doesn't contain the resource itself; it contains all the information needed to locate it.

Think of it like the address on an envelope:

  • Protocol: http (the delivery method, like a "Via Air Mail" or "Express Delivery" instruction)
  • Host: www.example.com (the destination city and street)
  • Port: 80 (the specific building number, usually omitted because each protocol has a default)
  • Path: /docs/index.html (the apartment number and door within the building)
  • Query: ?user=alex&sort=date (instructions for the person at the door, like "Give me files for user Alex, sorted by date")
  • Ref: #section2 (the specific paragraph to read on that page, also called the fragment)

The java.net.URL class encapsulates all these parts into a single, manageable object.


Creating a URL Object

There are several ways to create a URL object.

a) Using the Full Constructor

This is the most explicit way: you pass the protocol, host, port, and file (the path, optionally followed by a query) as separate arguments.

import java.net.MalformedURLException;
import java.net.URL;
public class URLExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("http", "www.example.com", 80, "/index.html");
            System.out.println(url);
        } catch (MalformedURLException e) {
            System.err.println("Invalid URL: " + e.getMessage());
        }
    }
}
// Output: http://www.example.com:80/index.html

b) Using the String Constructor (Most Common)

This is the simplest and most frequent way. You pass a single, well-formed String.

import java.net.MalformedURLException;
import java.net.URL;
public class URLExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("https://www.google.com/search?q=java+url&oq=java+url");
            System.out.println(url);
        } catch (MalformedURLException e) {
            System.err.println("Invalid URL: " + e.getMessage());
        }
    }
}
// Output: https://www.google.com/search?q=java+url&oq=java+url

Note: The URL constructor can throw a MalformedURLException. This is a checked exception, so you must handle it with a try-catch block or declare it in your method's throws clause.
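The constructors shown above are deprecated as of Java 20 in favor of building a URI first and calling toURL(). The sketch below wraps that route in a helper that returns an empty Optional instead of throwing. The class name UrlParsing and the method tryParse are hypothetical names chosen for this illustration, not part of the JDK.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Optional;

final class UrlParsing {
    // Hypothetical helper: returns Optional.empty() instead of throwing.
    static Optional<URL> tryParse(String text) {
        try {
            // new URI(...) validates the syntax; toURL() then requires an
            // absolute URI whose scheme has a registered protocol handler.
            return Optional.of(new URI(text).toURL());
        } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
            // URISyntaxException: illegal characters such as an unescaped space
            // IllegalArgumentException: the URI is not absolute
            // MalformedURLException: no handler for the scheme
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("https://www.example.com/index.html").isPresent()); // true
        System.out.println(tryParse("https://www.example.com/a b").isPresent());        // false
        System.out.println(tryParse("/relative/path").isPresent());                     // false
    }
}
```

Returning an Optional keeps the checked exception out of calling code. Whether that is preferable to propagating the exception depends on how much detail callers need about the failure.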


Key Components of a URL (Getters)

The URL class provides convenient getter methods to access the different parts of the URL.

import java.net.MalformedURLException;
import java.net.URL;
public class URLComponents {
    public static void main(String[] args) {
        String urlString = "https://user:pass@www.example.com:8080/docs/tutorial.html?name=networking#DOWNLOAD";
        try {
            URL url = new URL(urlString);
            System.out.println("Protocol: " + url.getProtocol());     // https
            System.out.println("Host:     " + url.getHost());         // www.example.com
            System.out.println("Port:     " + url.getPort());         // 8080
            System.out.println("Default Port: " + url.getDefaultPort()); // 443 for https
            System.out.println("Path:     " + url.getPath());         // /docs/tutorial.html
            System.out.println("Query:    " + url.getQuery());        // name=networking
            System.out.println("Ref (Fragment): " + url.getRef());    // DOWNLOAD
            System.out.println("Authority: " + url.getAuthority());   // user:pass@www.example.com:8080
            System.out.println("File:     " + url.getFile());         // /docs/tutorial.html?name=networking
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }
    }
}
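getQuery() returns the query as one raw string and leaves the splitting and decoding to you. The following is a minimal sketch of turning it into key/value pairs. QueryParams and parse are hypothetical names, and the sketch assumes form-style encoding (where + means a space) and UTF-8.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryParams {
    // Hypothetical helper: splits a raw query string into decoded pairs.
    // A repeated key overwrites the earlier value, which is a simplification.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    // URLDecoder implements form decoding: '+' becomes a space, %XX is unescaped.
    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws MalformedURLException {
        URL url = URI.create("https://www.example.com/search?user=alex&sort=date&q=java+url%21").toURL();
        System.out.println(parse(url.getQuery())); // {user=alex, sort=date, q=java url!}
    }
}
```

A LinkedHashMap is used so the parameters keep the order in which they appear in the URL. Code that must support repeated keys would need a Map<String, List<String>> instead.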

Important Notes on Ports:

  • url.getPort() returns the port specified in the URL.
  • If no port is specified, it returns -1.
  • url.getDefaultPort() returns the default port for the protocol (e.g., 80 for http, 443 for https).
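Because getPort() returns -1 when the URL omits the port, code that needs the port actually used has to combine the two getters. A small sketch, in which EffectivePort and effectivePort are hypothetical names:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class EffectivePort {
    // Returns the explicit port if present, otherwise the protocol default.
    // The result can still be -1 for protocols with no default port (e.g. file).
    static int effectivePort(URL url) {
        int port = url.getPort();
        return port != -1 ? port : url.getDefaultPort();
    }

    public static void main(String[] args) throws MalformedURLException {
        URL implicit = URI.create("https://www.example.com/docs").toURL();
        URL explicit = URI.create("http://www.example.com:8080/docs").toURL();
        System.out.println(effectivePort(implicit)); // 443
        System.out.println(effectivePort(explicit)); // 8080
    }
}
```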

The openConnection() Method: The Heart of URL

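The most important method of the URL class is openConnection(). It returns a URLConnection object, an abstract class that represents the communication link to the resource. Calling openConnection() does not itself touch the network; the actual connection is made when you call connect() or a method that needs it, such as getInputStream().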

From this URLConnection, you can:

  • Read data from the resource (e.g., a web page, an image).
  • Write data to the resource (e.g., submitting a form via POST).
  • Get metadata about the resource (e.g., content type, content length, last modified date).
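The metadata and read operations listed above can be tried without network access, because URLConnection also works for file: URLs. The sketch below uses a temporary local file as a stand-in for a remote resource. ConnectionMetadata and readAll are hypothetical names, and the file content is made up for the demonstration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;

final class ConnectionMetadata {
    // Hypothetical helper: reads the whole resource as UTF-8 text.
    static String readAll(URLConnection connection) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary local file stands in for a remote resource.
        Path file = Files.createTempFile("url-demo", ".txt");
        try {
            Files.write(file, "hello from a file URL".getBytes(StandardCharsets.UTF_8));
            URL url = file.toUri().toURL();

            URLConnection connection = url.openConnection();
            System.out.println("Content type:   " + connection.getContentType());
            System.out.println("Content length: " + connection.getContentLengthLong());
            System.out.println("Last modified:  " + connection.getLastModified());
            System.out.println("Body:           " + readAll(connection));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

For an http or https URL the same getters read the values from the response headers, so the first call to one of them triggers the request.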

Example: Reading Text from a URL

This classic example fetches a text response from a URL. It uses url.openStream(), which is shorthand for url.openConnection().getInputStream().

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
public class ReadFromURL {
    public static void main(String[] args) {
        // Example endpoint; substitute any URL that returns text
        String urlString = "https://api.publicapis.org/random";
        try {
            // 1. Create a URL object
            URL url = new URL(urlString);
            // 2. Open a connection and get an input stream.
            // try-with-resources closes the reader (and the stream) automatically.
            // The charset is passed explicitly; UTF-8 is assumed here.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                // 3. Read the content line by line
                while ((line = reader.readLine()) != null) {
                    // For brevity, print at most the first 200 characters of each line
                    System.out.println(line.length() > 200 ? line.substring(0, 200) + "..." : line);
                }
            }
        } catch (IOException e) {
            System.err.println("An error occurred while reading the URL: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
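openStream() offers no place to set timeouts or request headers. When those matter, one option is to call openConnection() and configure the returned object before reading from it. The sketch below only prepares the connection and never sends a request. TimeoutConfig and prepare are hypothetical names, and the timeout values and Accept header are arbitrary choices.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

final class TimeoutConfig {
    // Hypothetical helper: creates a connection object with timeouts applied.
    // No network traffic happens here; the request is sent only when
    // connect() or getInputStream() is called on the returned object.
    static URLConnection prepare(String address, int connectMillis, int readMillis)
            throws IOException {
        URLConnection connection = URI.create(address).toURL().openConnection();
        connection.setConnectTimeout(connectMillis); // limit for establishing the connection
        connection.setReadTimeout(readMillis);       // limit for each blocking read
        connection.setRequestProperty("Accept", "application/json");
        return connection;
    }

    public static void main(String[] args) throws IOException {
        URLConnection connection = prepare("https://www.example.com/", 5_000, 10_000);
        System.out.println("Connect timeout: " + connection.getConnectTimeout() + " ms");
        System.out.println("Read timeout:    " + connection.getReadTimeout() + " ms");
        // Calling connection.getInputStream() here would open the socket.
    }
}
```

Both timeouts default to 0, which means wait indefinitely, so setting them explicitly is usually worthwhile for anything that talks to a remote host.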

URL vs. URI

This is a very common point of confusion. They are related but serve different purposes.

| Feature | java.net.URL | java.net.URI |
| --- | --- | --- |
| Purpose | Locator: locates and accesses a resource. | Identifier: names a resource, whether or not it can be located. |
| Functionality | Can open a URLConnection to access the data. | Cannot access data directly. It parses, builds, resolves, and compares identifiers. |
| Syntax | Permissive: the parser performs little validation and does no escaping. | Strict: the string must follow the URI grammar, so illegal characters must be escaped (e.g., a space becomes %20). |
| State | Effectively immutable: components are read through getters and there are no public setters. | Immutable. |
| When to Use | When you need to actually connect to and read from/write to a network resource. | When you just need to parse, compare, or construct identifiers (e.g., in a configuration file, a database, or as a parameter in an application). It's safer for handling paths. |

Best Practice: Use URI for all internal application logic, parsing, and construction of identifiers. If you need to access the resource, convert the URI to a URL just before opening the connection.

import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.MalformedURLException;
public class UriToUrlExample {
    public static void main(String[] args) {
        String pathString = "/path/with spaces/file.html";
        try {
            // 1. Use URI for safe construction and handling of paths
            URI uri = new URI("https", "www.example.com", pathString, null);
            // 2. Convert to URL just before you need to open a connection
            URL url = uri.toURL();
            System.out.println("URI: " + uri);
            System.out.println("URL: " + url);
            System.out.println("Encoded path: " + uri.getRawPath()); // /path/with%20spaces/file.html
            System.out.println("Decoded path: " + uri.getPath());    // /path/with spaces/file.html
        } catch (URISyntaxException | MalformedURLException e) {
            e.printStackTrace();
        }
    }
}
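Beyond safe construction, URI offers operations that need no network access at all: strict validation, normalization, and resolving a relative reference against a base. A short sketch follows; the class UriOperations and its three helper methods are hypothetical names for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriOperations {
    // Returns true when the text parses as a URI under java.net.URI's rules.
    static boolean isValidUri(String text) {
        try {
            new URI(text);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Resolves a relative reference against a base, like a browser does for links.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // Removes "." and ".." segments from the path.
    static String normalize(String uri) {
        return URI.create(uri).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("https://www.example.com/docs/guide/", "../img/logo.png"));
        // https://www.example.com/docs/img/logo.png
        System.out.println(normalize("https://www.example.com/a/./b/../c"));
        // https://www.example.com/a/c
        System.out.println(isValidUri("https://www.example.com/a b"));   // false
        System.out.println(isValidUri("https://www.example.com/a%20b")); // true
    }
}
```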

Deprecation and Modern Alternatives

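The java.net.URL class itself is not deprecated, but its constructors are (as of Java 20; the documented replacement is URI.create(...).toURL()). For HTTP work it has been largely superseded by the Java HTTP Client in the java.net.http package, which was incubated in Java 9 and standardized in Java 11.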

Why the Modern HTTP Client is Better:

  • Asynchronous Support: It can make non-blocking asynchronous requests, which is crucial for building responsive, high-performance applications.
  • HTTP/2 Support: It has first-class support for the modern HTTP/2 protocol, which allows for multiplexing and better performance.
  • Fluent API: Clients and requests are configured through chained builder calls rather than setter methods on a connection object.
  • Better Handling of Headers and Body: It provides a cleaner way to work with request and response headers and bodies.

Example: Modern HTTP Client (Java 11+)

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
public class ModernHttpClientExample {
    public static void main(String[] args) {
        // The client is reusable and thread-safe; build it once
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.publicapis.org/random"))
                .header("Accept", "application/json")
                .GET()
                .build();
        try {
            // send() blocks until the full response body has been received
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status Code: " + response.statusCode());
            System.out.println("Response Body:");
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Request failed: " + e.getMessage());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
    }
}
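The asynchronous support mentioned earlier is exposed through sendAsync(), which returns a CompletableFuture instead of blocking. The sketch below is one way to try it without depending on an external site: it starts a throwaway server on the loopback interface using the JDK's com.sun.net.httpserver.HttpServer. AsyncFetch, fetchAsync, and demo are hypothetical names, and the /hello path and "hi" body exist only for this demonstration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;

final class AsyncFetch {
    // Hypothetical helper: sends a GET without blocking the calling thread.
    static CompletableFuture<String> fetchAsync(HttpClient client, URI uri) {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(response -> response.statusCode() + " " + response.body());
    }

    // Starts a local test server on a free port, fetches from it, then stops it.
    static String demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = "hi".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/hello");
            // HTTP/1.1 is requested because the simple test server does not speak HTTP/2.
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            // join() blocks only here, so the demo can return the result.
            return fetchAsync(client, uri).join();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // 200 hi
    }
}
```

In a real application the future would normally be composed with further stages such as thenAccept or thenCombine rather than joined immediately. The join() here exists only so the demo has a value to print.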

Summary

| Concept | Description |
| --- | --- |
| java.net.URL | A class representing a Uniform Resource Locator. It's a pointer to a resource. |
| Creation | Primarily via new URL(String) or new URL(protocol, host, port, path). Throws MalformedURLException. |
| Access | Use url.openConnection() to get a URLConnection for reading/writing data. |
| Components | Use getters like getProtocol(), getHost(), getPath(), getQuery(), getRef(). |
| URL vs. URI | Use URI for identifying resources safely. Use URL when you need to access them. |
| Modern Way | For new applications, prefer the Java HTTP Client (Java 11+) for its features like async support and HTTP/2. |