杰瑞科技汇

How Does Java Serialization Ensure Data Safety and Compatibility?

Java Serialization is a fundamental concept, but it is important to understand its mechanics, its use cases, and, most importantly, its pitfalls.

What is Java Serialization?

In simple terms, Java Serialization is the process of converting an object into a byte stream. This byte stream can be stored in a file, sent over a network, or saved in a database.

The reverse process, reconstructing the object from the byte stream, is called Deserialization.
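
To make the "byte stream" concrete, the sketch below serializes a String into an in-memory buffer and inspects the first four bytes. The class name StreamHeaderDemo and the helper toBytes are illustrative names, not part of any library; the magic number is compared against the constant in java.io.ObjectStreamConstants.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;

class StreamHeaderDemo {
    // Serializes an object (and everything it references) into a byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = toBytes("hello");
        // A serialization stream starts with a 2-byte magic number and a 2-byte version.
        int magic = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
        int version = ((bytes[2] & 0xFF) << 8) | (bytes[3] & 0xFF);
        System.out.printf("magic=0x%04X version=%d length=%d%n", magic, version, bytes.length);
        // STREAM_MAGIC is declared as a short, so mask it before comparing.
        System.out.println(magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF));
    }
}
```

The same byte array could just as well be written to a file or a socket; the format does not depend on where the bytes end up.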

Why is this useful?

  • Persistence: You can save the state of an object to a file and load it back later, even after the program has terminated.
  • Network Communication: Objects can be "passed" over a network by being serialized on one side, sent as bytes, and deserialized on the other side. This is how RMI (Remote Method Invocation) passes arguments and return values between JVMs.

How to Make a Class Serializable?

Making a class serializable is incredibly easy. You just need to implement the java.io.Serializable marker interface.

import java.io.Serializable;
public class User implements Serializable {
    // A serializable class should have a version ID to prevent
    // InvalidClassException during deserialization if the class changes.
    private static final long serialVersionUID = 1L;
    private String username;
    private transient String password; // See explanation below
    private int age;
    // Constructors, Getters, and Setters
    public User(String username, String password, int age) {
        this.username = username;
        this.password = password;
        this.age = age;
    }
    @Override
    public String toString() {
        return "User{" +
                "username='" + username + '\'' +
                ", password='" + password + '\'' + // This will be null after deserialization
                ", age=" + age +
                '}';
    }
}

Key Points:

  • Serializable is a Marker Interface: It has no methods. It simply acts as a flag to the Java Virtual Machine (JVM), signaling that objects of this class can be serialized.
  • serialVersionUID: This is a version number for the serialized form of a class. If you don't declare one explicitly, the serialization runtime computes a default at run time by hashing the class's structure (name, interfaces, fields, methods, etc.). If you change the class (e.g., add a new field), that computed value changes. When you then try to deserialize an old byte stream with the new class, the runtime sees the mismatch and throws an InvalidClassException. By declaring it yourself, you control versioning: keep the value while changes remain compatible (such as adding a field, which is simply left at its default value when old data is read), and change it when old streams must be rejected.
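
One way to see which value the runtime will actually use is to ask java.io.ObjectStreamClass. The following is a minimal sketch; the classes Explicit and Implicit and the helper uidOf are hypothetical names introduced only for illustration.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // Declares its version explicitly, so the value stays stable across edits.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Declares nothing, so the runtime computes a hash of the class structure.
    static class Implicit implements Serializable {
        String name;
    }

    // Returns the serialVersionUID the serialization runtime uses for a class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) { // lookup returns null for non-serializable classes
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Explicit: " + uidOf(Explicit.class)); // the declared value, 1
        System.out.println("Implicit: " + uidOf(Implicit.class)); // a computed 64-bit hash
    }
}
```

The JDK's serialver command-line tool reports the same value for a compiled class, which is handy when adding an explicit serialVersionUID to a class that already has serialized data in the wild.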

The Serialization Process

Let's see how to serialize and deserialize the User object.

import java.io.*;
public class SerializationDemo {
    public static void main(String[] args) {
        // 1. Create an object
        User user = new User("john_doe", "secret123", 30);
        // --- Serialization: Object -> Byte Stream ---
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("user.ser"))) {
            oos.writeObject(user);
            System.out.println("Object has been serialized to user.ser");
        } catch (IOException e) {
            e.printStackTrace();
        }
        // --- Deserialization: Byte Stream -> Object ---
        User deserializedUser = null;
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("user.ser"))) {
            deserializedUser = (User) ois.readObject();
            System.out.println("Object has been deserialized from user.ser");
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
        // 2. Print the deserialized object
        if (deserializedUser != null) {
            System.out.println("Deserialized User: " + deserializedUser);
        }
    }
}

Output:

Object has been serialized to user.ser
Object has been deserialized from user.ser
Deserialized User: User{username='john_doe', password='null', age=30}

Notice that the password field is null. This is because it was marked as transient.


Important Keywords and Concepts

transient Keyword

If you don't want a particular field to be serialized, you can mark it with the transient keyword.

  • Use Case: It's commonly used for sensitive data like passwords, or for data that is not meaningful outside the current JVM context, like file handles (FileDescriptor) or thread references (Thread).
  • Behavior: During deserialization, the transient field is not restored from the byte stream. Instead, it is given its default value (null for object references, 0 for numbers, false for booleans).
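
The default-value behavior can be checked with an in-memory round trip. This is a small sketch; the Session class, its fields, and the roundTrip helper are assumptions made up for the example. Note that the field initializers are not re-run on deserialization, because the constructor of a Serializable class is not invoked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        String user = "john_doe";            // serialized normally
        transient String token = "abc123";   // object reference, skipped
        transient int retries = 3;           // numeric primitive, skipped
        transient boolean active = true;     // boolean primitive, skipped
    }

    // Serializes to a byte array and immediately deserializes a fresh copy.
    static Session roundTrip(Session original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(original);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Session) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session copy = roundTrip(new Session());
        System.out.println(copy.user);    // john_doe
        System.out.println(copy.token);   // null
        System.out.println(copy.retries); // 0
        System.out.println(copy.active);  // false
    }
}
```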

readObject() and writeObject() Methods

Sometimes, you need custom logic during serialization or deserialization. For example, you might want to encrypt a password before writing it and decrypt it after reading it.

You can provide this custom logic by adding private methods named writeObject and readObject to your class.

private void writeObject(ObjectOutputStream oos) throws IOException {
    // 1. Let the default serialization write the non-transient fields (username, age)
    oos.defaultWriteObject();
    // 2. password is transient, so it must be written explicitly.
    //    The "ENCRYPTED_" prefix is only a placeholder for real encryption.
    oos.writeObject(this.password == null ? null : "ENCRYPTED_" + this.password);
}
private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
    // 1. Let the default deserialization restore the non-transient fields first
    ois.defaultReadObject();
    // 2. Read the extra value in the same order it was written, then undo the placeholder
    String stored = (String) ois.readObject();
    this.password = (stored == null) ? null : stored.substring("ENCRYPTED_".length());
}
  • defaultWriteObject() and defaultReadObject(): These methods handle the default serialization/deserialization of all non-transient and non-static fields, allowing you to focus on the custom parts.
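
A complete round trip through such a pair of hooks might look like the sketch below. The Account class and the roundTrip helper are hypothetical, and Base64 is used purely as a reversible stand-in: it is an encoding, not encryption, and a real application would substitute a proper cipher with managed keys.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class CustomHooksDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String username;
        private transient String password; // skipped by default serialization

        Account(String username, String password) {
            this.username = username;
            this.password = password;
        }

        String username() { return username; }
        String password() { return password; }

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject(); // writes username
            // Placeholder transformation only: Base64 is NOT encryption.
            String encoded = (password == null) ? null
                    : Base64.getEncoder().encodeToString(password.getBytes(StandardCharsets.UTF_8));
            oos.writeObject(encoded); // extra data appended after the default fields
        }

        private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException {
            ois.defaultReadObject(); // restores username
            String encoded = (String) ois.readObject(); // same order as written
            password = (encoded == null) ? null
                    : new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        }
    }

    static Account roundTrip(Account original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(original);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Account) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("john_doe", "secret123"));
        System.out.println(copy.username() + " / " + copy.password()); // john_doe / secret123
    }
}
```

Because the hooks write and read the extra value themselves, the transient password survives the round trip even though default serialization skips it.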

Externalizable Interface

For even more control, you can implement the Externalizable interface instead of Serializable.

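  • Serializable (Default): Uses reflection to automatically serialize all non-transient, non-static fields. It's easy, but it can be slow and verbose, and because it follows every reference in the object graph, one object can pull far more data into the stream than intended (very deep graphs, such as long linked chains, can even overflow the stack).
  • Externalizable (Advanced): You have to write the entire serialization and deserialization logic yourself. This gives you maximum performance and control, but it's more work and more error-prone.
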
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
public class Product implements Externalizable {
    private String name;
    private double price;
    // A PUBLIC no-arg constructor is MANDATORY for Externalizable:
    // deserialization calls it first and then invokes readExternal()
    public Product() {}
    public Product(String name, double price) {
        this.name = name;
        this.price = price;
    }
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // You decide exactly what to write and in what order
        out.writeUTF(name);
        out.writeDouble(price);
    }
    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        // You must read back in the EXACT same order
        this.name = in.readUTF();
        this.price = in.readDouble();
    }
    @Override
    public String toString() {
        return "Product{name='" + name + "', price=" + price + "}";
    }
}
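
As a usage sketch, the following round trip shows an Externalizable class in action, including the fact that the public no-arg constructor really runs during deserialization. The Item class, its createdBy field, and the roundTrip helper are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    static class Item implements Externalizable {
        private static final long serialVersionUID = 1L;
        String name;
        double price;
        String createdBy; // deliberately never written to the stream

        public Item() { // invoked by the serialization runtime before readExternal()
            this.createdBy = "no-arg constructor";
        }

        Item(String name, double price) {
            this.name = name;
            this.price = price;
            this.createdBy = "application code";
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name); // assumes name is non-null; writeUTF rejects null
            out.writeDouble(price);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF(); // same order as writeExternal
            price = in.readDouble();
        }
    }

    static Item roundTrip(Item original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(original);
        }
        try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Item) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Item copy = roundTrip(new Item("Laptop", 999.99));
        System.out.println(copy.name + " " + copy.price); // Laptop 999.99
        System.out.println(copy.createdBy);               // no-arg constructor
    }
}
```

Anything not written in writeExternal is simply whatever the no-arg constructor left behind, which is the main difference from Serializable, where no constructor of the class runs at all.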

Security Concerns (A Major Warning)

Java Serialization is a serious security risk. When you call readObject() on data from an untrusted source, the sender decides which classes on your classpath get instantiated and which of their deserialization hooks (such as readObject) run. Malicious byte streams can be crafted to chain such classes together and exploit their behavior, leading to Remote Code Execution (RCE) attacks.

  • Example: Classes in the Apache Commons Collections library (such as InvokerTransformer) could be chained into a "gadget" that executes commands on the server as soon as a crafted stream is deserialized.
  • Best Practice: Avoid deserializing data from untrusted or unauthenticated sources. If you must, restrict which classes may be deserialized with an allow-list (java.io.ObjectInputFilter, available since Java 9), or switch to a data-only format such as JSON or Protobuf.
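
An allow-list can be sketched with java.io.ObjectInputFilter, assuming Java 9 or later. The classes Allowed and Blocked and the helpers toBytes and readFiltered are hypothetical; the filter pattern admits exactly one class and rejects every other class ("!*").

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class FilterDemo {
    static class Allowed implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = "ok";
    }

    static class Blocked implements Serializable {
        private static final long serialVersionUID = 1L;
        String label = "should never be deserialized";
    }

    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Deserializes with an allow-list: only Allowed may appear in the stream.
    static Object readFiltered(byte[] bytes) throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
                Allowed.class.getName() + ";!*"); // allow one class, reject all others
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            ois.setObjectInputFilter(filter); // must be set before the first read
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object ok = readFiltered(toBytes(new Allowed()));
        System.out.println("accepted: " + ok.getClass().getSimpleName());
        try {
            readFiltered(toBytes(new Blocked()));
        } catch (InvalidClassException e) {
            // The filter rejected the class before any of its code could run.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A filter narrows the attack surface but does not make deserializing hostile input safe in general; the allow-list is only as trustworthy as the classes on it.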

Modern Alternatives

Because of its security risks and other limitations (like being tied to the JVM), Java's built-in serialization is often avoided in modern, distributed systems.

JSON (JavaScript Object Notation)

  • Pros: Human-readable, language-agnostic, widely supported. Libraries like Jackson, Gson, and Moshi make it easy to use in Java.
  • Use Case: The de-facto standard for web APIs (REST).

Protocol Buffers (Protobuf) & gRPC

  • Pros: Highly efficient (smaller, faster than JSON), strongly typed, schema-first (.proto file). Developed by Google.
  • Use Case: High-performance microservices, internal APIs.

XML (eXtensible Markup Language)

  • Pros: Human-readable, self-describing schema (XSD).
  • Cons: Verbose, slower to parse than JSON or Protobuf.

Summary: `Serializable
