杰瑞科技汇

How Does Java DataInputStream Read Binary Data Efficiently?

This article takes a close look at java.io.DataInputStream, a classic and fundamental class in Java's I/O library.

What is DataInputStream?

DataInputStream is an input stream filter that lets you read Java primitive data types (like int, float, double, boolean) and Strings from an underlying input stream in a machine-independent way.

The key phrase here is machine-independent. This means you can write data on one machine (e.g., a Windows PC using an Intel processor) and read it back on another machine (e.g., a Linux server using an ARM processor) without worrying about differences in how different architectures store numbers (like endianness).


Core Concepts

Machine-Independence (The Big Idea)

Different computer architectures store multi-byte numbers (like int or long) in different orders:

  • Big-Endian: The most significant byte is stored at the lowest memory address (like PowerPC, SPARC).
  • Little-Endian: The least significant byte is stored at the lowest memory address (like Intel x86/x64).

DataInputStream solves this by always reading multi-byte values in network byte order (big-endian), which is the same order DataOutputStream writes them in. This ensures that a 100 written with DataOutputStream on an Intel machine will be read back as 100 on a PowerPC machine.
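
The byte-order rule can be checked with a minimal sketch. It assumes an in-memory ByteArrayInputStream in place of a real file or socket, and the names BigEndianDemo and readBigEndianInt are hypothetical, chosen only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the original article.
class BigEndianDemo {

    // Decodes one int from four raw bytes using DataInputStream.
    static int readBigEndianInt(byte[] fourBytes) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(fourBytes))) {
            return dis.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        // Most significant byte first: 0x00000064 is 100.
        byte[] bigEndian = {0x00, 0x00, 0x00, 0x64};
        System.out.println(readBigEndianInt(bigEndian)); // 100

        // The same bytes in little-endian layout decode to a different value,
        // because DataInputStream still treats the first byte as most significant.
        byte[] littleEndian = {0x64, 0x00, 0x00, 0x00};
        System.out.println(readBigEndianInt(littleEndian)); // 1677721600
    }
}
```

The result depends only on the byte sequence, not on the processor running the JVM, which is what makes the format portable.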

It's a Decorator (Filter Stream)

DataInputStream doesn't read from a file or network socket directly. It wraps another InputStream. This is a common pattern in Java I/O known as the Decorator Pattern.

+---------------------+
|  DataInputStream    |  <--- You use this one
+---------------------+
          ^
          | wraps
+---------------------+
|  FileInputStream    |  <--- Reads bytes from a file
+---------------------+
          ^
          | wraps
+---------------------+
|   FileDescriptor    |  <--- Represents the actual file
+---------------------+

You create a DataInputStream by passing an existing InputStream to its constructor.
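
Because decorators stack, a BufferedInputStream can sit between the file and the DataInputStream so that many small reads are served from memory rather than going to the file each time. The sketch below is one possible arrangement, not the only one; the class name DecoratorChainDemo, the method roundTrip, and the temporary file prefix are assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the original article.
class DecoratorChainDemo {

    // Writes one int to a temporary file, then reads it back through the chain
    // FileInputStream -> BufferedInputStream -> DataInputStream.
    static int roundTrip(int value) throws IOException {
        File file = File.createTempFile("decorator-demo", ".bin");
        try {
            try (DataOutputStream dos = new DataOutputStream(new FileOutputStream(file))) {
                dos.writeInt(value);
            }
            // Closing the outermost stream also closes the streams it wraps.
            try (DataInputStream dis = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(file)))) {
                return dis.readInt();
            }
        } finally {
            file.delete(); // best-effort cleanup of the temporary file
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(42)); // 42
    }
}
```

Buffering matters most when a program reads many small values in a loop; for a handful of reads the difference is unlikely to be noticeable.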

Methods for Primitive Types

DataInputStream provides a specific method for reading each Java primitive type, prefixed with read:

  • readBoolean()
  • readByte()
  • readShort()
  • readChar()
  • readInt()
  • readLong()
  • readFloat()
  • readDouble()
  • readUTF() (for reading a String written by DataOutputStream.writeUTF(), which uses a length-prefixed, modified UTF-8 format)

Each of these methods reads the appropriate number of bytes from the underlying stream and converts them into a Java primitive type.
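
To make "the appropriate number of bytes" concrete, the following sketch measures how many bytes several of these methods consume. It relies on ByteArrayInputStream.available() reporting the exact number of remaining bytes, which holds for in-memory streams but not necessarily for files or sockets. The class name ByteCountDemo and the method bytesConsumed are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original article.
class ByteCountDemo {

    // Returns the number of bytes consumed by readBoolean, readShort,
    // readInt, readLong and readDouble, in that order.
    static int[] bytesConsumed() throws IOException {
        byte[] zeros = new byte[23]; // 1 + 2 + 4 + 8 + 8 bytes, all zero
        int[] consumed = new int[5];
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(zeros))) {
            int remaining = dis.available();
            dis.readBoolean();
            consumed[0] = remaining - dis.available();

            remaining = dis.available();
            dis.readShort();
            consumed[1] = remaining - dis.available();

            remaining = dis.available();
            dis.readInt();
            consumed[2] = remaining - dis.available();

            remaining = dis.available();
            dis.readLong();
            consumed[3] = remaining - dis.available();

            remaining = dis.available();
            dis.readDouble();
            consumed[4] = remaining - dis.available();
        }
        return consumed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(bytesConsumed())); // [1, 2, 4, 8, 8]
    }
}
```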

How to Use DataInputStream: A Complete Example

Let's create a simple program that writes some primitive data to a file and then reads it back using DataOutputStream and DataInputStream.

Step 1: Writing the Data (DataOutputStream)

First, we need to write the data. The companion to DataInputStream is DataOutputStream.

import java.io.*;
public class DataWriter {
    public static void main(String[] args) {
        // The file to write to
        String fileName = "data.bin";
        // Use try-with-resources to ensure the stream is closed automatically
        try (DataOutputStream dos = new DataOutputStream(new FileOutputStream(fileName))) {
            System.out.println("Writing data to " + fileName);
            // Write various primitive data types
            dos.writeInt(123456);                  // An integer
            dos.writeDouble(987.654);              // A double
            dos.writeBoolean(true);                // A boolean
            dos.writeUTF("Hello, DataInputStream!"); // A String
            System.out.println("Data written successfully.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

After running this, a file named data.bin will be created. If you open it in a text editor, it will look like garbled text because it contains binary data, not characters.
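
To see what those bytes actually are, the same four values can be written into memory and printed as hexadecimal. This is a sketch only: HexDumpDemo, encode, and hex are hypothetical names, and a ByteArrayOutputStream stands in for data.bin.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the original article.
class HexDumpDemo {

    // Writes the same four values as DataWriter, but into a byte array.
    static byte[] encode() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buffer)) {
            dos.writeInt(123456);
            dos.writeDouble(987.654);
            dos.writeBoolean(true);
            dos.writeUTF("Hello, DataInputStream!");
        }
        return buffer.toByteArray();
    }

    // Formats bytes[from, to) as two-digit hex values separated by spaces.
    static String hex(byte[] bytes, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            if (i > from) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode();
        System.out.println("total bytes: " + data.length);       // 38
        System.out.println("int:         " + hex(data, 0, 4));   // 00 01 E2 40
        System.out.println("double:      " + hex(data, 4, 12));
        System.out.println("boolean:     " + hex(data, 12, 13)); // 01
        System.out.println("UTF length:  " + hex(data, 13, 15)); // 00 17
    }
}
```

The int appears most significant byte first, and the string is preceded by a two-byte length (0x0017 is 23, the number of encoded bytes in the text).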

Step 2: Reading the Data (DataInputStream)

Now, let's read that data back. Crucially, you must read the data in the exact same order it was written.

import java.io.*;
public class DataReader {
    public static void main(String[] args) {
        String fileName = "data.bin";
        // Use try-with-resources
        try (DataInputStream dis = new DataInputStream(new FileInputStream(fileName))) {
            System.out.println("Reading data from " + fileName);
            // Read data in the SAME order it was written!
            int intValue = dis.readInt();
            double doubleValue = dis.readDouble();
            boolean boolValue = dis.readBoolean();
            String stringValue = dis.readUTF();
            System.out.println("Read Integer: " + intValue);
            System.out.println("Read Double: " + doubleValue);
            System.out.println("Read Boolean: " + boolValue);
            System.out.println("Read String: " + stringValue);
        } catch (EOFException e) {
            // This exception is thrown when the end of the file is reached unexpectedly.
            System.err.println("Reached end of file.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Expected Output:

Reading data from data.bin
Read Integer: 123456
Read Double: 987.654
Read Boolean: true
Read String: Hello, DataInputStream!
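
The stream carries no type information, so a mismatched read does not necessarily fail; it can quietly return a wrong value. The sketch below is a hypothetical illustration (OrderMismatchDemo and misreadTwoIntsAsLong are invented names): two ints are written, then the same eight bytes are read as a single long.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class; not part of the original article.
class OrderMismatchDemo {

    // Writes two ints, then (incorrectly) reads their eight bytes as one long.
    static long misreadTwoIntsAsLong(int first, int second) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buffer)) {
            dos.writeInt(first);
            dos.writeInt(second);
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            // No exception is thrown here: eight bytes are available, so the read succeeds.
            return dis.readLong();
        }
    }

    public static void main(String[] args) throws IOException {
        // The bytes 00 00 00 01 00 00 00 02 are interpreted as 2^32 + 2.
        System.out.println(misreadTwoIntsAsLong(1, 2)); // 4294967298
    }
}
```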

Key Methods and Best Practices

Method | Description | Important Notes
readBoolean() | Reads one byte and returns true if it's non-zero, false otherwise. |
readByte(), readShort(), readInt(), readLong() | Reads 1, 2, 4, or 8 bytes respectively and converts them to a primitive type. | Order is critical! Must match write...() calls.
readFloat(), readDouble() | Reads 4 or 8 bytes and converts them to a float or double. | Uses the IEEE 754 standard for floating-point numbers.
readChar() | Reads two bytes and returns a char. | Reads in Big-Endian order.
readUTF() | Reads a string that was written by DataOutputStream.writeUTF(). | This is not the same as new BufferedReader(...).readLine(). It uses a modified UTF-8 format and has a two-byte length prefix.
readFully(byte[] b) | Reads bytes from the stream until the entire byte array b is filled. | Useful for reading raw byte data of a known size. Throws EOFException if the stream ends first.
skipBytes(int n) | Attempts to skip over n bytes of data. | May skip fewer than n bytes, so check the return value; readFully() is often better when the bytes must be consumed.
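
The last two methods are often used together when a record has a fixed-size header followed by a fixed-size payload. The following is a sketch under that assumption; ReadFullyDemo, readPayload, and the header and payload sizes are hypothetical. Treating a short skip as a truncated stream is a simplification that suits in-memory data.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical demo class; not part of the original article.
class ReadFullyDemo {

    // Skips headerLength bytes, then reads exactly payloadLength bytes.
    static byte[] readPayload(byte[] source, int headerLength, int payloadLength)
            throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(source))) {
            // skipBytes returns the number of bytes actually skipped.
            int skipped = dis.skipBytes(headerLength);
            if (skipped != headerLength) {
                // Simplification: a short skip is reported as a truncated header.
                throw new EOFException("Header truncated: skipped only " + skipped + " bytes");
            }
            byte[] payload = new byte[payloadLength];
            dis.readFully(payload); // throws EOFException if fewer bytes remain
            return payload;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {9, 9, 9, 9, 1, 2, 3}; // 4-byte header, 3-byte payload
        System.out.println(Arrays.toString(readPayload(source, 4, 3))); // [1, 2, 3]
    }
}
```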

Best Practices

  1. Use try-with-resources: Always wrap DataInputStream (and the underlying stream it wraps) in a try-with-resources block. This guarantees that the streams are closed, preventing resource leaks.

    // Good
    try (DataInputStream dis = new DataInputStream(new FileInputStream("file.dat"))) {
        // ... read data ...
    }
    // Bad - resource leak is possible
    DataInputStream dis = new DataInputStream(new FileInputStream("file.dat"));
    // ... if an exception occurs here, the stream might not close ...
    dis.close();

  2. Match Read and Write Order: This is the most common mistake. If you write an int, then a String, you must read an int, then a String. If you try to read a String when an int is next in the stream, you will get incorrect data or an exception such as EOFException or UTFDataFormatException.

  3. Handle EOFException: If you try to read past the end of the stream, DataInputStream throws an EOFException (End Of File Exception). You should catch this to know when you've finished reading all the data.
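
When the number of values in a stream is not known in advance, the third practice typically takes the form of a read loop that ends on EOFException. Below is a minimal sketch, assuming the stream contains only ints; EofLoopDemo and readAllInts are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original article.
class EofLoopDemo {

    // Reads ints until the stream is exhausted, then returns them in order.
    static List<Integer> readAllInts(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (DataInputStream dis = new DataInputStream(in)) {
            while (true) {
                values.add(dis.readInt());
            }
        } catch (EOFException e) {
            // Expected: no complete int remains, so the loop is finished.
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buffer)) {
            dos.writeInt(10);
            dos.writeInt(20);
            dos.writeInt(30);
        }
        System.out.println(readAllInts(new ByteArrayInputStream(buffer.toByteArray())));
        // [10, 20, 30]
    }
}
```

One caveat with this pattern: a stream that ends partway through an int is also treated as a normal end, so formats that need to detect truncation usually write a count or length up front instead.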


DataInputStream vs. BufferedReader

It's a common point of confusion. When should you use which?

Feature | DataInputStream | BufferedReader
Purpose | Reading binary data (primitives, Strings in modified UTF-8). | Reading text data (lines of characters).
Data Format | Machine-independent binary format. | Human-readable text in a character encoding (e.g., UTF-8, ISO-8859-1).
Key Methods | readInt(), readDouble(), readUTF(), etc. | readLine(), read().
Typical Use Cases | Saving application state to a file; network communication protocols that send numbers; reading binary files created by other programs (e.g., C programs), provided they use big-endian byte order. | Reading configuration files (.properties, .csv); reading log files; reading any text-based data.

In short:

  • Use DataInputStream when the structure and type of the data are important (e.g., "the first 4 bytes are an integer, the next 8 are a double").
  • Use BufferedReader when you are dealing with lines of text and the encoding is what matters.
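
The difference can be illustrated by storing the number 100 both ways. This is a sketch with hypothetical names (BinaryVsTextDemo, readBinary, readText); the text variant assumes UTF-8 and one number per line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original article.
class BinaryVsTextDemo {

    // Binary: the value occupies exactly four bytes and is decoded by position.
    static int readBinary(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            return dis.readInt();
        }
    }

    // Text: the value is a line of digit characters that must be parsed.
    static int readText(byte[] data) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            return Integer.parseInt(reader.readLine().trim());
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] binary = {0, 0, 0, 100};                        // four bytes
        byte[] text = "100\n".getBytes(StandardCharsets.UTF_8); // '1', '0', '0', newline
        System.out.println(readBinary(binary)); // 100
        System.out.println(readText(text));     // 100
    }
}
```

Both calls yield 100, but the bytes on disk are entirely different, which is why a file written for one reader cannot be handed to the other.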