杰瑞科技汇

How Can Java's FileInputStream Read Large Files Efficiently?

java.io.FileInputStream is a fundamental Java class for reading raw bytes from a file.


What is FileInputStream?

FileInputStream is an input stream for reading raw bytes from a file. It's part of the Java I/O (Input/Output) package (java.io). The "stream" part implies that it reads data sequentially, byte by byte, or in chunks, from the beginning of the file to the end.

Key Characteristics:

  • Purpose: To read binary data (like images, videos, audio, or serialized objects) or text data from a file.
  • Data Type: It reads bytes (byte), not characters (char). This makes it ideal for binary files.
  • Hierarchy: It extends the abstract class InputStream, which defines the basic methods for reading bytes.
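
Because the stream deals in bytes, read() hands back each byte as an int between 0 and 255 and reserves -1 for end of stream. The following is a minimal sketch of that behavior; the class name ReadReturnValueSketch, the helper readAllValues, and the temporary file are assumptions made for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadReturnValueSketch {
    // Hypothetical helper: calls read() once per byte in the file plus one
    // extra time, so the last element is the end-of-stream marker.
    static int[] readAllValues(Path path) throws IOException {
        int[] values = new int[(int) Files.size(path) + 1];
        try (FileInputStream fis = new FileInputStream(path.toFile())) {
            for (int i = 0; i < values.length; i++) {
                values[i] = fis.read();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("fis-demo", ".bin");
        try {
            // 0x41 is 'A'; 0xFF would be -1 as a signed byte, but read() returns 255.
            Files.write(temp, new byte[] {(byte) 0x41, (byte) 0xFF});
            for (int value : readAllValues(temp)) {
                System.out.println(value); // prints 65, then 255, then -1
            }
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Returning an int rather than a byte is what lets the byte value 0xFF (255) be told apart from the end-of-stream marker (-1).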

How to Use FileInputStream (The Basics)

The core workflow for using FileInputStream involves three steps:

  1. Create an instance, specifying the file path.
  2. Read data from the stream.
  3. Close the stream to release system resources.

Simple Example: Reading a File Byte by Byte

This is the most basic way to read a file. It's not efficient for large files but is great for understanding the concept.


Let's assume you have a file named test.txt with the content Hello World! in the directory you run the program from (a relative path is resolved against the current working directory).

import java.io.FileInputStream;
import java.io.IOException;
public class FileInputStreamExample {
    public static void main(String[] args) {
        // The try-with-resources statement ensures the stream is closed automatically.
        // This is the recommended modern approach.
        try (FileInputStream fis = new FileInputStream("test.txt")) {
            int byteData;
            // read() returns an integer representing the next byte of data,
            // or -1 if the end of the stream is reached.
            System.out.println("Reading file byte by byte:");
            while ((byteData = fis.read()) != -1) {
                // Cast the int to a char to print it. This is only correct for
                // single-byte text such as ASCII; multi-byte UTF-8 needs a decoder.
                System.out.print((char) byteData);
            }
            System.out.println("\nFile reading finished.");
        } catch (IOException e) {
            // This block will catch exceptions like FileNotFoundException
            // (which is a subclass of IOException) or other I/O errors.
            System.err.println("An error occurred while reading the file: " + e.getMessage());
        }
    }
}

Output:

Reading file byte by byte:
Hello World!
File reading finished.

Efficient Example: Reading a File into a Byte Array

Reading a file one byte at a time is inefficient. A much better approach is to read the file in chunks (buffers) using a byte array.

import java.io.FileInputStream;
import java.io.IOException;
public class FileInputStreamBufferExample {
    public static void main(String[] args) {
        // The file to read
        String filePath = "test.txt";
        // Create a buffer (a byte array) to hold the data
        byte[] buffer = new byte[1024]; // Read 1024 bytes at a time
        try (FileInputStream fis = new FileInputStream(filePath)) {
            int bytesRead;
            System.out.println("Reading file into a buffer...");
            // read(byte[] b) reads up to b.length bytes of data from this input stream
            // into an array of bytes. It returns the number of bytes read, or -1 if the end of the stream is reached.
            while ((bytesRead = fis.read(buffer)) != -1) {
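                // Convert only the bytes just read (not the whole buffer) to a string and print it.
                // Caveat: a multi-byte UTF-8 character split across two chunks is decoded
                // incorrectly here, so this is only safe for single-byte text such as ASCII.
                String chunk = new String(buffer, 0, bytesRead, "UTF-8");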
                System.out.print(chunk);
            }
            System.out.println("\nFile reading finished.");
        } catch (IOException e) {
            System.err.println("An error occurred: " + e.getMessage());
        }
    }
}
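
For a genuinely large file, the same loop can feed each chunk to a consumer so that memory use stays at one buffer regardless of file size. The sketch below computes a CRC-32 checksum this way. The class name ChunkedChecksumSketch, the helper checksum, and the 8 KB buffer size are assumptions for illustration, not tuned recommendations.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

public class ChunkedChecksumSketch {
    // Assumed buffer size; larger buffers mean fewer read calls.
    private static final int BUFFER_SIZE = 8192;

    // Hypothetical helper: streams the file through a CRC32 one chunk at a time.
    static long checksum(Path path) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[BUFFER_SIZE];
        try (FileInputStream fis = new FileInputStream(path.toFile())) {
            int bytesRead;
            while ((bytesRead = fis.read(buffer)) != -1) {
                // Only the first bytesRead bytes of the buffer are valid.
                crc.update(buffer, 0, bytesRead);
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("fis-demo", ".txt");
        try {
            Files.write(temp, "123456789".getBytes(StandardCharsets.US_ASCII));
            // prints cbf43926, the standard CRC-32 check value for "123456789"
            System.out.println(Long.toHexString(checksum(temp)));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

Passing bytesRead to the consumer matters: the final chunk is usually shorter than the buffer, and the rest of the array still holds stale data from the previous read.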

Constructor Overloads

FileInputStream provides several constructors to create an instance:

  1. FileInputStream(String name)

    • Creates a FileInputStream by opening a connection to an actual file, the file named by the path name `name` in the file system.
    • Example: new FileInputStream("C:/data/myfile.dat")
  2. FileInputStream(File file)

    • Creates a FileInputStream by opening a connection to an actual file, the file named by the File object file.

    • This is often preferred as it separates the file system logic from the I/O logic.

    • Example:

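      import java.io.File;
      import java.io.FileInputStream;
      import java.io.IOException;
      public class FileConstructorExample {
          public static void main(String[] args) throws IOException {
              // Describe the file first, then open a stream on it
              File file = new File("image.png");
              try (FileInputStream fis = new FileInputStream(file)) {
                  // ... read from fis ...
              }
          }
      }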
  3. FileInputStream(FileDescriptor fdObj)

    • Creates a FileInputStream by using the file descriptor fdObj.
    • This is an advanced use case, typically used when you already have a file descriptor from another source, such as FileDescriptor.in (standard input) or the descriptor returned by another stream's getFD() method.
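
The three constructors can be compared side by side with a small sketch. The class name ConstructorSketch and its helper methods are hypothetical. The descriptor variant assumes that two streams built on the same descriptor share one file position, so the second stream is expected to continue where the first left off.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class ConstructorSketch {
    // FileInputStream(String name): open by path name.
    static int firstByteViaName(String name) throws IOException {
        try (FileInputStream fis = new FileInputStream(name)) {
            return fis.read();
        }
    }

    // FileInputStream(File file): open from a File object.
    static int firstByteViaFile(File file) throws IOException {
        try (FileInputStream fis = new FileInputStream(file)) {
            return fis.read();
        }
    }

    // FileInputStream(FileDescriptor fdObj): wrap an already open descriptor.
    // Both streams use the same descriptor, so after the first stream consumes
    // byte 0, the second stream reads byte 1.
    static int secondByteViaDescriptor(File file) throws IOException {
        try (FileInputStream first = new FileInputStream(file);
             FileInputStream second = new FileInputStream(first.getFD())) {
            first.read();
            return second.read();
        }
    }

    public static void main(String[] args) throws IOException {
        File temp = Files.createTempFile("fis-demo", ".txt").toFile();
        try {
            Files.write(temp.toPath(), "ABC".getBytes(StandardCharsets.US_ASCII));
            System.out.println((char) firstByteViaName(temp.getPath()));
            System.out.println((char) firstByteViaFile(temp));
            System.out.println((char) secondByteViaDescriptor(temp));
        } finally {
            temp.delete();
        }
    }
}
```

Closing either stream in the descriptor variant closes the underlying descriptor for both, which is one reason this constructor is rarely the first choice.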

Key Methods

| Method | Description |
| --- | --- |
| int read() | Reads one byte of data from the input stream and returns it as an integer (0-255). Returns -1 if the end of the stream is reached. |
| int read(byte[] b) | Reads up to b.length bytes of data from this input stream into an array of bytes. Returns the number of bytes read, or -1 if the end of the stream is reached. |
| int read(byte[] b, int off, int len) | Reads up to len bytes of data from this input stream into an array of bytes, starting at offset off. Returns the number of bytes read, or -1 if the end of the stream is reached. |
| long skip(long n) | Skips over and discards n bytes of data from the input stream. |
| int available() | Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking. |
| void close() | Closes this file input stream and releases any system resources associated with the stream. This is crucial! |
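
A short sketch can show how skip, available, and the three-argument read fit together. The class name KeyMethodsSketch, the helper skipThenRead, and the ten-digit test content are assumptions for illustration. Since read may return fewer bytes than requested and available() is only an estimate, the printed values are not guaranteed by the API contract.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class KeyMethodsSketch {
    // Hypothetical helper: skips 3 bytes, then reads up to 4 bytes into the
    // middle of an 8-byte buffer that was pre-filled with '.' characters.
    static String skipThenRead(Path path) throws IOException {
        try (FileInputStream fis = new FileInputStream(path.toFile())) {
            long skipped = fis.skip(3);        // actual number of bytes skipped
            byte[] buffer = new byte[8];
            Arrays.fill(buffer, (byte) '.');
            int n = fis.read(buffer, 2, 4);    // writes into buffer[2..5] at most
            return skipped + ":" + n + ":" + new String(buffer, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("fis-demo", ".txt");
        try {
            Files.write(temp, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileInputStream fis = new FileInputStream(temp.toFile())) {
                // An estimate of the bytes readable without blocking.
                System.out.println("available: " + fis.available());
            }
            System.out.println(skipThenRead(temp));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

The off and len parameters leave the rest of the array untouched, which is useful when filling one large buffer across several reads.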

Best Practices and Important Considerations

Resource Management: try-with-resources

Before Java 7, you had to manually close the stream in a finally block to ensure it was always closed, even if an error occurred.

// Old way (pre-Java 7)
FileInputStream fis = null;
try {
    fis = new FileInputStream("file.txt");
    // ... read from fis ...
} catch (IOException e) {
    // handle exception
} finally {
    // It's critical to check if fis is not null before closing
    if (fis != null) {
        try {
            fis.close();
        } catch (IOException e) {
            // handle close exception
        }
    }
}

Since Java 7, the try-with-resources statement is the standard and best practice. It automatically closes any resource that implements the AutoCloseable interface (which FileInputStream does) at the end of the try block. This prevents resource leaks.

// Modern way (Java 7+)
try (FileInputStream fis = new FileInputStream("file.txt")) {
    // ... read from fis ...
} catch (IOException e) {
    // handle exception
} // fis.close() is called automatically here
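
One way to observe the automatic close is to keep a reference to the stream and try to read from it after the try block has ended. This is a sketch for demonstration only; the class name AutoCloseSketch and the helper isClosedAfterTry are hypothetical, and letting a resource escape its try block is not something to do in real code.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AutoCloseSketch {
    // Hypothetical helper: returns true if the stream rejects reads once the
    // try-with-resources block has finished.
    static boolean isClosedAfterTry(Path path) throws IOException {
        FileInputStream escaped;
        try (FileInputStream fis = new FileInputStream(path.toFile())) {
            escaped = fis; // kept only to inspect the stream afterwards
            fis.read();
        } // close() has been called by this point
        try {
            escaped.read();
            return false;
        } catch (IOException expected) {
            // Reading from a closed FileInputStream throws IOException.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("fis-demo", ".txt");
        try {
            Files.write(temp, new byte[] {1, 2, 3});
            System.out.println(isClosedAfterTry(temp)); // prints true
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```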

Character Encoding vs. Byte Streams

FileInputStream reads bytes. If you are reading a text file, you need to convert those bytes into characters using a specific character encoding (like UTF-8, ASCII, etc.).

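The String constructor we used earlier is one way to do this: new String(buffer, 0, bytesRead, "UTF-8"). Be aware that decoding chunk by chunk can corrupt a multi-byte character that happens to straddle two chunks.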

For more complex text processing, it's often better to use InputStreamReader, which bridges the gap between byte streams and character streams.

// Using InputStreamReader to read text with a specific encoding
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
public class InputStreamReaderExample {
    public static void main(String[] args) {
        try (
            FileInputStream fis = new FileInputStream("test.txt");
            // Create an InputStreamReader that decodes the bytes as UTF-8
            InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8)
        ) {
            int charData;
            // read() returns one decoded character as an int, or -1 at end of stream
            while ((charData = isr.read()) != -1) {
                System.out.print((char) charData);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
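
The difference between decoding each chunk separately and decoding through InputStreamReader shows up as soon as a multi-byte character crosses a chunk boundary. The sketch below forces that situation with a deliberately tiny buffer. The class name DecodingSketch, both helper methods, and the 2-byte buffer are assumptions chosen to make the effect visible.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DecodingSketch {
    // Decodes every chunk on its own; a character split across chunks is lost.
    static String decodePerChunk(Path path, int bufferSize) throws IOException {
        StringBuilder text = new StringBuilder();
        byte[] buffer = new byte[bufferSize];
        try (FileInputStream fis = new FileInputStream(path.toFile())) {
            int bytesRead;
            while ((bytesRead = fis.read(buffer)) != -1) {
                text.append(new String(buffer, 0, bytesRead, StandardCharsets.UTF_8));
            }
        }
        return text.toString();
    }

    // Lets InputStreamReader carry partial characters over between reads.
    static String decodeWithReader(Path path) throws IOException {
        StringBuilder text = new StringBuilder();
        try (InputStreamReader isr = new InputStreamReader(
                new FileInputStream(path.toFile()), StandardCharsets.UTF_8)) {
            int charData;
            while ((charData = isr.read()) != -1) {
                text.append((char) charData);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempFile("fis-demo", ".txt");
        try {
            String original = "h\u00e9llo"; // U+00E9 takes two bytes in UTF-8
            Files.write(temp, original.getBytes(StandardCharsets.UTF_8));
            System.out.println(original.equals(decodePerChunk(temp, 2))); // prints false
            System.out.println(original.equals(decodeWithReader(temp)));  // prints true
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

With a realistic buffer size the split happens far less often, which makes the per-chunk bug easy to miss in testing and hard to reproduce later.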

Exception Handling

The primary exception you'll deal with is IOException. Its subclass, FileNotFoundException, is thrown when the specified file path does not exist or cannot be accessed. The try-with-resources statement makes handling these exceptions clean and safe.
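
Because FileNotFoundException is a subclass of IOException, it has to be caught first when the two cases need different handling. A minimal sketch follows; the class name ExceptionHandlingSketch, the helper describe, and its return strings are hypothetical.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ExceptionHandlingSketch {
    // Hypothetical helper: reports what happened when opening and reading a path.
    static String describe(String path) {
        try (FileInputStream fis = new FileInputStream(path)) {
            fis.read();
            return "ok";
        } catch (FileNotFoundException e) {
            // Thrown by the constructor when the file cannot be opened for reading.
            return "not found";
        } catch (IOException e) {
            // Any other failure while reading or closing the stream.
            return "I/O error";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("this-file-should-not-exist.txt"));
    }
}
```

Reversing the two catch blocks would not compile, since the more general IOException handler would make the FileNotFoundException handler unreachable.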

Summary: FileInputStream vs. FileReader

| Feature | FileInputStream | FileReader |
| --- | --- | --- |
| Purpose | Reads binary data (bytes) from a file. | Reads text data (characters) from a file. |
| Data Unit | Bytes (byte) | Characters (char) |