Java bytes与string如何高效转换？-杰瑞科技汇

Of course! This is a very common and important task in Java, especially when dealing with network protocols, file I/O, or cryptography. The core challenge is that a String in Java is a sequence of characters (Unicode), while a byte[] (array of bytes) is a sequence of 8-bit values.

（图片来源网络，侵删）

The conversion between them requires specifying a character encoding, which defines how characters are mapped to byte sequences and vice-versa.

Here’s a complete guide covering the best practices, common pitfalls, and code examples.

The Golden Rule: Always Specify the Encoding

Never rely on the platform's default encoding (e.g., String.getBytes() or new String(bytes) without a charset). This leads to the "It works on my machine" problem because different operating systems (Windows, Linux, macOS) can have different default encodings (like Cp1252, UTF-8, etc.).

Always be explicit about the character encoding. The modern, standard, and recommended encoding is UTF-8.

（图片来源网络，侵删）

Converting a `String` to a `byte[]`

You need to encode the characters of the string into a sequence of bytes.

Method 1: The Recommended Way (Using `StandardCharsets`)

This is the most modern and readable approach. It's available since Java 7.

import java.nio.charset.StandardCharsets;
public class StringToBytes {
    public static void main(String[] args) {
        String originalString = "Hello, 世界!"; // A string with non-ASCII characters
        // --- ENCODING: String to byte[] ---
        // Use the try-with-resources block to ensure the Charset is available.
        // StandardCharsets.UTF_8 is a constant, so this is generally safe.
        byte[] utf8Bytes = originalString.getBytes(StandardCharsets.UTF_8);
        System.out.println("Original String: " + originalString);
        System.out.println("Byte array length: " + utf8Bytes.length);
        System.out.println("Byte array content: " + java.util.Arrays.toString(utf8Bytes));
        // Output: [72, 101, 108, 108, 111, 44, 32, -28, -72, -83, -26, -113, -85, 33]
    }
}

Method 2: The Classic Way (Using `Charset.forName`)

This works on all Java versions (including 6 and older) and is still widely used.

import java.nio.charset.Charset;
public class StringToBytesClassic {
    public static void main(String[] args) {
        String originalString = "Hello, 世界!";
        // --- ENCODING: String to byte[] ---
        byte[] utf8Bytes = originalString.getBytes(Charset.forName("UTF-8"));
        System.out.println("Original String: " + originalString);
        System.out.println("Byte array length: " + utf8Bytes.length);
    }
}

What Happens with the Default Encoding? (The Pitfall)

If you call getBytes() without a charset, it uses the platform's default. This can cause data corruption if the bytes are read on a different system.

（图片来源网络，侵删）

// DANGEROUS - DO NOT DO THIS IN PRODUCTION
// The encoding is platform-dependent!
byte[] defaultBytes = originalString.getBytes();

Converting a `byte[]` to a `String`

You need to decode the sequence of bytes back into characters.

Method 1: The Recommended Way (Using `StandardCharsets`)

Again, this is the preferred method for modern Java.

import java.nio.charset.StandardCharsets;
public class BytesToString {
    public static void main(String[] args) {
        byte[] utf8Bytes = {
            72, 101, 108, 108, 111, 44, 32,
            (byte) 0xE4, (byte) 0xBD, (byte) 0xA0, // 你 in UTF-8
            (byte) 0xE5, (byte) 0xA5, (byte) 0xBD  // 好 in UTF-8
        };
        // --- DECODING: byte[] to String ---
        String decodedString = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("Decoded String: " + decodedString); // Output: 你好
    }
}

Method 2: The Classic Way (Using `Charset.forName`)

import java.nio.charset.Charset;
public class BytesToStringClassic {
    public static void main(String[] args) {
        byte[] utf8Bytes = {72, 101, 108, 108, 111, 44, 32, -28, -72, -83, -26, -113, -85, 33};
        // --- DECODING: byte[] to String ---
        String decodedString = new String(utf8Bytes, Charset.forName("UTF-8"));
        System.out.println("Decoded String: " + decodedString); // Output: Hello, 世界!
    }
}

What Happens with the Default Encoding? (The Pitfall)

If you create a String from a byte[] without a charset, it will use the platform's default encoding, which can lead to or garbled characters if the bytes were not encoded with that default.

// DANGEROUS - DO NOT DO THIS IN PRODUCTION
// The encoding is platform-dependent!
String corruptedString = new String(utf8Bytes);

Advanced Handling: `CharsetDecoder` and `CharsetEncoder`

For more complex scenarios, like handling invalid bytes or partial data, you can use the java.nio.charset package's low-level encoder and decoder. This gives you fine-grained control over error handling.

Error Handling Strategies:

StandardCharsets.UTF_8: By default, it will replace malformed sequences with the Unicode replacement character ().
CharsetDecoder: You can specify an error handling mode.
- REPORT: Throws a CharacterCodingException on malformed input.
- IGNORE: Silently discards malformed input.
- REPLACE: Replaces malformed input with the replacement character (default behavior).

Example: Using `CharsetDecoder`

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
public class AdvancedDecoding {
    public static void main(String[] args) throws CharacterCodingException {
        // A byte array with a malformed UTF-8 sequence
        // The byte (byte) 0xC0 is invalid on its own in UTF-8
        byte[] badUtf8Bytes = "H�llo".getBytes(StandardCharsets.ISO_8859_1); // Create some bad bytes
        System.out.println("Original byte array: " + java.util.Arrays.toString(badUtf8Bytes));
        // --- DECODING with explicit error handling ---
        // 1. Default behavior (REPLACE)
        String defaultString = new String(badUtf8Bytes, StandardCharsets.UTF_8);
        System.out.println("Default (REPLACE):   " + defaultString); // Output: H�llo
        // 2. Using a decoder to REPORT the error
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        decoder.onMalformedInput(CodingErrorAction.REPORT); // Throw an exception on bad data
        decoder.onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            CharBuffer charBuffer = decoder.decode(ByteBuffer.wrap(badUtf8Bytes));
            System.out.println("Decoder (REPORT):    " + charBuffer.toString());
        } catch (CharacterCodingException e) {
            System.err.println("Caught exception with REPORT: " + e.getMessage());
        }
        // 3. Using a decoder to IGNORE the error
        decoder.onMalformedInput(CodingErrorAction.IGNORE);
        CharBuffer ignoredBuffer = decoder.decode(ByteBuffer.wrap(badUtf8Bytes));
        System.out.println("Decoder (IGNORE):    " + ignoredBuffer.toString()); // Output: Hllo
    }
}

Summary: Which Method to Use?

Scenario	Recommended Method	Why?
General Purpose	`string.getBytes(StandardCharsets.UTF_8)` `new String(bytes, StandardCharsets.UTF_8)`	Modern, readable, safe, and efficient. Explicitly uses the standard UTF-8 encoding.
Legacy Code / Java 6	`string.getBytes(Charset.forName("UTF-8"))` `new String(bytes, Charset.forName("UTF-8"))`	Works on older Java versions while still being explicit about the encoding.
Need Error Handling	`CharsetDecoder` / `CharsetEncoder`	Provides fine-grained control over how to handle malformed or unmappable byte sequences.
Never use	`string.getBytes()` `new String(bytes)`	Platform-dependent. Your code will break when deployed to environments with different default encodings.

Java bytes与string如何高效转换？

The Golden Rule: Always Specify the Encoding

Converting a `String` to a `byte[]`

Method 1: The Recommended Way (Using `StandardCharsets`)

Method 2: The Classic Way (Using `Charset.forName`)

What Happens with the Default Encoding? (The Pitfall)

Converting a `byte[]` to a `String`

Method 1: The Recommended Way (Using `StandardCharsets`)

Method 2: The Classic Way (Using `Charset.forName`)

What Happens with the Default Encoding? (The Pitfall)

Advanced Handling: `CharsetDecoder` and `CharsetEncoder`

Error Handling Strategies:

Example: Using `CharsetDecoder`

Summary: Which Method to Use?

99ANYc3cd6

java 使用javascript

java byte integer

labeledpoint python

java nio selector

java websocket 聊天

androidkiller安装教程

java webservice xml

Adobe Premiere教程下载哪里找？安全吗？

Python DBUtils在CSDN如何高效使用？

Python UTC时间戳如何转换与使用？

java fastjson map

Java并发线程，如何高效避免线程安全问题？

Python字符串拼接，用StringBuilder更高效吗？

Eclipse Maven教程，如何快速配置与使用？

Photoshop CS5教程视频哪里下载？

Axis2教程，如何快速上手开发WebService？

Java bytes与string如何高效转换？

The Golden Rule: Always Specify the Encoding

Converting a String to a byte[]

Method 1: The Recommended Way (Using StandardCharsets)

Method 2: The Classic Way (Using Charset.forName)

What Happens with the Default Encoding? (The Pitfall)

Converting a byte[] to a String

Method 1: The Recommended Way (Using StandardCharsets)

Method 2: The Classic Way (Using Charset.forName)

What Happens with the Default Encoding? (The Pitfall)

Advanced Handling: CharsetDecoder and CharsetEncoder

Error Handling Strategies:

Example: Using CharsetDecoder

Summary: Which Method to Use?

相关推荐

androidkiller安装教程

Converting a `String` to a `byte[]`

Method 1: The Recommended Way (Using `StandardCharsets`)

Method 2: The Classic Way (Using `Charset.forName`)

Converting a `byte[]` to a `String`

Method 1: The Recommended Way (Using `StandardCharsets`)

Method 2: The Classic Way (Using `Charset.forName`)

Advanced Handling: `CharsetDecoder` and `CharsetEncoder`

Example: Using `CharsetDecoder`