How do you convert a Java String to a byte array? - 杰瑞科技汇

Converting a Java String to a byte[] (array of bytes) is a common task, but the result depends entirely on the character encoding you choose. Get it wrong and you introduce subtle, hard-to-find bugs, especially when the text contains non-ASCII characters.

Here’s a complete guide covering the best ways to do it, the pitfalls to avoid, and how to handle different encodings.

The Core Concept: Character Encoding

A String in Java is an abstract sequence of characters. A byte[] is a sequence of raw bytes. To convert one to the other, you need a character encoding, which is essentially a set of rules for mapping characters to byte values.

ASCII: A simple 7-bit encoding for English characters. It's a subset of UTF-8.
UTF-8 (Unicode Transformation Format - 8-bit): The modern standard. It can represent every character in the Unicode standard. It's backward-compatible with ASCII and is the recommended encoding for almost all applications.
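ISO-8859-1 (Latin-1): An 8-bit encoding that covers Western European languages. Each byte value 0x00-0xFF maps to the Unicode code point with the same number, so any byte sequence decodes without error. That is useful for a few specific cases (such as passing raw bytes through a String unchanged), but it cannot represent characters outside that range and is not recommended for general text.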
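
To make the difference between these encodings concrete, here is a minimal sketch that encodes the same one-character string three ways. The class name EncodingComparison and the helper encode are illustrative names, not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingComparison {
    // Encodes the text with the given charset and returns the raw bytes.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // 'é' (U+00E9), written as an escape so the source file encoding does not matter

        // UTF-8 needs two bytes for this character: 0xC3 0xA9
        System.out.println("UTF-8:      " + Arrays.toString(encode(text, StandardCharsets.UTF_8)));      // [-61, -87]
        // ISO-8859-1 needs one byte: 0xE9
        System.out.println("ISO-8859-1: " + Arrays.toString(encode(text, StandardCharsets.ISO_8859_1))); // [-23]
        // US-ASCII cannot represent it, so getBytes substitutes '?' (63)
        System.out.println("US-ASCII:   " + Arrays.toString(encode(text, StandardCharsets.US_ASCII)));   // [63]
    }
}
```

The values print as negative numbers because Java's byte type is signed; -61 is the same bit pattern as 0xC3.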

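Golden Rule: Always specify the character encoding explicitly. Never rely on the platform's default, as it can differ from system to system (e.g., Windows often uses Cp1252, while Linux and macOS often use UTF-8). Java 18 and later default to UTF-8 (JEP 400), but code that may run on an older JVM, or on one where the file.encoding property has been overridden, still cannot assume it.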

Method 1: The Modern & Recommended Way (`StandardCharsets`)

Since Java 7, the java.nio.charset.StandardCharsets class provides predefined Charset objects for common encodings. This is the cleanest, safest, and most readable approach.

For UTF-8 (Most Common)

import java.nio.charset.StandardCharsets;
public class StringToBytes {
    public static void main(String[] args) {
        String text = "Hello, 世界!"; // String with English and Chinese characters
        // The recommended way using StandardCharsets.UTF_8
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        // You can also use the charset's name as a String (less preferred)
        // byte[] utf8Bytes = text.getBytes("UTF-8");
        System.out.println("Original String: " + text);
        System.out.println("Byte array length: " + utf8Bytes.length);
        // Note: The length is 14, not 10, because each Chinese character takes 3 bytes in UTF-8.
        // H e l l o , (space) 世 界 !  -> 1+1+1+1+1+1+1+3+3+1 = 14 bytes
    }
}
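
The byte count in the comment above can be checked per character. The following is a small sketch, with hypothetical helper names utf8Length and totalUtf8Bytes, that walks the string by Unicode code point and reports how many bytes UTF-8 needs for each one.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteCounts {
    // Number of bytes UTF-8 needs for a single Unicode code point.
    static int utf8Length(int codePoint) {
        return new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8).length;
    }

    // Sums the per-code-point sizes; equals text.getBytes(UTF_8).length for well-formed text.
    static int totalUtf8Bytes(String text) {
        int total = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            total += utf8Length(cp);
            i += Character.charCount(cp); // advance by 1 or 2 chars (surrogate pairs)
        }
        return total;
    }

    public static void main(String[] args) {
        String text = "Hello, \u4E16\u754C!"; // "Hello, 世界!"
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            System.out.printf("U+%04X -> %d byte(s)%n", cp, utf8Length(cp));
            i += Character.charCount(cp);
        }
        System.out.println("chars: " + text.length() + ", UTF-8 bytes: " + totalUtf8Bytes(text)); // chars: 10, UTF-8 bytes: 14
    }
}
```

Iterating by code point rather than by char matters for characters outside the Basic Multilingual Plane (many emoji, for example), which occupy two chars in a String and four bytes in UTF-8.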

For Other Encodings

You can easily switch to other standard encodings like ISO_8859_1 or US_ASCII.

import java.nio.charset.StandardCharsets;
public class StringToBytesOtherEncodings {
    public static void main(String[] args) {
        String text = "Hello, 世界!";
        // Using ISO-8859-1 (Latin-1)
        // This will fail to represent the Chinese characters correctly,
        // replacing them with the '?' character.
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 Bytes Length: " + latin1Bytes.length); // Will be 10
        // Using US-ASCII
        // This will also fail for non-ASCII characters.
        byte[] asciiBytes = text.getBytes(StandardCharsets.US_ASCII);
        System.out.println("US-ASCII Bytes Length: " + asciiBytes.length); // Will be 10
    }
}
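
Because getBytes() silently substitutes '?' for characters the charset cannot represent, data loss can go unnoticed. If you would rather fail loudly, one option is a CharsetEncoder configured to report errors. The sketch below assumes that behavior is what you want; encodeStrict and isLossless are hypothetical helper names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictEncoding {
    // Encodes text, throwing instead of substituting '?' when a character cannot be represented.
    static byte[] encodeStrict(String text, Charset charset) throws CharacterCodingException {
        ByteBuffer buffer = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes); // copy only the bytes that were actually produced
        return bytes;
    }

    // Convenience check: true if the whole text survives encoding in the given charset.
    static boolean isLossless(String text, Charset charset) {
        try {
            encodeStrict(text, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Hello, \u4E16\u754C!"; // "Hello, 世界!"
        System.out.println("UTF-8 lossless?      " + isLossless(text, StandardCharsets.UTF_8));      // true
        System.out.println("ISO-8859-1 lossless? " + isLossless(text, StandardCharsets.ISO_8859_1)); // false
    }
}
```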

Method 2: The Classic Way (`String.getBytes(String charsetName)`)

On Java versions before 7, or when you need an encoding that StandardCharsets does not define, you can pass the encoding name as a String to the getBytes() method.

Warning: This method throws an UnsupportedEncodingException if the specified charset name is not supported by the JVM. While this is rare for standard names like "UTF-8", it's a checked exception you must handle.

import java.io.UnsupportedEncodingException;
public class StringToBytesClassic {
    public static void main(String[] args) {
        String text = "Hello, 世界!";
        try {
            // Specify the encoding by its name
            byte[] utf8Bytes = text.getBytes("UTF-8");
            System.out.println("Original String: " + text);
            System.out.println("Byte array (from classic method): " + java.util.Arrays.toString(utf8Bytes));
        } catch (UnsupportedEncodingException e) {
            // This block will only run if the JVM doesn't support "UTF-8",
            // which is extremely unlikely.
            System.err.println("UTF-8 encoding is not supported on this JVM.");
            e.printStackTrace();
        }
    }
}
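
For encodings that StandardCharsets does not cover, an alternative to the String-name overload is to resolve a Charset object once with Charset.forName() and pass it to getBytes(Charset). Charset.forName() throws unchecked exceptions rather than a checked one. The sketch below wraps that lookup in a hypothetical helper, lookup; "GBK" is only an example name, and whether it is available depends on the JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

class CharsetLookup {
    // Resolves a charset by name; empty if the name is syntactically illegal or not supported on this JVM.
    static Optional<Charset> lookup(String name) {
        try {
            return Optional.of(Charset.forName(name));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String text = "Hello, \u4E16\u754C!"; // "Hello, 世界!"
        Optional<Charset> gbk = lookup("GBK"); // example name only; not guaranteed to exist
        if (gbk.isPresent()) {
            byte[] bytes = text.getBytes(gbk.get()); // getBytes(Charset) declares no checked exception
            System.out.println(gbk.get() + ": " + bytes.length + " bytes");
        } else {
            System.out.println("GBK is not available on this JVM");
        }
    }
}
```

Resolving the Charset once and reusing it also avoids repeating the name lookup on every call.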

Method 3: Getting the Default Charset (Usually a Bad Idea)

You can call getBytes() with no arguments. This uses the JVM's platform-specific default charset.

public class StringToBytesDefault {
    public static void main(String[] args) {
        String text = "Hello, 世界!";
        // Uses the platform's default charset. AVOID THIS for most applications.
        byte[] defaultBytes = text.getBytes();
        System.out.println("Original String: " + text);
        System.out.println("Byte array length (using default charset): " + defaultBytes.length);
        System.out.println("Default charset name: " + java.nio.charset.Charset.defaultCharset());
    }
}

Why is this bad?

Non-portable: Code that works on your Linux machine (default: UTF-8) might break on a Windows machine (default: often Cp1252) or an older IBM mainframe.
Unpredictable: You don't know what encoding you're getting, which can lead to data corruption when the bytes are read back later.

Complete Example: String to Bytes and Back to String

This example shows the full cycle and highlights why encoding is so important.

import java.nio.charset.StandardCharsets;
public class FullConversionExample {
    public static void main(String[] args) {
        String originalText = "这是一个测试。"; // "This is a test." in Chinese
        // 1. Convert String to bytes using UTF-8
        byte[] utf8Bytes = originalText.getBytes(StandardCharsets.UTF_8);
        System.out.println("1. Original String: " + originalText);
        System.out.println("   -> Converted to " + utf8Bytes.length + " UTF-8 bytes.");
        // 2. Convert bytes back to a String using the SAME encoding
        String reconstructedFromUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("\n2. Reconstructed from UTF-8 bytes: " + reconstructedFromUtf8);
        System.out.println("   -> Are they equal? " + originalText.equals(reconstructedFromUtf8)); // true
        // 3. Demonstrate what happens with the WRONG encoding
        //    Let's pretend we received these bytes and incorrectly assumed they were Latin-1
        String incorrectlyReconstructed = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println("\n3. INCORRECTLY reconstructed as Latin-1: " + incorrectlyReconstructed);
        System.out.println("   -> Are they equal? " + originalText.equals(incorrectlyReconstructed)); // false
    }
}
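
The constructor new String(bytes, charset) never throws on bad input; it replaces undecodable bytes with the replacement character U+FFFD. When you want to reject bytes that are not valid in the expected encoding, a CharsetDecoder set to report errors is one way to do it. This is a sketch with hypothetical helper names decodeStrict and isValid.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecoding {
    // Decodes bytes, throwing instead of inserting U+FFFD when the input is not valid for the charset.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Convenience check: true if the bytes are well-formed in the given charset.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            decodeStrict(bytes, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "café" encoded as Latin-1 ends in the single byte 0xE9, which is not valid UTF-8 on its own.
        byte[] latin1Bytes = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        String lenient = new String(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println("Lenient decode contains U+FFFD? " + lenient.contains("\uFFFD")); // true
        System.out.println("Valid UTF-8? " + isValid(latin1Bytes, StandardCharsets.UTF_8));  // false
    }
}
```

Note that this check only works in one direction. Every byte sequence is valid ISO-8859-1, so decoding UTF-8 bytes as Latin-1, as in step 3 above, cannot be detected this way and simply yields garbled text.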

Summary: Which Method Should I Use?

Method	When to Use	Pros	Cons
`getBytes(StandardCharsets.UTF_8)`	Almost always. This is the default, modern, and safest choice.	Clean, no exceptions, readable, portable.	Requires Java 7 or newer.
`getBytes("UTF-8")`	When you need to support a custom encoding or are on Java 6.	Works on older Java versions.	Throws `UnsupportedEncodingException` (must be handled).
`getBytes()` (no argument)	Almost never. Only for very specific, platform-dependent utilities.	Simplest syntax.	Non-portable, unreliable, behavior depends on the JVM.
`new String(bytes, charset)`	To convert a `byte[]` back to a `String`. Must use the same encoding that produced the bytes.	Reconstructs the string correctly if the encoding is known.	Does not throw if the wrong encoding is used; it silently produces garbled text.
