How to Convert UTF-8 to GB2312 in Java - 杰瑞科技汇

Converting text from UTF-8 to GB2312 is a common task in Java, especially when dealing with legacy systems or with data created on a Chinese-language version of Windows.

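The core concept is that a Java String is neither "UTF-8" nor "GB2312": in memory it is a sequence of UTF-16 code units. An encoding such as UTF-8 or GB2312 only comes into play when you serialize the string into a sequence of bytes (e.g., to write to a file, send over a network, or store in a database) or read such bytes back into a string. "Converting UTF-8 to GB2312" therefore means decoding UTF-8 bytes into a String and then encoding that String as GB2312 bytes.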
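To make this concrete, the sketch below serializes one String with two different charsets and decodes each result back. The class name `EncodingDemo` and its `encode` helper are illustrative names, and the code assumes the running JVM provides the GB2312 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Assumption: this JVM provides GB2312; Charset.forName throws otherwise.
    static final Charset GB2312 = Charset.forName("GB2312");

    // Serializes the same in-memory String with whichever charset is requested.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so that this file
        // does not depend on the compiler's -encoding setting.
        String text = "\u4F60\u597D";
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] gb = encode(text, GB2312);
        System.out.println("UTF-8 byte count:  " + utf8.length);
        System.out.println("GB2312 byte count: " + gb.length);
        // Decoding each array with its own charset recovers an equal String.
        String fromUtf8 = new String(utf8, StandardCharsets.UTF_8);
        String fromGb = new String(gb, GB2312);
        System.out.println("Same text after decoding: " + fromUtf8.equals(fromGb));
    }
}
```

For this two-character text the UTF-8 form is 6 bytes and the GB2312 form is 4 bytes, yet both decode to the same String. The bytes differ; the text does not.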

Here are the most common and effective ways to perform this conversion, from the most modern to the classic approach.

Method 1: Using `StandardCharsets` (Java 7+) - Recommended

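This is the most concise and recommended way on modern Java. StandardCharsets is a final class of constants (not an enum) for the six charsets that every JVM must support, so StandardCharsets.UTF_8 can never fail to resolve. GB2312 is not one of those six: there is no StandardCharsets.GB2312, so the GB2312 charset still has to be looked up with Charset.forName("GB2312").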

The process is:

If the input arrives as UTF-8 bytes, decode it into a String with new String(utf8Bytes, StandardCharsets.UTF_8).
Encode that String into GB2312 bytes with String.getBytes(Charset.forName("GB2312")).

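Important Note: String.getBytes() silently replaces any character that GB2312 cannot represent with the encoder's replacement byte, which is ? (0x3F). The original character cannot be recovered afterwards, so this is data loss without an exception; use a CharsetEncoder with CodingErrorAction.REPORT (Method 3) if you need to detect it. Also, never pass UTF-8 bytes to new String(bytes, gb2312Charset): that reinterprets the bytes as GB2312 and yields garbled text instead of converting them.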

import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
public class Utf8ToGb2312Converter {
    public static void main(String[] args) {
        // This string contains characters that are not in GB2312, like '€' and 'ñ'.
        String original = "你好，世界！Hello World! €ñ";
        System.out.println("Original String: " + original);
        // GB2312 is not in StandardCharsets, so it is looked up by name.
        Charset gb2312 = Charset.forName("GB2312");
        // Simulate input that arrives as UTF-8 bytes (from a file, a socket, etc.)
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 Bytes: " + bytesToHex(utf8Bytes));
        // --- Conversion Process ---
        // 1. Decode the UTF-8 bytes into a String
        String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
        // 2. Encode the String as GB2312 bytes
        byte[] gb2312Bytes = decoded.getBytes(gb2312);
        System.out.println("GB2312 Bytes: " + bytesToHex(gb2312Bytes));
        // 3. Decode the GB2312 bytes again to see which characters survived
        System.out.println("Decoded from GB2312: " + new String(gb2312Bytes, gb2312));
    }
    // Helper method to print byte arrays in a readable hex format
    private static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        sb.append("[");
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        sb.append("]");
        return sb.toString();
    }
}

Output of the program (on a console that can display Chinese text):

Original String: 你好，世界！Hello World! €ñ
UTF-8 Bytes: [E4 BD A0 E5 A5 BD EF BC 8C E4 B8 96 E7 95 8C EF BC 81 48 65 6C 6C 6F 20 57 6F 72 6C 64 21 20 E2 82 AC C3 B1 ]
GB2312 Bytes: [C4 E3 BA C3 A3 AC CA C0 BD E7 A3 A1 48 65 6C 6C 6F 20 57 6F 72 6C 64 21 20 3F 3F ]
Decoded from GB2312: 你好，世界！Hello World! ??

Notice that the Chinese characters shrink from three bytes each in UTF-8 to two bytes each in GB2312, and that the Euro symbol (€) and ñ, which GB2312 cannot represent, were each replaced with ? (byte 3F).
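Because the replacement happens without any warning, it can be useful to check a string before converting it. The sketch below lists the characters that GB2312 cannot encode, using CharsetEncoder.canEncode(). The class name `Gb2312Check` and the method `unmappable` are hypothetical names chosen for this example; it assumes Java 8 or later (for String.codePoints()) and a JVM that provides GB2312.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

class Gb2312Check {
    // Returns every character of text (one String per code point) that GB2312 cannot represent.
    static List<String> unmappable(String text) {
        // Assumption: this JVM provides GB2312; Charset.forName throws otherwise.
        CharsetEncoder encoder = Charset.forName("GB2312").newEncoder();
        List<String> result = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            // Work per code point so that surrogate pairs are tested as a whole.
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                result.add(ch);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Two Chinese characters, a space, the Euro sign (U+20AC) and n-tilde (U+00F1),
        // written as Unicode escapes to keep the source file encoding-neutral.
        String text = "\u4F60\u597D \u20AC\u00F1";
        List<String> bad = unmappable(text);
        System.out.println("Characters GB2312 cannot encode: " + bad.size());
        for (String ch : bad) {
            System.out.println(String.format("U+%04X", ch.codePointAt(0)));
        }
    }
}
```

For the sample text this reports U+20AC and U+00F1, the same two characters that turned into ? in the conversion. An empty list means the conversion is lossless.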

Method 2: Using `Charset.forName()` (Pre-Java 7)

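This method resolves both charsets by name, which is how code written before StandardCharsets was introduced in Java 7 does it. It is slightly more verbose and functionally identical to Method 1. The String.getBytes(Charset) and new String(byte[], Charset) overloads it relies on require Java 6 or later.

Note: Charset.forName() throws an UnsupportedCharsetException if the JVM does not provide the requested charset. Every JVM must support UTF-8, but GB2312 is optional: most full JDK installations include it, while a stripped-down runtime may not.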
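If the exception is inconvenient, availability can be tested up front with Charset.isSupported(). The following is a minimal sketch; the class `CharsetLookup` and its `find` method are hypothetical helpers, and Optional requires Java 8 or later.

```java
import java.nio.charset.Charset;
import java.util.Optional;

class CharsetLookup {
    // Returns the named charset if this JVM provides it, or an empty Optional otherwise.
    // Note: a syntactically illegal name still raises IllegalCharsetNameException.
    static Optional<Charset> find(String name) {
        return Charset.isSupported(name)
                ? Optional.of(Charset.forName(name))
                : Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Charset> gb2312 = find("GB2312");
        if (gb2312.isPresent()) {
            // name() is the canonical name; aliases() lists other accepted spellings.
            System.out.println("Available as: " + gb2312.get().name());
            System.out.println("Aliases: " + gb2312.get().aliases());
        } else {
            System.out.println("GB2312 is not available on this JVM.");
        }
    }
}
```

Checking first lets the caller report a clear configuration error at startup instead of failing on the first conversion.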

import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;
public class Utf8ToGb2312ConverterLegacy {
    public static void main(String[] args) {
        String original = "你好，世界！Hello World!";
        System.out.println("Original String: " + original);
        try {
            // 1. Look up both charsets by name
            Charset utf8Charset = Charset.forName("UTF-8");
            Charset gb2312Charset = Charset.forName("GB2312");
            // 2. Simulate input that arrives as UTF-8 bytes
            byte[] utf8Bytes = original.getBytes(utf8Charset);
            // 3. Decode the UTF-8 bytes into a String, then encode it as GB2312 bytes
            String decoded = new String(utf8Bytes, utf8Charset);
            byte[] gb2312Bytes = decoded.getBytes(gb2312Charset);
            System.out.println("UTF-8 length: " + utf8Bytes.length + " bytes");
            System.out.println("GB2312 length: " + gb2312Bytes.length + " bytes");
            // 4. Decode the GB2312 bytes to confirm the text is unchanged
            System.out.println("Decoded from GB2312: " + new String(gb2312Bytes, gb2312Charset));
        } catch (UnsupportedCharsetException e) {
            System.err.println("Error: charset " + e.getCharsetName() + " is not supported by this JVM.");
            e.printStackTrace();
        }
    }
}

Method 3: Using `CharsetEncoder` and `CharsetDecoder` (Advanced)

This is the most powerful and flexible method, giving you fine-grained control over the conversion process, especially for handling unsupported characters.

You can configure the encoder/decoder to:

Report errors: Throw a CharacterCodingException when an unmappable character is found.
Replace characters: Automatically replace unmappable characters (this is what String.getBytes() and the String constructors do by default).
Ignore characters: Silently drop unmappable characters from the output.
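The three strategies can be compared side by side on one input. The sketch below encodes a three-character string whose middle character (the Euro sign) is not in GB2312. The class `ErrorActionDemo` and its `describe` method are illustrative names, and the code assumes the JVM provides GB2312.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

class ErrorActionDemo {
    // Encodes text as GB2312 with the given error action and returns the bytes
    // as hex, or a short failure message if the encoder reported an error.
    static String describe(String text, CodingErrorAction action) {
        CharsetEncoder encoder = Charset.forName("GB2312").newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            StringBuilder hex = new StringBuilder();
            while (bytes.hasRemaining()) {
                if (hex.length() > 0) {
                    hex.append(' ');
                }
                hex.append(String.format("%02X", bytes.get()));
            }
            return hex.toString();
        } catch (CharacterCodingException e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String text = "A\u20ACB"; // 'A', Euro sign, 'B'
        System.out.println("REPLACE: " + describe(text, CodingErrorAction.REPLACE)); // 41 3F 42
        System.out.println("IGNORE:  " + describe(text, CodingErrorAction.IGNORE));  // 41 42
        System.out.println("REPORT:  " + describe(text, CodingErrorAction.REPORT));  // failed: UnmappableCharacterException
    }
}
```

REPLACE keeps the string length but loses the character, IGNORE also changes the length, and REPORT is the only option that lets the caller know something went wrong.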

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
public class Utf8ToGb2312Advanced {
    public static void main(String[] args) {
        String original = "你好，世界！Hello World! €ñ";
        System.out.println("Original String: " + original);
        Charset gb2312Charset = Charset.forName("GB2312");
        // Create an encoder that converts characters to GB2312 bytes
        CharsetEncoder encoder = gb2312Charset.newEncoder();
        // Configure the error handling strategy
        encoder.onMalformedInput(CodingErrorAction.REPORT); // Fail on malformed input (e.g., unpaired surrogates)
        encoder.onUnmappableCharacter(CodingErrorAction.REPLACE); // Replace unmappable chars with '?'
        // Create a decoder that converts GB2312 bytes back to characters
        CharsetDecoder decoder = gb2312Charset.newDecoder();
        try {
            // 1. Wrap the source string in a CharBuffer
            CharBuffer charBuffer = CharBuffer.wrap(original);
            // 2. Encode the CharBuffer to a ByteBuffer of GB2312 bytes (this performs the conversion)
            ByteBuffer byteBuffer = encoder.encode(charBuffer);
            System.out.println("GB2312 byte count: " + byteBuffer.remaining());
            // 3. Decode the ByteBuffer back to a CharBuffer (to see the result)
            CharBuffer resultCharBuffer = decoder.decode(byteBuffer);
            // 4. Convert the CharBuffer back to a String
            String roundTripped = resultCharBuffer.toString();
            System.out.println("Decoded from GB2312: " + roundTripped);
        } catch (CharacterCodingException e) {
            // Reached for malformed input here, and also for unmappable
            // characters if the action above is changed to REPORT.
            System.err.println("Character coding error during conversion.");
            e.printStackTrace();
        }
    }
}
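A configured encoder is not limited to in-memory buffers: OutputStreamWriter accepts a CharsetEncoder, and InputStreamReader accepts a CharsetDecoder, so the same error handling can be applied while streaming. The sketch below is one possible way to transcode a UTF-8 stream to GB2312 that fails instead of substituting characters. `StreamTranscoder` and both `transcode` methods are hypothetical names, and the byte-array overload exists only to keep the example easy to try.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StreamTranscoder {
    // Reads UTF-8 bytes and writes the same text as GB2312 bytes.
    // Malformed input or unmappable characters surface as an IOException
    // (a CharacterCodingException) rather than being replaced.
    static void transcode(InputStream utf8In, OutputStream gb2312Out) throws IOException {
        CharsetEncoder encoder = Charset.forName("GB2312").newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        // A decoder obtained from newDecoder() reports errors by default.
        Reader reader = new InputStreamReader(utf8In, StandardCharsets.UTF_8.newDecoder());
        Writer writer = new OutputStreamWriter(gb2312Out, encoder);
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        writer.flush();
    }

    // Convenience overload for in-memory data; wraps the checked exception.
    static byte[] transcode(byte[] utf8Bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            transcode(new ByteArrayInputStream(utf8Bytes), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4F60\u597D".getBytes(StandardCharsets.UTF_8);
        System.out.println("GB2312 byte count: " + transcode(utf8).length);
    }
}
```

For real files the two streams would typically come from Files.newInputStream() and Files.newOutputStream(); the copy loop stays the same.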

Summary: Which Method to Use?

Method	Pros	Cons	Best For
`StandardCharsets`	- Modern, concise, type-safe. - The UTF_8 constant is guaranteed to exist on Java 7+. - Clean and readable.	- Requires Java 7 or later. - Has no GB2312 constant, so GB2312 still comes from `Charset.forName()`. - Unmappable characters are replaced silently.	Most use cases in modern Java applications. This is the recommended default.
`Charset.forName()`	- Works on Java 6 and later. - Standard API.	- Can throw `UnsupportedCharsetException`. - Slightly more verbose.	Legacy codebases or projects that must run on pre-Java 7 environments.
`CharsetEncoder/Decoder`	- Most powerful and flexible. - Granular control over error handling (REPORT, REPLACE, IGNORE).	- More complex and verbose. - Overkill for simple conversions.	Advanced scenarios where you need to log errors on unmappable characters or implement custom conversion logic.
