How do you use the String.split() method in Java?

The String.split() method in Java breaks a string into an array of substrings around matches of a delimiter.

Here’s a complete guide covering its usage, common pitfalls, and best practices.


Basic Syntax

The split() method is part of the java.lang.String class. It has two common overloads:

// 1. Splits the string around matches of the given regular expression.
public String[] split(String regex)
// 2. Splits the string around matches of the given regular expression,
// with a limit on the number of substrings to return.
public String[] split(String regex, int limit)

Key Point: The delimiter is a regular expression, not just a simple character. This gives it great power but also leads to common mistakes.
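
Because the delimiter is a regular expression, one call can split on several different separators or on runs of whitespace. The following is a minimal sketch of that idea; the class and method names (RegexDelimiterDemo, splitOnWhitespace, splitOnCommaOrSemicolon) are made up for illustration.

```java
import java.util.Arrays;

public class RegexDelimiterDemo {
    // Splits on one or more whitespace characters (spaces, tabs, newlines).
    // trim() removes leading whitespace that would otherwise yield an empty first entry.
    public static String[] splitOnWhitespace(String s) {
        return s.trim().split("\\s+");
    }

    // Splits on either a comma or a semicolon, using a regex character class.
    public static String[] splitOnCommaOrSemicolon(String s) {
        return s.split("[,;]");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnWhitespace("Java   is\tfun")));
        // prints [Java, is, fun]
        System.out.println(Arrays.toString(splitOnCommaOrSemicolon("a,b;c")));
        // prints [a, b, c]
    }
}
```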


Simple Examples

Let's start with straightforward cases where the delimiter is a simple character.

Example 1: Splitting by a Space

String text = "Java is fun to learn";
String[] words = text.split(" ");
for (String word : words) {
    System.out.println(word);
}

Output:

Java
is
fun
to
learn

Example 2: Splitting by a Comma

String csvData = "apple,banana,cherry,date";
String[] fruits = csvData.split(",");
for (String fruit : fruits) {
    System.out.println(fruit);
}

Output:

apple
banana
cherry
date

The limit Parameter

The limit parameter of split(String regex, int limit) controls how many times the pattern is applied, and therefore the length of the resulting array.

  • limit > 0: The array will have at most limit entries. The pattern is applied at most limit - 1 times, and the last entry contains the rest of the string, delimiters included.
  • limit < 0: The array can have any number of entries, and trailing empty strings are kept.
  • limit == 0: The array can have any number of entries, but trailing empty strings are discarded. This is the same as calling split(regex) without a limit.

Example 3: Using the limit Parameter

import java.util.Arrays;

String text = "a,b,c,d,e";
// Case 1: limit > 0 (e.g., 3)
// The string is split into a maximum of 3 parts.
String[] parts1 = text.split(",", 3);
System.out.println("Limit 3: " + Arrays.toString(parts1));
// Output: Limit 3: [a, b, c,d,e]
// Case 2: limit < 0 (e.g., -1)
// No limit, and trailing empty strings (if any) are kept.
String[] parts2 = text.split(",", -1);
System.out.println("Limit -1: " + Arrays.toString(parts2));
// Output: Limit -1: [a, b, c, d, e]
// Case 3: limit == 0
// No limit, and trailing empty strings are removed.
String textWithTrailingEmpty = "a,b,,,"; // Note the trailing commas
String[] parts3 = textWithTrailingEmpty.split(",", 0);
System.out.println("Limit 0: " + Arrays.toString(parts3));
// Output: Limit 0: [a, b]
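
A positive limit is handy when only the first delimiter should count. The sketch below splits a "key=value" line with a limit of 2, so any further "=" characters stay inside the value. The class name KeyValueSplitDemo, the helper splitKeyValue, and the sample input are hypothetical.

```java
public class KeyValueSplitDemo {
    // Splits on the first '=' only: at most 2 entries are returned,
    // and the second entry keeps any remaining '=' characters.
    public static String[] splitKeyValue(String line) {
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        String[] kv = splitKeyValue("url=https://example.com/?a=1&b=2");
        System.out.println(kv[0]); // prints url
        System.out.println(kv[1]); // prints https://example.com/?a=1&b=2
    }
}
```

If the line contains no "=" at all, the returned array has a single entry, so real code should check the length before reading the second entry.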

The Regular Expression Gotcha

This is the most important concept to understand with split(). Special characters in regular expressions must be escaped with a backslash (\).

Special Character   Meaning in Regex    How to Escape in Java String
.                   Any character       "\\."
|                   OR                  "\\|"
*                   Zero or more        "\\*"
+                   One or more         "\\+"
?                   Zero or one         "\\?"
^                   Start of string     "\\^"
$                   End of string       "\\$"
( )                 Grouping            "\\(" and "\\)"
[ ]                 Character class     "\\[" and "\\]"
{ }                 Quantifier          "\\{" and "\\}"
\                   Escape character    "\\\\"
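
When the delimiter comes from a variable or user input, escaping it by hand is error-prone. One option is java.util.regex.Pattern.quote(), which turns any string into a regex that matches it literally. This is a small sketch; the class name QuoteDelimiterDemo and the helper splitLiteral are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDelimiterDemo {
    // Splits on the delimiter taken literally, whatever regex
    // metacharacters it happens to contain.
    public static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("1.2.3", ".")));
        // prints [1, 2, 3]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|")));
        // prints [a, b, c]
        System.out.println(Arrays.toString(splitLiteral("x$$y$$z", "$$")));
        // prints [x, y, z]
    }
}
```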

Example 4: Splitting by a Period (.)

A common mistake is to try separating a file name from its extension with an unescaped period.

WRONG (doesn't work as expected):

String fileName = "document.txt";
// The regex "." means "match any character"
String[] parts = fileName.split(".");
// Every character is treated as a delimiter, so every piece is an empty string.
// Trailing empty strings are discarded, so parts is an empty array (length 0).

CORRECT (must escape the period):

String fileName = "document.txt";
// The regex "\\." means "match a literal period character"
String[] parts = fileName.split("\\.");
for (String part : parts) {
    System.out.println(part);
}

Output:

document
txt

Example 5: Splitting by a Pipe (|)

WRONG:

String text = "apple|banana|cherry";
// An unescaped "|" is the regex OR operator with two empty alternatives,
// so it matches the empty string between every pair of characters.
String[] parts = text.split("|"); // Splits into single characters: [a, p, p, l, e, |, b, ...]

CORRECT (must escape the pipe):

String text = "apple|banana|cherry";
String[] parts = text.split("\\|");
for (String part : parts) {
    System.out.println(part);
}

Output:

apple
banana
cherry

Handling Consecutive Delimiters

split() does not merge consecutive delimiters. Each delimiter ends a field, so "a,,b" is split into ["a", "", "b"], with an empty string between the two commas. Only trailing empty strings are discarded by default.

If you want to keep the trailing empty strings as well, pass a negative limit. A limit of 0 behaves exactly like the one-argument split().

String data = "a,,b,c,,,";
// Default behavior: empty strings in the middle are kept, trailing ones are dropped.
String[] parts1 = data.split(",");
System.out.println("Default split: " + Arrays.toString(parts1));
// Output: Default split: [a, , b, c]
// To keep the trailing empty strings too, use a negative limit.
String[] parts2 = data.split(",", -1);
System.out.println("Split with limit=-1: " + Arrays.toString(parts2));
// Output: Split with limit=-1: [a, , b, c, , , ]
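
If the goal is the opposite, treating a run of delimiters as a single one, the regex itself can say so with a "+" quantifier. The sketch below contrasts a plain "," with ",+"; the class name EmptyFieldDemo and its helper methods are hypothetical.

```java
import java.util.Arrays;

public class EmptyFieldDemo {
    // Plain "," delimiter: empty fields between consecutive commas are kept.
    public static String[] splitKeepingEmpty(String s) {
        return s.split(",");
    }

    // ",+" matches a whole run of commas as one delimiter,
    // so no empty fields appear between values.
    public static String[] splitCollapsing(String s) {
        return s.split(",+");
    }

    public static void main(String[] args) {
        String data = "a,,b,c,,,";
        System.out.println(Arrays.toString(splitKeepingEmpty(data)));
        // prints [a, , b, c]
        System.out.println(Arrays.toString(splitCollapsing(data)));
        // prints [a, b, c]
    }
}
```

Note that a string starting with a delimiter still produces an empty first entry with either pattern, so leading delimiters may need to be stripped separately.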

Alternative: StringTokenizer

For very simple, non-regex-based splitting, you can use StringTokenizer. Its own documentation describes it as a legacy class whose use is discouraged in new code, but it can be slightly faster when you don't need the power of regular expressions.

Key differences:

  • StringTokenizer is not a regex-based splitter.
  • It does not return empty tokens for consecutive delimiters.
  • It has methods like hasMoreTokens() and nextToken().

import java.util.StringTokenizer;
String text = "Java is fun";
StringTokenizer tokenizer = new StringTokenizer(text);
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken());
}

Output:

Java
is
fun
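
The difference in how empty tokens are handled is easiest to see side by side. In this sketch, StringTokenizer is given an explicit set of delimiter characters and compared with split() on the same input. The class name TokenizerDemo and the helper tokenize are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Collects the tokens separated by any single character in `delimiters`.
    // The delimiters are plain characters here, not a regular expression.
    public static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // StringTokenizer skips the empty token between the two commas.
        System.out.println(tokenize("a,,b;c", ",;"));
        // prints [a, b, c]

        // split() keeps it as an empty string.
        System.out.println(Arrays.toString("a,,b;c".split("[,;]")));
        // prints [a, , b, c]
    }
}
```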

Summary and Best Practices

String.split()
  • When to use: The default, go-to choice. Use it for almost all cases.
  • Pros: Very powerful (regex), concise, flexible with the limit parameter.
  • Cons: Can be slow for very large strings or complex regexes. The regex escaping can be tricky.

StringTokenizer
  • When to use: When you have a very simple delimiter (like a space or comma) and are in a performance-critical loop where regex overhead is a problem.
  • Pros: Fast, simple API.
  • Cons: Legacy API, less flexible (no limit like split), does not handle empty tokens or complex delimiters well.

Scanner
  • When to use: When you are reading from a stream (like a file or System.in) and want to parse tokens sequentially.
  • Pros: Excellent for stream-based parsing.
  • Cons: Overkill for a simple string split.
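
For comparison, here is a rough sketch of the Scanner approach, reading comma-separated integers one token at a time. It scans a String for simplicity, but the same calls work on a file or System.in. The class name ScannerDemo and the helper readInts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDemo {
    // Reads comma-separated integers sequentially and stops
    // at the first token that is not an integer.
    public static List<Integer> readInts(String input) {
        List<Integer> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(","); // the delimiter is a regex, as with split()
            while (scanner.hasNextInt()) {
                values.add(scanner.nextInt());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readInts("10,20,30"));
        // prints [10, 20, 30]
    }
}
```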

Final Recommendation: Use String.split() for 99% of your string splitting needs in Java. Just be mindful of the regular expression rules and remember to escape special characters like ".", "|", and "*".
