Note: this answer is based on an older revision of the question.
In Java, I think something like this is what you want:
String[] tests = {
    "One Two 1 & 2",
    "Boeing 737 2, 4 & 6",
    "Lucky 7",
    "MI6 agent 007, 006",
    "2010-05 26, 27 & 28"
};
for (String test : tests) {
    // Split once, at the position where the trailing number list begins.
    String[] parts = test.split("(?=\\d+(, \\d+)*( & \\d+)?$)", 2);
    // parts[0] is the prefix; parts[1] holds the numbers and their separators.
    for (String number : parts[1].split("\\D+")) {
        System.out.println(parts[0] + number);
    }
}
This prints (as seen on ideone.com):
One Two 1
One Two 2
Boeing 737 2
Boeing 737 4
Boeing 737 6
Lucky 7
MI6 agent 007
MI6 agent 006
2010-05 26
2010-05 27
2010-05 28
Essentially, we use a lookahead to split at the position where the special number sequence begins, limiting the split to 2 parts. The special number sequence is then split on any run of non-digits, \D+.
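The two-step split can be wrapped in a small helper. This is only a sketch: the class name `LookaheadSplitDemo` and the method `expand` are made up for illustration, and the length check is an added guard for inputs with no trailing number list (the loop above would throw on `parts[1]` in that case).

```java
import java.util.ArrayList;
import java.util.List;

class LookaheadSplitDemo {
    // Zero-width lookahead: matches the position where the trailing number list starts.
    static final String TAIL_START = "(?=\\d+(, \\d+)*( & \\d+)?$)";

    // Hypothetical helper (not part of the answer's code): expands one input line.
    static List<String> expand(String input) {
        // Limit 2 means the pattern is applied at most once: prefix, then the rest.
        String[] parts = input.split(TAIL_START, 2);
        List<String> result = new ArrayList<>();
        if (parts.length < 2) {
            // Assumption: input without a trailing number list is returned unchanged.
            result.add(input);
            return result;
        }
        // Split the tail on runs of non-digits to isolate each number.
        for (String number : parts[1].split("\\D+")) {
            result.add(parts[0] + number);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("Boeing 737 2, 4 & 6"));
        // prints [Boeing 737 2, Boeing 737 4, Boeing 737 6]
    }
}
```

Because the lookahead consumes no characters, nothing is lost at the split point: the prefix keeps its trailing space and the tail starts at the first number.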
The pattern for the special number sequence, as shown in the lookahead, is:
\d+(, \d+)*( & \d+)?$
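To see what this pattern picks out on its own, it can be run through a `Matcher`. The sketch below is illustrative only; `TrailingSequenceDemo` and `trailingSequence` are hypothetical names, and returning `null` when nothing matches is an arbitrary choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingSequenceDemo {
    // \d+        the first number
    // (, \d+)*   zero or more ", number" continuations
    // ( & \d+)?  an optional final " & number"
    // $          anchored to the end of the input
    static final Pattern SEQUENCE = Pattern.compile("\\d+(, \\d+)*( & \\d+)?$");

    // Hypothetical helper: returns the trailing number list, or null if there is none.
    static String trailingSequence(String input) {
        Matcher m = SEQUENCE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(trailingSequence("MI6 agent 007, 006"));  // prints 007, 006
        System.out.println(trailingSequence("2010-05 26, 27 & 28")); // prints 26, 27 & 28
    }
}
```

The `$` anchor is what keeps earlier digits such as the `6` in `MI6` or the `2010-05` from being treated as the start of the sequence: they are not followed by a well-formed list that runs to the end of the string.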
API references
String[] split(String regex, int limit)
- The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.
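A quick way to see the effect of a positive limit is to split the same string with different values. This is a minimal sketch; the class `SplitLimitDemo`, the helper `splitOnComma`, and the sample input are invented for the demonstration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: thin wrapper so the limit is the only thing that varies.
    static String[] splitOnComma(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,d";
        // limit 2: pattern applied at most once, the rest stays in the last entry
        System.out.println(Arrays.toString(splitOnComma(input, 2)));  // prints [a, b,c,d]
        // limit 3: pattern applied at most twice
        System.out.println(Arrays.toString(splitOnComma(input, 3)));  // prints [a, b, c,d]
        // limit 10: more than enough, so every delimiter is used
        System.out.println(Arrays.toString(splitOnComma(input, 10))); // prints [a, b, c, d]
    }
}
```

With a limit of 2, the last entry keeps everything after the first split point, which is why the number list stays intact in the second array element.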
A single replaceAll solution
If, for whatever reason, you insist on doing this in one swooping replaceAll, you can write something like this:
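String[] tests = {
    "One Two 1 & 2",
    "Boeing 737 2, 4 & 6",
    "Lucky 7",
    "MI6 agent 007, 006",
    "2010-05 26, 27 & 28",
};
String sequence = "\\d+(?:, \\d+)*(?: & \\d+)?$";
for (String test : tests) {
    // First alternative: matches the original prefix, which is replaced by nothing.
    // Second alternative: for each number ($2) and its separator ($3), the
    // lookbehind re-captures the prefix from the start of the input as $1.
    System.out.println(
        test.replaceAll(
            "^.*?(?=sequence)|(?<=(?=(.*?)(?=sequence))^.*)(\\d+)(\\D+)?"
                .replace("sequence", sequence),
            "$1$2$3"
        )
    );
}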
The output (as seen on ideone.com):
One Two 1 & One Two 2
Boeing 737 2, Boeing 737 4 & Boeing 737 6
Lucky 7
MI6 agent 007, MI6 agent 006
2010-05 26, 2010-05 27 & 2010-05 28
This uses triple-nested assertions, including the infinite-length lookbehind feabug in Java. I wouldn't recommend using it, but there it is.
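For comparison, the same one-line output can be produced without any lookbehind by locating the number sequence with a `Matcher` and then prefixing each number. This is a different technique from the single-regex version, sketched under assumptions: `PrefixRepeatDemo` and `repeatPrefix` are hypothetical names, and input without a trailing number sequence is returned unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixRepeatDemo {
    static final Pattern TAIL = Pattern.compile("\\d+(?:, \\d+)*(?: & \\d+)?$");
    static final Pattern NUMBER = Pattern.compile("\\d+");

    // Hypothetical helper: repeats the prefix in front of every number in the tail.
    static String repeatPrefix(String input) {
        Matcher tail = TAIL.matcher(input);
        if (!tail.find()) {
            return input; // assumption: nothing to expand, so leave the input as-is
        }
        String prefix = input.substring(0, tail.start());
        // quoteReplacement guards against '$' or '\' in the prefix; $0 is the matched number.
        return NUMBER.matcher(tail.group())
                .replaceAll(Matcher.quoteReplacement(prefix) + "$0");
    }

    public static void main(String[] args) {
        System.out.println(repeatPrefix("Boeing 737 2, 4 & 6"));
        // prints Boeing 737 2, Boeing 737 4 & Boeing 737 6
    }
}
```

The separators (", " and " & ") survive because only the digits are replaced, so the result keeps the shape of the original list.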