This somewhat heavy implementation using Matcher.find instead of split will also work, although by the time you have to code a for loop for such a trivial task you might as well drop the regular expressions altogether and use substrings (for similar coding complexity minus the CPU cycles):
import java.util.*;
import java.util.regex.*;

public class StringSplit {
    public static void main(String[] args) {
        ArrayList<String> result = new ArrayList<String>();
        // find(int) resets the matcher and searches from the given index, so restarting
        // one character past the previous match's start yields overlapping pairs.
        for (Matcher m = Pattern.compile("..").matcher("12345");
             m.find(result.isEmpty() ? 0 : m.start() + 1);
             result.add(m.group()));
        System.out.println(result.toString()); // prints "[12, 23, 34, 45]"
    }
}
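For comparison, the substring route mentioned above could look roughly like the following. This is only a sketch: the class name SubstringPairs and the helper pairs are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

class SubstringPairs {
    // Hypothetical helper: returns every overlapping two-character window of s.
    static List<String> pairs(String s) {
        List<String> result = new ArrayList<>();
        // Stop while a full two-character window still fits in the string.
        for (int i = 0; i + 2 <= s.length(); i++) {
            result.add(s.substring(i, i + 2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs("12345")); // prints "[12, 23, 34, 45]"
    }
}
```

The loop is about as long as the Matcher version, but no pattern is compiled and no matcher state has to be reset on each step.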
EDIT1
match(): the reason why nobody so far has been able to concoct a regular expression like your BONUS_REGEX lies within Matcher, which resumes looking for the next group where the previous group ended (i.e. no overlap), as opposed to one character after where the previous group started, unless you explicitly respecify the start search position (as above). A good candidate for BONUS_REGEX would have been "(.\\G.|^..)" but, unfortunately, the \G-anchor-in-the-middle trick doesn't work with Java's Matcher (but works just fine in Perl):
perl -e 'while ("12345"=~/(^..|.\G.)/g) { print "$1\n" }'
12
23
34
45
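To see the Java side of this for yourself, something like the sketch below can be run. The class ResumeDemo and the helper findAll are hypothetical names used only for this illustration. The first call shows the plain no-overlap resume behaviour; the second runs the \G candidate, whose output is deliberately left unannotated so you can compare it with the Perl result above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResumeDemo {
    // Hypothetical helper: collects group() for every successful find() on input.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Each find() resumes where the previous match ended, so the pairs never overlap.
        System.out.println(findAll("..", "12345")); // prints "[12, 34]"
        // The \G candidate; run it to see what Java's Matcher makes of it.
        System.out.println(findAll("(.\\G.|^..)", "12345"));
    }
}
```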
split(): as for INSERT_REGEX_HERE, a good candidate would have been (?<=..)(?=..) (the split point is the zero-width position with two characters to its right and two to its left), but again, because split conceives naught of overlap, you end up with [12, 3, 45] (which is close, but no cigar).
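A minimal sketch of that near miss, assuming nothing beyond String.split (the class name SplitNoOverlap and the helper parts are invented for this example):

```java
import java.util.Arrays;

class SplitNoOverlap {
    // Hypothetical helper: splits s at every position that has two characters on each side.
    static String[] parts(String s) {
        return s.split("(?<=..)(?=..)");
    }

    public static void main(String[] args) {
        // For "12345" the only such positions are after "12" and after "123".
        System.out.println(Arrays.toString(parts("12345"))); // prints "[12, 3, 45]"
    }
}
```

Each character can land in only one piece, which is why the middle piece shrinks to "3" instead of producing "23" and "34".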
EDIT2
For fun, you can trick split() into doing what you want by first doubling non-boundary characters (here you need a reserved character value to split around):
Pattern.compile("((?<=.).(?=.))").matcher("12345").replaceAll("$1#$1").split("#")
We can be smart and eliminate the need for a reserved character by taking advantage of the fact that zero-width look-ahead assertions (unlike look-behind) can have an unbounded length; we can therefore split around all points which are an even number of characters away from the end of the doubled string (and at least two characters away from its beginning), producing the same result as above:
Pattern.compile("((?<=.).(?=.))").matcher("12345").replaceAll("$1$1").split("(?<=..)(?=(..)*$)")
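If you wanted to reuse the trick, the two steps could be wrapped up along these lines. This is a sketch only: DoubledSplit and overlappingPairs are hypothetical names, String.replaceAll is used as shorthand for the Pattern/Matcher chain above, and the input is assumed to contain no line terminators (which . would not match).

```java
import java.util.Arrays;

class DoubledSplit {
    // Hypothetical helper wrapping the double-then-split trick.
    static String[] overlappingPairs(String s) {
        // Step 1: double every character that has a neighbour on both sides,
        // e.g. "12345" becomes "12233445".
        String doubled = s.replaceAll("((?<=.).(?=.))", "$1$1");
        // Step 2: split at points that are an even number of characters from the end
        // and at least two characters from the beginning.
        return doubled.split("(?<=..)(?=(..)*$)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(overlappingPairs("12345"))); // prints "[12, 23, 34, 45]"
    }
}
```

The zero-width match at the very end of the doubled string only produces a trailing empty string, which split discards by default.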
Alternatively, you can trick match() in a similar way (again without the need for a reserved character value):
Matcher m = Pattern.compile("..").matcher(
    Pattern.compile("((?<=.).(?=.))").matcher("12345").replaceAll("$1$1")
);
while (m.find()) {
    System.out.println(m.group());
} // prints "12", "23", "34", "45"
Cheers,
V.