I am trying to use a simple split to break up the following string: 00-00000

My expression is: ^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])

And my usage is:

String s = "00-00000";

String pattern = "^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])";

String[] parts = s.split(pattern);

If I play around with the Pattern and Matcher classes, I can see that my pattern does match, and the matcher tells me my groupCount is 7, which is correct. But when I try to split the string, I have no luck.

+8  A: 

String.split does not return capturing groups as its result. It finds whatever matches and uses that as the delimiter, so the resulting String[] contains the substrings between the regex matches. As it stands, your regex matches the whole string, and with the whole string as the delimiter there is nothing left over, so it returns an empty array.

If you want to use regex capturing groups, you will have to use Matcher.group(); String.split() will not do.
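As a rough illustration of the difference, here is a minimal sketch that runs the pattern from the question through both approaches. The class name SplitVsGroups and the helper methods viaSplit and viaGroups are made up for this example; only String.split, Pattern, and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVsGroups {
    // The pattern from the question: seven capturing groups in total.
    static final String PATTERN =
            "^([0-9][0-9])(-)([0-9])([0-9])([0-9])([0-9])([0-9])";

    // split() treats every match as a delimiter and returns what lies between matches.
    static String[] viaSplit(String input) {
        return input.split(PATTERN);
    }

    // Matcher exposes the capturing groups. group(0) is the whole match,
    // so the capturing groups are numbered 1 through groupCount().
    static String[] viaGroups(String input) {
        Matcher m = Pattern.compile(PATTERN).matcher(input);
        if (!m.matches()) {
            return new String[0]; // hypothetical choice: no match yields an empty array
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(viaSplit("00-00000").length);             // prints "0"
        System.out.println(String.join(",", viaGroups("12-34567"))); // prints "12,-,3,4,5,6,7"
    }
}
```

The second call uses 12-34567 instead of 00-00000 only so that the individual groups are easy to tell apart in the output.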

krock
+3  A: 

For your example, you could simply do this:

String s = "00-00000";

String pattern = "-";

String[] parts = s.split(pattern);
oezi
A: 

I cannot be sure, but I think what you are trying to do is get each matched group into an array.

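    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    Matcher matcher = Pattern.compile(pattern).matcher(s);

    if (matcher.matches()) {
        // group(0) is the entire match; capturing groups are numbered 1..groupCount()
        String[] groups = new String[matcher.groupCount()];
        for (int i = 0; i < groups.length; i++) {
            groups[i] = matcher.group(i + 1);
        }
    }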
Nishan
This was pretty close to what I needed. Thanks!
+1  A: 

From the documentation:

String[] split(String regex) -- Returns: the array of strings computed by splitting this string around matches of the given regular expression

Essentially the regular expression is used to define delimiters in the input string. You can use capturing groups and backreferences in your pattern (e.g. for lookarounds), but ultimately what matters is what and where the pattern matches, because that defines what goes into the returned array.
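To make the point about delimiters concrete, here is a small sketch in which the pattern contains a capturing group and a backreference, yet the captured text never shows up in the result. The class name DelimiterDemo, the method splitOn, and the sample inputs are invented for this illustration.

```java
import java.util.Arrays;

class DelimiterDemo {
    // Thin wrapper around String.split so the calls below read uniformly.
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // The digit is captured, but it is still only a delimiter.
        System.out.println(Arrays.toString(splitOn("a1b2c3", "([0-9])")));      // prints "[a, b, c]"
        // A backreference makes the delimiter "the same uppercase letter twice".
        System.out.println(Arrays.toString(splitOn("aXXbYYc", "([A-Z])\\1"))); // prints "[a, b, c]"
    }
}
```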

If you want to split your original string into 7 parts using a regular expression, then you can do something like this:

    String s = "12-3456";
    String[] parts = s.split("(?!^)");

    System.out.println(parts.length); // prints "7"

    for (String part : parts) {
        System.out.print("[" + part + "] ");
    } // prints "[1] [2] [-] [3] [4] [5] [6] "

This splits on the zero-length assertion (?!^), which matches everywhere except before the first character of the string. This prevents the empty string from being the first element of the array, and the trailing empty string is already discarded because we use the default limit parameter of split.
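The effect of the limit parameter can be seen by calling the two-argument split(regex, limit) directly. The sketch below is only an illustration; the class name SplitLimitDemo and the helper splitChars are made up, while a limit of 0 is the value the one-argument split uses.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits at every position except the very start of the input.
    static String[] splitChars(String input, int limit) {
        return input.split("(?!^)", limit);
    }

    public static void main(String[] args) {
        // limit 0 (the default): trailing empty strings are dropped.
        System.out.println(Arrays.toString(splitChars("12-3456", 0)));  // prints "[1, 2, -, 3, 4, 5, 6]"
        // A negative limit keeps the empty string produced by the match at the end of the input.
        System.out.println(Arrays.toString(splitChars("12-3456", -1))); // prints "[1, 2, -, 3, 4, 5, 6, ]"
    }
}
```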

Using a regular expression to get the individual characters of a string like this is overkill, though. If you have only a few characters, the most concise option is a foreach loop over toCharArray():

    for (char ch : "12-3456".toCharArray()) {
        System.out.print("[" + ch + "] ");
    }

This is not the most efficient option if you have a longer string.
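One possible alternative for longer strings, assuming the cost in question is the array copy that toCharArray() makes, is to read the characters in place with charAt(). The class name CharLoopDemo and the method bracketEach are hypothetical names for this sketch.

```java
class CharLoopDemo {
    // Wraps each character in brackets, reading the string in place with charAt().
    static String bracketEach(String input) {
        StringBuilder out = new StringBuilder(input.length() * 4);
        for (int i = 0; i < input.length(); i++) {
            out.append('[').append(input.charAt(i)).append("] ");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(bracketEach("12-3456")); // prints "[1] [2] [-] [3] [4] [5] [6] "
    }
}
```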


Splitting on -

This may also be what you're looking for:

    String s = "12-3456";
    String[] parts = s.split("-");

    System.out.println(parts.length); // prints "2"

    for (String part : parts) {
        System.out.print("[" + part + "] ");
    } // prints "[12] [3456] "
polygenelubricants