Java's String.split(regex) method splits at every match of the regex. Python's partition function splits only at the first occurrence of the given separator, and returns a tuple of (left, separator, right).

How do I achieve what partition does in Java?

e.g.

"foo bar hello world".partition(" ")

should become

"foo", " ", "bar hello world"
  • Is there an external library which provides this utility already?

  • How would I achieve it without an external library?

  • And can it be achieved without an external library and without Regex?

NB. I'm not looking for split(" ",2) as it doesn't return the separator character.

A: 

Use:

"foo bar hello world".split(" ",2)

By default the delimiter is whitespace

Favonius
Read more carefully - this is not an answer to _this_ question.
Péter Török
this doesn't return the separator which is a requirement of the question.
sprocketonline
+5  A: 

While not exactly what you want, there's a second version of split which takes a "limit" parameter, telling it the maximum number of partitions to split the string into.

So if you called (in Java):

"foo bar hello world".split(" ", 2);

You'd get the array:

["foo", "bar hello world"]

which is more or less what you want, except for the fact that the separator character isn't embedded at index 1. If you really need this last point, you'd need to do it yourself, but hopefully all you specifically wanted was the ability to limit the number of splits.

Andrzej Doyle
retaining the separator character is a requirement.
sprocketonline
@sprocketonline: Ah, I see, I thought the array index was simply the separator *regex* (and thus trivial to splice in) but now I understand it's the sequence of characters that *matched* the regex (very different). In which case, go with polygenelubricants' answer as the simplest way to do this is with explicit use of Matchers.
Andrzej Doyle
A: 

Is there an external library which provides this utility already?

None that I know of.

how would I achieve it without an external library? And can it be achieved without an external library and without Regex?

Sure, that's no problem at all; just use String.indexOf() and String.substring(). However, Java does not have a tuple datatype, so you'll have to return an array or a List, or write your own result class.
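A minimal sketch of the indexOf()/substring() approach, assuming a literal (non-regex) separator. The class name IndexOfPartition is hypothetical, and returning {s, "", ""} when the separator is missing is an assumption made here to mirror what Python's str.partition does; throwing an exception would be just as reasonable.

```java
final class IndexOfPartition {
    // Splits s around the first occurrence of the literal separator.
    // Returns {left, separator, right}. If the separator does not occur,
    // returns {s, "", ""} (assumption: mirrors Python's str.partition).
    static String[] partition(String s, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("empty separator");
        }
        int i = s.indexOf(separator);
        if (i < 0) {
            return new String[] { s, "", "" };
        }
        return new String[] {
            s.substring(0, i),                      // text before the separator
            separator,                              // the separator itself
            s.substring(i + separator.length())     // text after the separator
        };
    }
}
```

With this sketch, IndexOfPartition.partition("foo bar hello world", " ") gives the three strings "foo", " " and "bar hello world".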

Michael Borgwardt
+2  A: 

How about this:

String[] partition(String string, String separator) {
    String[] parts = string.split(separator, 2);
    return new String[] {parts[0], separator, parts[1]};
}

BTW, you have to add some input/result checks to this :)
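One possible shape for those checks, as a sketch rather than a definitive version. It assumes the separator is meant literally, so it is wrapped in Pattern.quote() (otherwise a separator such as "." would be interpreted as a regex by split). The class name SplitPartition and the {string, "", ""} result for a missing separator are illustrative assumptions.

```java
import java.util.regex.Pattern;

final class SplitPartition {
    // Same idea as above, with basic input/result checks added.
    static String[] partition(String string, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("empty separator");
        }
        // Pattern.quote makes split() treat the separator as literal text.
        String[] parts = string.split(Pattern.quote(separator), 2);
        if (parts.length < 2) {
            // Separator not found: nothing to split on (assumed convention).
            return new String[] { string, "", "" };
        }
        return new String[] { parts[0], separator, parts[1] };
    }
}
```

Because the limit is positive, split() keeps a trailing empty string, so partitioning "a." on "." still yields three elements: "a", "." and "".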

splix
+2  A: 

String.split(String regex, int limit) is close to what you want. From the documentation:

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.

  • If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.
  • If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.
    • If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Here's an example to show these differences (as seen on ideone.com):

static void dump(String[] ss) {
    for (String s: ss) {
        System.out.print("[" + s + "]");
    }
    System.out.println();
}
public static void main(String[] args) {
    String text = "a-b-c-d---";

    dump(text.split("-"));
    // prints "[a][b][c][d]"

    dump(text.split("-", 2));
    // prints "[a][b-c-d---]"

    dump(text.split("-", -1));
    // prints "[a][b][c][d][][][]"
}

A partition that keeps the delimiter

If you need functionality similar to partition, and you also want the delimiter string that was actually matched by an arbitrary pattern, you can use a Matcher and then take substrings at the appropriate indices.

Here's an example (as seen on ideone.com):

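import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

static String[] partition(String s, String regex) {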
    Matcher m = Pattern.compile(regex).matcher(s);
    if (m.find()) {
        return new String[] {
            s.substring(0, m.start()),
            m.group(),
            s.substring(m.end()),
        };
    } else {
        throw new NoSuchElementException("Can't partition!");
    }
}
public static void main(String[] args) {
    dump(partition("james007bond111", "\\d+"));
    // prints "[james][007][bond111]"
}

The regex \d+ of course matches any digit character (\d) repeated one or more times (+).

polygenelubricants
+1 for being the only correct answer to the question so far! Additional imaginary +1 for the introduction to ideone, which looks very useful.
Andrzej Doyle
@Andrzej: yep, with ideone you can quickly (i) see that it works (ii) edit it and test it yourself
polygenelubricants