How can I split the string "Thequickbrownfoxjumps" into substrings of equal size in Java? E.g. splitting "Thequickbrownfoxjumps" into chunks of size 4 should give the output:

["Theq","uick","brow","nfox","jump","s"]

Similar Question:

Split string into equal-length substrings in Scala

+3  A: 

You can use substring from the standard String class (handling the exceptions yourself) or StringUtils.substring from Apache Commons Lang (it handles the exceptions for you):

static String substring(String str, int start, int end)

Put it inside a loop and you are good to go.

pakore
What's wrong with the `substring` method in the standard `String` class?
Grodriguez
The commons version avoids exceptions (out of bounds and such)
Thilo
I see; I would say I prefer to 'avoid exceptions' by controlling the parameters in the calling code instead.
Grodriguez
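The trade-off discussed in these comments can be sketched in plain Java. The helper `lenientSubstring` below is a hypothetical name; it only clamps indices into range and is not the Commons Lang implementation (which also handles null input and negative indices). Controlling the parameters in the calling code amounts to the same kind of `Math.min` clamp written inside the loop.

```java
import java.util.ArrayList;
import java.util.List;

final class SubstringLoopSketch {

    // Hypothetical helper: clamps both indices into [0, length] so that
    // String.substring never throws StringIndexOutOfBoundsException.
    static String lenientSubstring(String str, int start, int end) {
        int from = Math.max(0, Math.min(start, str.length()));
        int to = Math.max(from, Math.min(end, str.length()));
        return str.substring(from, to);
    }

    // Walks the string in steps of `size`; the last chunk may be shorter.
    static List<String> chunks(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> result = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            result.add(lenientSubstring(text, start, start + size));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(chunks("Thequickbrownfoxjumps", 4));
    }
}
```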
+15  A: 

Well, it's fairly easy to do this by brute force:

public static List<String> splitEqually(String text, int size) {
    // Give the list the right capacity to start with. You could use an array
    // instead if you wanted.
    List<String> ret = new ArrayList<String>((text.length() + size - 1) / size);

    for (int start = 0; start < text.length(); start += size) {
        ret.add(text.substring(start, Math.min(text.length(), start + size)));
    }
    return ret;
}

I don't think it's really worth using a regex for this.

EDIT: My reasoning for not using a regex:

  • This doesn't use any of the real pattern matching of regexes. It's just counting.
  • I suspect the above will be more efficient, although in most cases it won't matter
  • If you need to use variable sizes in different places, you've either got repetition or a helper function to build the regex itself based on a parameter - ick.
  • The regex provided in another answer firstly didn't compile (invalid escaping), and then didn't work. My code worked first time. That's more a testament to the usability of regexes vs plain code, IMO.
Jon Skeet
@Jon Skeet: Thanks for clearing that up, but I didn't get your point: "I don't think it's really worth using a regex for this".
org.life.java
Why isn't it worth using a regex? I'm not disagreeing with you; I'm just wondering if it's more about cost or readability etc.
Gage
@org.life.java: Well what's the benefit of using a regex here? You're not really matching patterns as such... you're just getting the substrings blindly. It doesn't seem a good fit for regular expressions to me.
Jon Skeet
@Jon: I asked for a regex since I can split the string in just one statement.
Emil
Agreed, but if it's just for smaller strings and it's not done often, then I would say we should go for this.
org.life.java
@Emil: Actually, you *didn't* ask for a regex. It's in the tags, but nothing in the question itself asks for a regex. You put this method in one place, and then you can split the string in just one *very readable* statement anywhere in your code.
Jon Skeet
@Jon: I'm accepting your answer. If possible, please append a regex version too along with the answer.
Emil
Emil, this is not what a regex is for. Period.
Chris
@Emil: If you want a one-liner for splitting the string, I'd recommend Guava's `Splitter.fixedLength(4)` as suggested by seanizer.
ColinD
Also, please show an answer that uses XML, and an answer that uses the SimpleDateFormat class. :-) Seriously, some tools just aren't useful for a problem. You might be able to put screws in with a hammer, but wouldn't it make more sense to use a screwdriver? That's what it's for.
Jay
@Jay: Come on, you need not be that sarcastic. I'm sure it can be done using a regex in just one line. A fixed-length substring is also a pattern. What do you say about this answer? http://stackoverflow.com/questions/3760152/split-string-of-equal-lengths-in-java/3761521#3761521
Emil
@Emil: I didn't intend that to be rude, just whimsical. The serious part of my point was that while yes, I'm sure you could come up with a Regex to do this -- I see Alan Moore has one that he claims works -- it is cryptic and therefore difficult for a later programmer to understand and maintain. A substring solution can be intuitive and readable. See Jon Skeet's 4th bullet: I agree with that 100%.
Jay
This has me wondering: would it be more efficient to convert to a char array and mod on size? I mean substring is doing a fair amount of counting. Just a thought.
javamonkey79
@javamonkey79: What kind of counting are you thinking about?
Jon Skeet
@Jon: I assumed too much about the implementation of the String class. It doesn't do any counting in order to perform substring as I expected but rather "knows" the indices of the underlying array.
javamonkey79
+2  A: 
public String[] splitInParts(String s, int partLength)
{
    int len = s.length();

    // Number of parts
    int nparts = (len + partLength - 1) / partLength;
    String parts[] = new String[nparts];

    // Break into parts
    int offset= 0;
    int i = 0;
    while (i < nparts)
    {
        parts[i] = s.substring(offset, Math.min(offset + partLength, len));
        offset += partLength;
        i++;
    }

    return parts;
}
Grodriguez
Out of interest, do you have something against `for` loops?
Jon Skeet
A `for` loop is indeed a more 'natural' choice for this :-) Thanks for pointing this out.
Grodriguez
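For comparison, the `for` loop variant mentioned in these comments might look like the sketch below. It is a rewrite for illustration, not the answerer's own edit; the class name is hypothetical, and `partLength` is assumed to be positive.

```java
import java.util.Arrays;

final class SplitInPartsSketch {

    static String[] splitInParts(String s, int partLength) {
        int len = s.length();
        // Number of parts, rounding up so a shorter final part is counted.
        int nparts = (len + partLength - 1) / partLength;
        String[] parts = new String[nparts];

        // The loop header keeps the part index and the offset in step.
        for (int i = 0, offset = 0; i < nparts; i++, offset += partLength) {
            parts[i] = s.substring(offset, Math.min(offset + partLength, len));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(Arrays.toString(splitInParts("Thequickbrownfoxjumps", 4)));
    }
}
```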
+1  A: 
public static String[] split(String src, int len) {
    String[] result = new String[(int)Math.ceil((double)src.length()/(double)len)];
    for (int i=0; i<result.length; i++)
        result[i] = src.substring(i*len, Math.min(src.length(), (i+1)*len));
    return result;
}
Saul
Since `src.length()` and `len` are both `int`s, your call to `Math.ceil` isn't accomplishing what you want; check out how some of the other responses are doing it: `(src.length() + len - 1) / len`
Michael Brewer-Davis
@Michael: Good point. I didn't test it with strings of non-multiple lengths. It's fixed now.
Saul
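The rounding issue raised in the comments above can be shown with a few lines of arithmetic. This is a sketch with hypothetical names; it assumes the pre-fix code divided two `int`s before calling `Math.ceil`, which the thread implies but does not show.

```java
final class CeilingDivisionSketch {

    // Integer-only ceiling division, valid for a >= 0 and b > 0
    // (and as long as a + b - 1 does not overflow int).
    static int ceilDiv(int a, int b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        int length = 21; // "Thequickbrownfoxjumps".length()
        int size = 4;

        // int / int truncates to 5 before Math.ceil ever sees the value.
        int truncated = (int) Math.ceil(length / size);
        // Casting one operand to double first gives the real quotient 5.25.
        int viaDouble = (int) Math.ceil((double) length / size);

        System.out.println(truncated);             // 5
        System.out.println(viaDouble);             // 6
        System.out.println(ceilDiv(length, size)); // 6
    }
}
```

The integer form avoids the round trip through `double` and sizes the array correctly when the length is not a multiple of the chunk size.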
+10  A: 

This is very easy with Google Guava:

for(final String token :
    Splitter
        .fixedLength(4)
        .split("Thequickbrownfoxjumps")){
    System.out.println(token);
}

Output:

Theq
uick
brow
nfox
jump
s

Or if you need the result as an array, you can use this code:

String[] tokens =
    Iterables.toArray(
        Splitter
            .fixedLength(4)
            .split("Thequickbrownfoxjumps"),
        String.class
    );

I toyed with the idea of creating a regex version as well, but I couldn't come up with a one-liner for String.split(), and I deleted my previous attempts because Alan Moore has since solved the problem. I still agree with the others that regex is not the right tool for this (even though it works).

In short:

Use Guava's one-liner (most elegant imho) or Alan's one-liner (the answer you were looking for) or Jon Skeet's multiliner (probably most efficient).

seanizer
@seanizer: Thanks for the post (for making me aware of the Guava library method). But I'll have to accept the regex answer http://stackoverflow.com/questions/3760152/split-string-of-equal-lengths-in-java/3761521#3761521 since it doesn't require any 3rd-party library and is a one-liner.
Emil
Man this is really useful, thanks!
javamonkey79
+3  A: 

If you're using Google's Guava general-purpose libraries (and quite honestly, any new Java project probably should be), this is insanely trivial with the Splitter class:

for (String substring : Splitter.fixedLength(4).split(inputString)) {
    doSomethingWith(substring);
}

and that's it. Easy as!

Cowan
+12  A: 

Here's the regex one-liner version:

System.out.println(Arrays.toString(
    "Thequickbrownfoxjumps".split("(?<=\\G.{4})")
));

\G is a zero-width assertion that matches the position where the previous match ended. If there was no previous match, it matches the beginning of the input, the same as \A. The enclosing lookbehind matches the position that's four characters along from the end of the last match.
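If the chunk size needs to vary, the same pattern can be built from a parameter. The sketch below wraps it in a helper with a hypothetical name, `splitBySize`; note that `.` does not match line terminators by default, so input containing newlines would need different handling.

```java
import java.util.Arrays;

final class RegexChunkSketch {

    // Hypothetical helper: builds the split pattern for a given chunk size.
    static String[] splitBySize(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // (?<=\G.{size}) matches the position `size` characters past the
        // end of the previous match (or past the start of the input).
        return text.split("(?<=\\G.{" + size + "})");
    }

    public static void main(String[] args) {
        // Prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(Arrays.toString(splitBySize("Thequickbrownfoxjumps", 4)));
    }
}
```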

Both lookbehind and \G are advanced regex features, not supported by all flavors. Furthermore, \G is not implemented consistently across the flavors that do support it. This trick will work (for example) in Java, Perl, .NET and JGSoft, but not in PHP (PCRE), Ruby 1.9+ or TextMate (both Oniguruma).

EDIT: I should mention that I don't necessarily recommend this solution if you have other options. The non-regex solutions in the other answers may be longer, but they're also self-documenting; this one's just about the opposite of that. ;)

Alan Moore
Yup, that's what I was looking for. (+1)
seanizer
@Alan: Is this code tested? I mean, I'll accept it if it's tested. I don't have any way to check it until I reach the office.
Emil
@Emil yes, it works. But Alan can probably wait until you get to the office :-)
seanizer
@seanizer: It's OK, I'll have to accept the right answer anyway. Maybe he can wait, but I don't like postponing.
Emil
Here's a demo on ideone.com: http://ideone.com/oInXz
Alan Moore