Hello,

I have a String that can be very long, so I want to split it into an array of Strings in which each element is shorter than a specified length. The split may only happen at white space or after a punctuation symbol, because each fragment needs to make sense when it is read. It works like a word processor: when a word doesn't fit on the current line, it goes to the next line.

I had thought of splitting the given String at white space or punctuation symbols with a regular expression and then joining the pieces back together, using the length of a StringBuilder to control the size of each fragment, but I think splitting into all the words and then joining them may not be very efficient.
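For reference, the split-then-join idea described above could be sketched roughly as follows. The class name SplitAndJoinSplitter, the punctuation set in the regular expression, and the choice to keep an oversized single token whole are all assumptions made for illustration, not requirements stated in the question.

```java
import java.util.ArrayList;
import java.util.List;

class SplitAndJoinSplitter {

    // Zero-width split after every whitespace or punctuation character, so each
    // token keeps its trailing delimiter. The punctuation set is an assumption.
    private static final String AFTER_DELIMITER = "(?<=[\\s.,;:!?])";

    static List<String> split(String text, int maxLength) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String token : text.split(AFTER_DELIMITER)) {
            // Flush the current fragment if appending this token would exceed the limit.
            if (current.length() > 0 && current.length() + token.length() > maxLength) {
                parts.add(current.toString());
                current.setLength(0);
            }
            // Assumption: a single token longer than maxLength is kept whole.
            current.append(token);
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }
}
```

Calling SplitAndJoinSplitter.split("Hello, world! This is a test.", 10) yields the fragments "Hello, ", "world! ", "This is a " and "test.". Each fragment keeps its trailing delimiter, so concatenating the fragments gives back the original text.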

What would be the most efficient way to do this? Is there any library that could help with this job?

Thank you very much.

+4  A: 

Here is something you can try:

Let's say you have to split into strings that are at most n characters long.

  • Start at position 0. Move the cursor n characters forward.
  • Move the cursor backward one character at a time until you find a character on which you can split.
  • Split the string, store the first part, and repeat on the second part.

I think this might be more efficient than splitting then joining your strings.
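A minimal Java sketch of the steps above, under a few assumptions: the class name BackwardScanSplitter and the helper canSplitAfter are hypothetical, the set of punctuation characters is a guess at what the question needs, and a hard cut at n characters is used as a fallback when no split character is found in the window.

```java
import java.util.ArrayList;
import java.util.List;

class BackwardScanSplitter {

    // Hypothetical rule: a split is allowed right after whitespace
    // or one of these punctuation characters.
    static boolean canSplitAfter(char c) {
        return Character.isWhitespace(c) || ".,;:!?".indexOf(c) >= 0;
    }

    static List<String> split(String text, int maxLength) {
        if (maxLength < 1) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        // Keep cutting while the remainder is longer than maxLength.
        while (text.length() - start > maxLength) {
            // Move the cursor maxLength characters forward (exclusive end index).
            int end = start + maxLength;
            // Step back until the character just before the cursor is a split character.
            while (end > start && !canSplitAfter(text.charAt(end - 1))) {
                end--;
            }
            // Assumption: with no split character in the window, cut at maxLength.
            if (end == start) {
                end = start + maxLength;
            }
            parts.add(text.substring(start, end));
            start = end;
        }
        // Whatever is left already fits in one fragment.
        if (start < text.length()) {
            parts.add(text.substring(start));
        }
        return parts;
    }
}
```

For example, BackwardScanSplitter.split("Hello, world! This is a test.", 10) returns "Hello, ", "world! ", "This is a " and "test.". Fragments keep their trailing white space, so joining them reproduces the input; trim each fragment if that is not wanted.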

Vivien Barousse
A: 

I'd go with the simplest approach I can think of and worry about efficiency or performance later, if it really turns out to be an issue.

Mikko Wilkman
+1  A: 

Apache Commons Lang might help with this.

http://commons.apache.org/lang/api-2.5/index.html

See the static method WordUtils.wrap(String str, int wrapLength), which wraps a single line of text, identifying words by ' '.

It's also open source, so if you need something more specific, just look at the source.

JH