I have a big paragraph that I need to split into lines such that no line has more than 100 characters and no word is broken. How would I go about doing this? I guess regular expressions are the best way, but I'm not sure how.
A:
While you should use a library function if you have one, as KennyTM suggested, a simple regex to solve this can be:
.{1,100}\b
This takes 100 characters or fewer and does not break words. It can split at other characters, though: for example, the period at the end of a sentence may be separated from the last word (last word<\n>. new line).
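To see the effect, here is a minimal Python sketch of the same pattern. The function name `split_at_word_boundary` and the `width` parameter are my own, added only so that a short sample can be used; Python's `re` module is assumed, and other regex flavors may treat `\b` slightly differently.

```python
import re

def split_at_word_boundary(text, width=100):
    # Hypothetical helper: greedily take up to `width` characters, then
    # backtrack to the nearest \b (a word/non-word transition).
    pattern = re.compile(r".{1,%d}\b" % width)
    return pattern.findall(text)

chunks = split_at_word_boundary("last word. new line", width=9)
print(chunks)  # ['last word', '. new ', 'line']
```

With a width of 9, the period ends up at the start of the second chunk, which is exactly the split described above.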
If that's an issue, you can also try:
.{1,99}(\s|.$)
That ensures the last character in every match is whitespace, or that the match runs to the end of the text.
All of these assume you count spaces as characters, that your text probably has no newlines (a single paragraph), and that it has no words longer than 100 characters.
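As a rough illustration, the second pattern can be tried in Python next to the standard library's `textwrap.wrap`. I'm assuming that is the kind of library function meant above. The helper name `split_on_whitespace` and the `width` parameter are hypothetical. The group is written as non-capturing, because `findall` would otherwise return only the group's contents rather than the whole match.

```python
import re
import textwrap

def split_on_whitespace(text, width=100):
    # Hypothetical helper: same idea as .{1,99}(\s|.$), with (?:...) so
    # that findall() returns whole matches instead of the group.
    pattern = re.compile(r".{1,%d}(?:\s|.$)" % (width - 1))
    # Each match ends in the whitespace it split on; strip it off.
    return [chunk.rstrip() for chunk in pattern.findall(text)]

sample = "The quick brown fox jumps over the lazy dog."
print(split_on_whitespace(sample, width=16))
# ['The quick brown', 'fox jumps over', 'the lazy dog.']
print(textwrap.wrap(sample, width=16))
# ['The quick brown', 'fox jumps over', 'the lazy dog.']
```

On this sample both approaches agree. The library function also copes with newlines and overlong words, which the regex does not.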
Kobi
2010-02-21 13:35:53
You don't want to use \b there. It will wrap on things such as the apostrophe in don't.
brian d foy
2010-02-21 23:34:20
@Brian - correct. I've mentioned that and have an alternative.
Kobi
2010-02-22 05:28:45