I'm trying to create a loose word-wrapping system using a regex in Perl. Roughly every 70 characters, I'd like to find the next whitespace occurrence and replace it with a newline, and repeat this for the whole string. The string I'm operating on may already contain newlines, but the amount of text between them tends to be very long.

I'd like to avoid looping one character at a time or using substr if I can, and I would prefer to edit this string in place as opposed to creating new string objects. These are just preferences, though, and if I can't achieve what I'm looking for without breaking these preferences then that's fine.

Thoughts?

+4  A: 
s/(.{70}\S*)\s+/$1\n/g

Consume the first 70 characters, then stop at the next whitespace, capturing everything in the process. Then, emit the captured string, omitting the whitespace at the end, adding a newline.

This doesn't guarantee that your lines will stay under a strict limit such as 80 characters: the last word it consumes on a line could be arbitrarily long.
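As a rough illustration, here is a minimal sketch of the substitution applied in place to a whole string. It assumes the /g modifier is used so the replacement repeats across the string; the sample text and variable names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample: one long line of short filler words.
my $original = join ' ', ('lorem ipsum dolor') x 12;
my $text     = $original;

# Consume 70 characters, finish the current word, then turn the
# whitespace that follows into a newline. /g repeats this across the string.
$text =~ s/(.{70}\S*)\s+/$1\n/g;

print "$text\n";
```

Every line except possibly the last comes out at least 70 characters long, and a line only ends where a word ends.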

Welbog
I think that would be better as .{70,80}\s+, so that if you get " as in a " starting with the space at 71, you get a tighter wrapping.
Axeman
+13  A: 

Look at modules like Text::Wrap or Text::Autoformat.

Depending on your needs, even the GNU core utility fold(1) may be an option.
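For instance, a minimal sketch with Text::Wrap, which ships with Perl. The sample text is hypothetical, and the column value is chosen on the assumption that you want lines of at most 70 characters (wrap() keeps each line shorter than $Text::Wrap::columns).

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# wrap() produces lines of at most $columns - 1 characters.
$Text::Wrap::columns  = 71;
# Leave spaces as spaces instead of converting runs of them to tabs.
$Text::Wrap::unexpand = 0;

# Hypothetical sample: one long line of short filler words.
my $text = join ' ', ('lorem ipsum dolor') x 12;

# The first two arguments are the indent for the first line and for
# the following lines; both are empty here.
my $wrapped = wrap('', '', $text);

print "$wrapped\n";
```

Unlike the regex approach, this returns a new string rather than editing the original in place.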

fgm
That's probably the best way--except for some of the archaic syntax.
Axeman
+5  A: 

Welbog's answer wraps at the first space after 70 characters. This has the flaw that long words beginning close to the end of the line make an overlong line. I would suggest instead wrapping at the last space within the first, say, 81 characters, or wrapping at the first space if you have a >80 character "word", so that only truly unbreakable lines are overlong:

s/(.{1,79}\S|\S+)\s+/$1\n/mg;
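A minimal sketch of that substitution on a hypothetical sample string. The sample ends with a newline on purpose: the pattern requires whitespace after each chunk it matches, so without trailing whitespace the final line would be broken at its last space.

```perl
use strict;
use warnings;

# Hypothetical sample: one long line of short filler words,
# terminated by a newline.
my $original = join(' ', ('lorem ipsum dolor') x 12) . "\n";
my $text     = $original;

# Take up to 80 characters ending in a non-space, or one whole word if
# it cannot fit, and replace the whitespace that follows with a newline.
$text =~ s/(.{1,79}\S|\S+)\s+/$1\n/mg;

print $text;
```

With no word longer than 80 characters, every resulting line stays within 80 characters. Note that a run of several newlines in the input collapses to a single one, since \s+ matches them all.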
ysth
D'oh! And I've even done this type of thing numerous times.
Axeman
+1  A: 

You can get much, much more control and reliability by using Text::Format:

use Text::Format;
print Text::Format->new({columns => 70})->format($text);
cubabit