Hi, I would like to wrap the following comma-separated data:

-X, run, abs, absolute, accept, accept, alarm, schedule, atan2, arctangent, bind, binds, binmode, prepare, bless, create, caller, get, chdir, change, chmod, changes, chomp, remove, chop, remove, chown, change, chr, get, chroot, make, close, close, closedir, close, connect, connect, continue, optional, cos, cosine, crypt, one-way, dbmclose, breaks, dbmopen, create, defined, test, delete, deletes, die, raise, do, turn, dump, create, each, retrieve, endgrent, be, endhostent, be, endnetent, be, endprotoent, be, endpwent, be, endservent, be, eof, test, eval, catch, exec, abandon, exists, test, exit, terminate, exp, raise, fcntl, file, fileno, return, flock, lock, fork, create, format, declare, formline, internal, getc, get, getgrent, get, getgrgid, get, getgrnam, get, gethostbyaddr, get, gethostbyname, get, gethostent, get, getlogin, return, getnetbyaddr, get, getnetbyname, get, getnetent, get, getpeername, find, getpgrp, get, getppid, get, getpriority, get, getprotobyname, get, getprotobynumber, get, getprotoent, get, getpwent, get, getpwnam, get, getpwuid, get, getservbyname, get, getservbyport, get, getservent, get, getsockname, retrieve, getsockopt, get, glob, expand, gmtime, convert, goto, create, grep, locate, hex, convert, import, patch, int, get, ioctl, system-dependent, join, join, keys, retrieve, kill, send, last, exit, lc, return, lcfirst, return, length, return, link, create, listen, register, local, create, localtime, convert, log, retrieve, lstat, stat, m//, match, map, apply, mkdir, create, msgctl, SysV, msgget, get, msgrcv, receive, msgsnd, send, my, declare, next, iterate, no, unimport, oct, convert, open, open, opendir, open, ord, find, pack, convert, package, declare, pipe, open, pop, remove, pos, find, print, output, printf, output, prototype, get, push, append, q/STRING/, singly, qq/STRING/, doubly, quotemeta, quote, qw/STRING/, quote, qx/STRING/, backquote, rand, retrieve, read, fixed-length, readdir, get, readlink, determine, recv, receive, redo, 
start, ref, find, rename, change, require, load, reset, clear, return, get, reverse, flip, rewinddir, reset, rindex, right-to-left, rmdir, remove, s///, replace, scalar, force, seek, reposition, seekdir, reposition, select, reset, semctl, SysV, semget, get, semop, SysV, send, send, setgrent, prepare, sethostent, prepare, setnetent, prepare, setpgrp, set, setpriority, set, setprotoent, prepare, setpwent, prepare, setservent, prepare, setsockopt, set, shift, remove, shmctl, SysV, shmget, get, shmread, read, shmwrite, write, shutdown, close, sin, return, sleep, block, socket, create, socketpair, create, sort, sort, splice, add, split, split, sprintf, formatted, sqrt, square, srand, seed, stat, get, study, optimize, sub, declare, substr, get, symlink, create, syscall, execute, sysread, fixed-length, system, run, syswrite, fixed-length, tell, get, telldir, get, tie, bind, time, return, times, return, tr///, transliterate, truncate, shorten, uc, return, ucfirst, return, umask, set, undef, remove, unlink, remove, unpack, convert, unshift, prepend, untie, break, use, load, utime, set, values, return, vec, test, wait, wait, waitpid, wait, wantarray, get, warn, print, write, print, y///, transliterate,

such that a line break is added at the last comma before a 70-character line length. Preferably this could be done in some kind of bash one-liner.

+2  A: 
echo 'your, text, here' | fold -sw 70

That should give you the output you want. Instead of using echo you can pipe it from a file or wherever else you're getting it from, or you can just use the fold command directly and paste it in on stdin.

The "-w 70" tells fold to wrap lines at a maximum of 70 characters, and the -s tells it to break only after a blank (here, the space following each comma) instead of cutting in the middle of a word.
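For what it's worth, here is a small sketch of the same command run on a shortened sample of the data. The variable names data and wrapped and the file name data.txt are just placeholders for illustration, not anything from the question.

```shell
# Sample input: comma-and-space separated words on one long line.
data='abs, absolute, accept, accept, alarm, schedule, atan2, arctangent, bind, binds, binmode, prepare, bless, create'

# -w 70 caps each line at 70 columns; -s breaks after the last blank
# that still fits, so each wrapped line ends right after a ", " separator.
wrapped=$(printf '%s\n' "$data" | fold -sw 70)
printf '%s\n' "$wrapped"

# The same thing reading from a file (data.txt is a hypothetical name):
#   fold -sw 70 < data.txt
```

Note that each wrapped line keeps the space after its last comma, so every line but the last ends in a trailing blank; pipe the result through sed 's/ *$//' if that matters.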

cecilkorik
+1 Perfect, thanks. SO should allow immediate accepted answers, but there is a two minute delay I guess.
D W
Thanks much, you finally bumped my reputation high enough to post comments! :) And glad I could help!
cecilkorik
Note that that assumes that there are no spaces within elements of the comma-separated list; it's not actually looking for the commas, just looking for spaces.
intuited
This is a good point. fold is (surprisingly) not nearly flexible enough to handle anything other than spaces or hard-wrapping. It's actually a pretty poor tool; it just happens to work adequately in this case. I don't know if there's anything better out there.
cecilkorik
Fortunately my data had commas and spaces. I suppose you could do a sed command first to make sure all commas are followed by a space. I would be interested in alternatives that would work for just commas, if you are aware of some @intuited and @cecilkorik.
D W
+1  A: 

In response to your comment about cases where spaces might be embedded between commas:

I think you're on the right track with using sed. One option that you would have is to map all spaces to some unused character, then map commas to spaces, fold, then revert the original mappings. But I think this will leave you with weird things like lines that begin with spaces.

So it seems like you would want to just remap any spaces not preceded by a comma to some character or sequence that you know isn't present in the text, and then invert to switch them back.

For example,

echo "$blahblah" | sed 's/\([^,]\) /\1\t/g' | fold -s | sed 's/\t/ /g'

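would work if there are no tab characters in the text, and if there are not going to be consecutive embedded spaces. One caveat: fold counts a tab as a blank too (and advances the column to the next tab stop), so -s can still break at a substituted tab; a non-blank placeholder character avoids that.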

If there are, I think you'll need to use something a bit more complex like

echo "$blahblah" | \
    perl -pe 's/([^, ])( +)/$1 . "_" x length($2)/ge' | \
    fold -s | \
    tr _ ' '

The /e flag makes each replacement be evaluated as a Perl expression, and /g applies the substitution to every match on the line.

The tr at the end is basically equivalent to the closing sed in the last command.

This assumes that the character "_" does not occur in your source text. There are certainly better characters to choose, e.g. an unused control character like, say, ^V. If you use a modern perl to do the translation at the end you can, I think, use some obscure, multibyte Unicode character.

This suggestion is pretty off-the-cuff, and has some obvious problems. For example, it won't break after a comma unless there's a space there. This might not really be what you want. It may well be worth it to do some digging around on CPAN/PyPI/etc. for something more robust. Or you could write your own folding utility...
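If you do go the route of writing your own folding utility, a minimal awk sketch might look like the following. It is not a general replacement for fold: wrap_at_commas is a hypothetical name, it counts characters (or bytes, depending on the awk) rather than display columns, and an item longer than the width is left on a line of its own instead of being split.

```shell
# wrap_at_commas [WIDTH]
# Reads text on stdin and breaks lines only after commas, whether or not
# a space follows the comma. Spaces inside an item are never break points.
wrap_at_commas() {
    awk -v width="${1:-70}" '
    {
        n = split($0, item, ",")
        if (n == 0) { print ""; next }        # keep blank input lines
        line = ""
        for (i = 1; i <= n; i++) {
            piece = item[i]
            if (i < n) piece = piece ","      # re-attach the comma split() removed
            # Flush the current line if this piece would push it past width.
            if (line != "" && length(line) + length(piece) > width) {
                print line
                line = ""
            }
            # A piece that starts a line should not begin with spaces.
            if (line == "") sub(/^ +/, "", piece)
            line = line piece
        }
        sub(/ +$/, "", line)                  # drop trailing spaces
        if (line != "") print line
    }'
}

# Example: no spaces after the commas, spaces inside the items.
printf '%s\n' 'New York,Los Angeles,San Francisco,Salt Lake City' | wrap_at_commas 25
```

Because the space after a comma travels with the following item, each output line ends at the comma itself rather than with a trailing blank.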

intuited