Using Python I need to insert a newline character into a string every 64 characters. In Perl it's easy:
s/(.{64})/$1\n/
How could this be done using regular expressions in Python? Is there a more pythonic way to do it?
Using Python I need to insert a newline character into a string every 64 characters. In Perl it's easy:
s/(.{64})/$1\n/
How could this be done using regular expressions in Python? Is there a more pythonic way to do it?
without regexp:
def insert_newlines(string, every=64):
lines = []
for i in xrange(0, len(string), every):
lines.append(string[i:i+every])
return '\n'.join(lines)
shorter but less readable (imo):
def insert_newlines(string, every=64):
return '\n'.join(string[i:i+every] for i in xrange(0, len(string), every))
I'd go with:
import textwrap
s = "0123456789"*100
print '\n'.join(textwrap.wrap(s, 64))
taking @J.F. Sebastian's solution one step further, and this is nearly criminal :-)
import textwrap
s = "0123456789"*100
print textwrap.fill(s, 64)
look ma... no regexes! because as you know... http://regex.info/blog/2006-09-15/247
thanks for introducing us to textwrap
module... although it's been in Python since 2.3, i've never been aware of it until now (yes, i'll admit that publically)!!
Tiny, not nice:
"".join(s[i:i+64] + "\n" for i in xrange(0,len(s),64))
I suggest the following method:
"\n".join(re.findall("(?s).{,64}", s))[:-1]
This is, more-or-less, the non-RE method taking advantage of the RE engine for the loop.
On a very slow computer I have as a home server, this gives:
$ python -m timeit -s 's="0123456789"*100; import re' '"\n".join(re.findall("(?s).{,64}", s))[:-1]'
10000 loops, best of 3: 130 usec per loop
AndiDog's method:
$ python -m timeit -s "s='0123456789'*100; import re" 're.sub("(?s)(.{64})", r"\1\n", s)'
1000 loops, best of 3: 800 usec per loop
gurney alex's 2nd/Michael's method:
$ python -m timeit -s "s='0123456789'*100" '"\n".join(s[i:i+64] for i in xrange(0, len(s), 64))'
10000 loops, best of 3: 148 usec per loop
I don't consider the textwrap
method to be correct for the specification of the question, so I won't time it.
Changed answer because it was incorrect (shame on me!)
Just for the fun of it, the RE-free method using itertools
. It rates third in speed, and it's not Pythonic (too lispy):
"\n".join(
it.imap(
s.__getitem__,
it.imap(
slice,
xrange(0, len(s), 64),
xrange(64, len(s)+1, 64)
)
)
)
$ python -m timeit -s 's="0123456789"*100; import itertools as it' '"\n".join(it.imap(s.__getitem__, it.imap(slice, xrange(0, len(s), 64), xrange(64, len(s)+1, 64))))'
10000 loops, best of 3: 182 usec per loop