If I have this string:

hexstring = '001122334455'

How can I split that into a list so the result is:

hexlist = ['00', '11', '22', '33', '44', '55']

I can't think of a nice, pythonic way to do this :/

+8  A: 
>>> [hexstring[i:i+2] for i in range(0,len(hexstring), 2)]
['00', '11', '22', '33', '44', '55']
SilentGhost
+5  A: 

Alternatively:

>>> hexstring = "01234567"
>>> it=iter(hexstring); [a+b for a,b in zip(it, it)]
['01', '23', '45', '67']

Use itertools.izip instead of zip if you're targeting Python 2.x.

This method is a special case of the grouper recipe from the itertools documentation.
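For reference, here is a rough sketch of what a grouper-style helper looks like when written for Python 3. It is modelled on the itertools recipe rather than copied from it, so treat the exact signature as an assumption; on Python 2.x the import would be izip_longest instead of zip_longest.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into chunks of length n, padding the last chunk."""
    # n references to the *same* iterator, so each output tuple
    # consumes n consecutive items from the input.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

hexstring = "01234567"
pairs = ["".join(chunk) for chunk in grouper(hexstring, 2, fillvalue="")]
print(pairs)  # ['01', '23', '45', '67']
```

With fillvalue="" an odd-length string keeps its last character as a one-character chunk, whereas plain zip(it, it) silently drops it.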


Some micro-benchmarks:

$ python2.6 -m timeit -s 'hexstring = "01234567"*500' '[hexstring[i:i+2] for i in xrange(0,len(hexstring), 2)]'
1000 loops, best of 3: 409 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"*500' '[hexstring[i:i+2] for i in range(0,len(hexstring), 2)]'
1000 loops, best of 3: 438 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"*500' 'it=iter(hexstring); [a+b for a,b in zip(it, it)]'
1000 loops, best of 3: 526 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"*500; from itertools import izip' 'it=iter(hexstring); [a+b for a,b in izip(it, it)]'
1000 loops, best of 3: 406 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"*500; import re; r=re.compile(".{1,2}"); f=r.findall' 'f(hexstring)'
1000 loops, best of 3: 458 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"*500' '[hexstring[i:i+2] for i in range(0,len(hexstring), 2)]'
1000 loops, best of 3: 756 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"*500' 'it=iter(hexstring); [a+b for a,b in zip(it, it)]'
1000 loops, best of 3: 414 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"*500; import re; r=re.compile(".{1,2}"); f=r.findall' 'f(hexstring)'
1000 loops, best of 3: 865 usec per loop


$ python2.6 -m timeit -s 'hexstring = "01234567"' '[hexstring[i:i+2] for i in xrange(0,len(hexstring), 2)]'
1000000 loops, best of 3: 1.52 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"' '[hexstring[i:i+2] for i in range(0,len(hexstring), 2)]'
1000000 loops, best of 3: 1.76 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"' 'it=iter(hexstring); [a+b for a,b in zip(it, it)]'
100000 loops, best of 3: 3.78 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"; from itertools import izip' 'it=iter(hexstring); [a+b for a,b in izip(it, it)]'
100000 loops, best of 3: 2.39 usec per loop

$ python2.6 -m timeit -s 'hexstring = "01234567"; import re; r=re.compile(".{1,2}"); f=r.findall' 'f(hexstring)'
1000000 loops, best of 3: 1.45 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"' '[hexstring[i:i+2] for i in range(0,len(hexstring), 2)]'
100000 loops, best of 3: 2.46 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"' 'it=iter(hexstring); [a+b for a,b in zip(it, it)]'
1000000 loops, best of 3: 1.84 usec per loop

$ python3.1 -m timeit -s 'hexstring = "01234567"; import re; r=re.compile(".{1,2}"); f=r.findall' 'f(hexstring)'
100000 loops, best of 3: 2.07 usec per loop

Observations:

  • With long strings on Python 2.6, @SilentGhost's method and mine are the fastest, provided you use the lazy variants (xrange and izip respectively).
  • With short strings on Python 2.6, @Nick's regular expression is fastest.
  • On Python 3.1 my method is fastest in both cases, though I suspect that is because Python 3.x is less optimized.
  • Of course, premature optimization is evil, etc.
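To repeat these comparisons on a current interpreter, something like the following sketch could be used. The function names (by_slicing, by_zip, by_regex) are made up for illustration, and the numbers will differ by machine and Python version, so no expected timings are shown.

```python
import re
import timeit

hexstring = "01234567" * 500

def by_slicing(s):
    # Slice out two characters at every even index.
    return [s[i:i + 2] for i in range(0, len(s), 2)]

def by_zip(s):
    # Pull two items at a time from one shared iterator.
    it = iter(s)
    return [a + b for a, b in zip(it, it)]

_find_pairs = re.compile(".{1,2}").findall

def by_regex(s):
    # Precompiled pattern, as in the timeit setup strings above.
    return _find_pairs(s)

candidates = [by_slicing, by_zip, by_regex]

# Sanity check: all approaches agree on this even-length input.
expected = by_slicing(hexstring)
assert all(f(hexstring) == expected for f in candidates)

loops = 200
for f in candidates:
    best = min(timeit.repeat(lambda: f(hexstring), number=loops, repeat=3))
    print(f"{f.__name__}: {best / loops * 1e6:.1f} usec per loop")
```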
KennyTM
+1, nice simple code and good micro-benchmarking job!
Alex Martelli
+1  A: 

A slightly odd way:

map(''.join,zip(hexstring[::2],hexstring[1::2]))
Boojum
+2  A: 

Using regular expressions:

>>> import re
>>> re.findall('.{1,2}', '001122334455')
['00', '11', '22', '33', '44', '55']
>>> 
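One thing worth knowing when choosing between the approaches in this thread is how they treat an odd-length string. The small comparison below is only an illustration; the variable names are arbitrary.

```python
import re

odd = "0011223"  # seven characters, so one is left over

by_regex = re.findall(".{1,2}", odd)
by_slicing = [odd[i:i + 2] for i in range(0, len(odd), 2)]

it = iter(odd)
by_zip = [a + b for a, b in zip(it, it)]

print(by_regex)    # ['00', '11', '22', '3']
print(by_slicing)  # ['00', '11', '22', '3']
print(by_zip)      # ['00', '11', '22']
```

The regex and the slicing version keep the trailing character as a short chunk, while the zip-based version drops it. Note also that the pattern needs to be written without a space: '.{1, 2}' is not parsed as a repetition count and matches nothing here.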
Nick D
+1  A: 
hexstring = "01234567"
[''.join(x) for x in zip(*[iter(hexstring)]*2)]
gnibbler