tags:

views:

126

answers:

6

Is there any neat trick to slice a binary number into groups of five digits in python?

'00010100011011101101110100010111' => ['00010', '00110', '10111', ... ]

Edit: I want to write a cipher/encoder in order to generate "easy to read over the phone" tokens. The standard base32 encoding has the following disadvantages:

  • Potential to generate accidental f*words
  • Uses confusing chars like chars like 'I', 'L', 'O' (may be confused with 0 and 1)
  • Easy to guess sequences ("AAAA", "AAAB", ...)

I was able to roll my own in 20 lines of python, thanks everybody. My encoder leaves off 'I', 'L', 'O' and 'U', and the resulting sequences are hard to guess.

A: 
>>> l = '00010100011011101101110100010111'
>>> def splitSize(s, size):
...     return [''.join(x) for x in zip(*[list(s[t::size]) for t in range(size)])]
...  
>>> splitSize(l, 5)
['00010', '10001', '10111', '01101', '11010', '00101']
>>> 
pyfunc
+4  A: 
>>> a='00010100011011101101110100010111'
>>> [a[i:i+5] for i in range(0, len(a), 5)]
['00010', '10001', '10111', '01101', '11010', '00101', '11']
Greg Hewgill
+1 Like this solution the most, as it uses basic stuff and is quite readable.
Helper Method
A: 

Another way to group iterables, from the itertools examples:

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
David Wolever
+3  A: 
>>> [''.join(each) for each in zip(*[iter(s)]*5)]
['00010', '10001', '10111', '01101', '11010', '00101']

or:

>>> map(''.join, zip(*[iter(s)]*5))
['00010', '10001', '10111', '01101', '11010', '00101']

[EDIT]

The question was raised by Greg Hewgill, what to do with the two trailing bits? Here are some possibilities:

>>> from itertools import izip_longest
>>>
>>> map(''.join, izip_longest(*[iter(s)]*5, fillvalue=''))
['00010', '10001', '10111', '01101', '11010', '00101', '11']
>>>
>>> map(''.join, izip_longest(*[iter(s)]*5, fillvalue=' '))
['00010', '10001', '10111', '01101', '11010', '00101', '11   ']
>>>
>>> map(''.join, izip_longest(*[iter(s)]*5, fillvalue='0'))
['00010', '10001', '10111', '01101', '11010', '00101', '11000']
pillmuncher
+1  A: 

Per your comments, you actually want base 32 strings.

>>> import base64
>>> base64.b32encode("good stuff")
'M5XW6ZBAON2HKZTG'
Allen
A: 

How about using a regular expression?

>>> import re
>>> re.findall('.{1,5}', '00010100011011101101110100010111')
['00010', '10001', '10111', '01101', '11010', '00101', '11']

This will break though if your input string contains newlines, that you want in the grouping.

Peter Gibson