views:

3249

answers:

3

Building on How Do You Express Binary Literals in Python, I was thinking about sensible, intuitive ways to do that Programming 101 chestnut of displaying integers in base-2 form. This is the best I came up with, but I'd like to replace it with a better algorithm, or at least one that should have screaming-fast performance.

def num_bin(N, places=8):
    def bit_at_p(N, p):
        ''' find the bit at place p for number n '''
        two_p = 1 << p   # 2 ^ p, using bitshift, will have exactly one
                         # bit set, at place p
        x = N & two_p    # binary composition, will be one where *both* numbers
                         # have a 1 at that bit.  this can only happen 
                         # at position p.  will yield  two_p if  N has a 1 at 
                         # bit p
        return int(x > 0)

    bits =  ( bit_at_p(N,x) for x in xrange(places))
    return "".join( (str(x) for x in bits) )

    # or, more consisely 
    # return "".join([str(int((N & 1 << x)>0)) for x in xrange(places)])
+1  A: 

Take a look at: Decimal to Binary Conversion (Python)

TK
That's nearly identical to one I have posted, except that it does the digits in highest bit first order, and adds a few guard conditions. Clearly this bit-shift method seems to be the idiom for how to do it in Python.
Gregg Lind
That linked code is confusingly named - there's nothing Denary (base 10) involved. Internally numbers are base 2, but conceptually they're just numbers, no base involved. Only *representations* of numbers have bases.
Brian
@Brian: The representation is exposed via literal, e.g. then you see `a = 10` in Python you know that `a` refers to an object that represents _decimal_ number 10 (in the same time an internal representation i.e, how bits are arranged in memory, is a private matter of the object).
J.F. Sebastian
It would be more accurate to say its the number represented by "10" in decimal, but once you leave the form of a string, its just a number with no base conceptually associated. It's indistinguishable from a number you create with an "0xa" literal.
Brian
The reason its relevant is that I'd expect something called Denary2Binary to actually a take a decimal value. eg. a decimal.Decimal object, which does have an explicit decimal representation.
Brian
+6  A: 

For best efficiency, you generally want to process more than a single bit at a time. You can use a simple method to get a fixed width binary representation. eg.

def _bin(x, width):
    return ''.join(str((x>>i)&1) for i in xrange(width-1,-1,-1))

_bin(x, 8) will now give a zero padded representation of x's lower 8 bits. This can be used to build a lookup table, allowing your converter to process 8 bits at a time (or more if you want to devote the memory to it).

_conv_table = [_bin(x,8) for x in range(256)]

Then you can use this in your real function, stripping off leading zeroes when returning it. I've also added handling for signed numbers, as without it you will get an infinite loop (Negative integers conceptually have an infinite number of set sign bits.)

def bin(x):
    if x == 0: 
        return '0' #Special case: Don't strip leading zero if no other digits
    elif x < 0:
        sign='-'
        x*=-1
    else:
        sign = ''
    l=[]
    while x:
        l.append(_conv_table[x & 0xff])
        x >>= 8
    return sign + ''.join(reversed(l)).lstrip("0")

[Edit] Changed code to handle signed integers.
[Edit2] Here are some timing figures of the various solutions. bin is the function above, constantin_bin is from Constantin's answer and num_bin is the original version. Out of curiosity, I also tried a 16 bit lookup table variant of the above (bin16 below), and tried out Python3's builtin bin() function. All timings were for 100000 runs using an 01010101 bit pattern.

Num Bits:              8       16       32       64      128      256
---------------------------------------------------------------------
bin                0.544    0.586    0.744    1.942    1.854    3.357 
bin16              0.542    0.494    0.592    0.773    1.150    1.886
constantin_bin     2.238    3.803    7.794   17.869   34.636   94.799
num_bin            3.712    5.693   12.086   32.566   67.523  128.565
Python3's bin      0.079    0.045    0.062    0.069    0.212    0.201

As you can see, when processing long values using large chunks really pays off, but nothing beats the low-level C code of python3's builtin (which bizarrely seems consistently faster at 256 bits than 128!). Using a 16 bit lookup table improves things, but probably isn't worth it unless you really need it, as it uses up a large chunk of memory, and can introduce a small but noticalbe startup delay to precompute the table.

Brian
Thanks for the more or less complete answer, which is use 2.6 or 3.0 bin() function :)
Gregg Lind
+2  A: 

Not screaming-fast, but straightforward:

>>> def bin(x):
...     sign = '-' if x < 0 else ''
...     x = abs(x)
...     bits = []
...     while x:
...             x, rmost = divmod(x, 2)
...             bits.append(rmost)
...     return sign + ''.join(str(b) for b in reversed(bits or [0]))

It is also faster than num_bin:

>>> import timeit
>>> t_bin = timeit.Timer('bin(0xf0)', 'from __main__ import bin')
>>> print t_bin.timeit(number=100000)
4.19453350997
>>> t_num_bin = timeit.Timer('num_bin(0xf0)', 'from __main__ import num_bin')
>>> print t_num_bin.timeit(number=100000)
4.70694716882

Even more, it actually works correctly (for my definition of "correctness" :)):

>>> bin(1)
'1'
>>> num_bin(1)
'10000000'
Constantin
"Correctly" is a matter of taste, depending on Little vs. Big-Endian byte order, and whether one wants padding or not. Good recipe though.
Gregg Lind