I need a compact representation of an array of booleans. Does Python have a built-in bitfield type, or will I need to find an alternative solution?
NumPy has an array interface module that you can use to make a bitfield.
The BitVector package may be what you need. It isn't bundled with my Python installation, but it's easy to track down on the Python site.
http://pypi.python.org/pypi/BitVector/1.5.1 is the current version as of today.
Bitarray was the best answer I found when I recently had a similar need. It's a C extension (so much faster than BitVector, which is pure Python) and stores its data in an actual bitfield (so it's eight times more memory efficient than a NumPy boolean array, which appears to use a byte per element).
If your bitfield is short, you can probably use the struct module. Otherwise I'd recommend some sort of a wrapper around the array module.
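As a rough idea of what such a wrapper around the array module could look like, here is a minimal sketch. The class name `BitField` and its methods are made up for illustration, not part of the standard library; it packs eight flags into each unsigned byte of an `array('B')`.

```python
from array import array


class BitField:
    """Fixed-size array of booleans packed eight to a byte (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        # One unsigned byte per eight flags, rounded up; all flags start False.
        self._bytes = array('B', [0]) * ((size + 7) // 8)

    def _locate(self, index):
        if not 0 <= index < self.size:
            raise IndexError("bit index out of range")
        # divmod gives (byte offset, bit offset within that byte).
        return divmod(index, 8)

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        byte, bit = self._locate(index)
        return bool(self._bytes[byte] & (1 << bit))

    def __setitem__(self, index, value):
        byte, bit = self._locate(index)
        if value:
            self._bytes[byte] |= 1 << bit
        else:
            # Mask with 0xFF so the result still fits in an unsigned byte.
            self._bytes[byte] &= ~(1 << bit) & 0xFF


flags = BitField(20)   # 20 flags stored in 3 bytes
flags[3] = True
flags[17] = True
flags[3] = False
```

Negative indices and slices are deliberately left out to keep the sketch short.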
Also, the ctypes module does contain bitfields, but I've never used it myself. Caveat emptor.
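For what it's worth, a ctypes bitfield is declared by giving a bit width as the third item of each `_fields_` entry. The sketch below is only an illustration; the structure name `Flags` and the field names are invented, and the exact memory layout is up to the platform's C compiler conventions.

```python
import ctypes


class Flags(ctypes.Structure):
    # Each entry is (name, integer C type, width in bits).
    _fields_ = [
        ("ready", ctypes.c_uint8, 1),
        ("error", ctypes.c_uint8, 1),
        ("mode", ctypes.c_uint8, 3),
    ]


f = Flags()      # all fields start at zero
f.ready = 1
f.mode = 5       # fits in the 3-bit field (maximum value 7)
```

This suits a handful of named flags that mirror a C struct better than a long array of booleans.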
I use the bit-wise operators ~, &, |, ^, >>, and <<. They work really well and are implemented directly in the underlying C, which usually maps directly onto the underlying hardware.
Represent each of your values as a power of two:
testA = 2**0
testB = 2**1
testC = 2**3
Then to set a value true:
table = table | testB
To set a value false:
table = table & (~testC)
To test for a value:
bitfield_length = 0xff
if (table & testB & bitfield_length) != 0:
    print("Field B set")
Dig a little deeper into hexadecimal representation if this doesn't make sense to you. This is basically how you keep track of your boolean flags in an embedded C application as well (if you have limited memory).
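Put together, the set/clear/test steps look roughly like this. It is a self-contained sketch that reuses the flag names from above and assumes `table` starts out as 0, which the snippets above leave implicit.

```python
# Each flag occupies its own bit, so each value is a distinct power of two.
testA = 2**0
testB = 2**1
testC = 2**3

table = 0                 # assumed starting state: no flags set

table = table | testB     # set B
table = table | testC     # set C
table = table & ~testC    # clear C again

b_set = (table & testB) != 0   # True: B is still set
c_set = (table & testC) != 0   # False: C was cleared
```

Because Python ints grow as needed, the same pattern works for far more than 32 or 64 flags, at the cost of creating a new int on every update.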
You should take a look at the bitstring module, which has recently reached version 2.0. The binary data is compactly stored as a byte array and can be easily created, modified and analysed.
You can create BitString objects from binary, octal, hex, integers (big or little endian), strings, bytes, floats, files and more.
from bitstring import BitString, pack
a = BitString('0xed44')
b = BitString('0b11010010')
c = BitString(int=100, length=14)
d = BitString('uintle:16=55, 0b110, 0o34')
e = BitString(bytes='hello')
f = pack('<2H, bin:3', 5, 17, '001')
You can then analyse and modify them with simple functions or slice notation - no need to worry about bit masks etc.
a.prepend('0b110')
if '0b11' in b:
    c.reverse()
g = a.join([b, d, e])
g.replace('0b101', '0x3400ee1')
if g[14]:
    del g[14:17]
else:
    g[55:58] = 'uint:11=33, int:9=-1'
There is also a concept of a bit position, so that you can treat it like a file or stream if that's useful to you. Properties are used to give different interpretations of the bit data.
w = g.read(10).uint
x, y, z = g.readlist('int:4, int:4, hex:32')
if g.peek(8) == '0x00':
    g.pos += 10
Plus there's support for the standard bit-wise binary operators, packing, unpacking, endianness and more. The latest version is for Python 2.6 to 3.1, and although it's pure Python it is reasonably well optimised in terms of memory and speed.
If you want to use ints (or long ints) to represent arrays of bools (or sets of integers), take a look at http://sourceforge.net/projects/pybitop/files/
It provides insert/extract of bitfields into long ints; finding the most-significant, or least-significant '1' bit; counting all the 1's; bit-reversal; stuff like that which is all possible in pure python but much faster in C.
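For comparison, here is roughly what those operations look like in pure Python. This is only a sketch: the function names are my own and are not pybitop's API, and all of them assume non-negative ints.

```python
def count_ones(n):
    """Number of 1 bits in n."""
    return bin(n).count("1")


def highest_set_bit(n):
    """Index of the most-significant 1 bit, or -1 if n is 0."""
    return n.bit_length() - 1


def lowest_set_bit(n):
    """Index of the least-significant 1 bit, or -1 if n is 0."""
    # n & -n isolates the lowest set bit.
    return (n & -n).bit_length() - 1


def extract_field(n, start, width):
    """Read `width` bits of n, starting at bit `start`."""
    return (n >> start) & ((1 << width) - 1)


def insert_field(n, start, width, value):
    """Return n with `width` bits at `start` replaced by value."""
    mask = ((1 << width) - 1) << start
    return (n & ~mask) | ((value << start) & mask)


def reverse_bits(n, width):
    """Reverse the lowest `width` bits of n (assumes n < 2**width)."""
    return int(format(n, "0{}b".format(width))[::-1], 2)
```

These go through strings or several intermediate ints, which is the overhead a C implementation avoids.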