The struct.pack() function can convert integers of up to 64 bits to byte strings. What's the most efficient way to pack an even larger integer? I'd rather not add a dependency on non-standard modules like PyCrypto (which provides num_to_bytes()).
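As suggested by S.Lott in a comment, just convert the number to a decimal string and pack that string. The format has to carry the string's real length, because a fixed "40s" would silently truncate this number, which has more than 3,700 digits. For example,
import struct
x = 2 ** 12345
s = str(x)
struct.pack("%ds" % len(s), s)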
I assume the poster wants to pack a large integer as a binary string, i.e. not use one byte of storage per decimal digit in the number. One way of doing this seems to be:
import marshal
a = 47L
print marshal.dumps(a)
This prints:
'l\x01\x00\x00\x00/\x00'
I can't say that I understand how to interpret these bits, right now ...
I take it you mean you only want to use as many bytes as you need to represent the number? e.g. if the number is:
- 255 or less you'd use only 1 byte
- 65535 or less 2 bytes
- 16777215 or less 3 bytes
- etc etc
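The byte counts in the list above can be computed rather than looked up. This is only a sketch: byte_length is a name I made up, it handles non-negative integers only, and it relies on int.bit_length(), which is available from Python 2.7 / 3.1 onwards.

```python
def byte_length(n):
    """Smallest number of bytes that can hold the non-negative integer n."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    # bit_length() is 0 for n == 0, so reserve at least one byte.
    return max(1, (n.bit_length() + 7) // 8)


for n in (0, 255, 256, 65535, 65536, 16777215, 2 ** 64):
    print("%d needs %d byte(s)" % (n, byte_length(n)))
```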
On the Psion PDA they'd usually have some sort of packing scheme in which you read the first byte, check whether its highest bit is set, and read another byte if it is. That way you just keep reading bytes until you have read the "full" number. That system works quite well if most of the numbers you are dealing with are fairly small, as you'll normally only use one or two bytes per number.
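A continuation-bit scheme like that might look roughly as follows. I don't know the exact layout Psion used, so the details here are assumptions: 7 payload bits per byte, low-order group first, and the function names encode_varint / decode_varint are invented for the example. Negative numbers are not handled.

```python
def encode_varint(n):
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data, pos=0):
    """Return (value, next_position) for the number starting at data[pos]."""
    data = bytearray(data)  # indexing a bytearray yields ints on Python 2 and 3
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos
```

Because decode_varint returns the position after the number, several numbers can be written back to back and read in sequence without any separator.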
The alternative is to have one (or more) bytes representing the number of total bytes used, but at that point it's basically a string in Python anyway. i.e. it's a string of base-256 digits.
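The length-prefixed alternative could be sketched like this. The 4-byte big-endian header and the names pack_with_length / unpack_with_length are my own choices, not an established format, and the sketch assumes Python 3.2 or later for int.to_bytes() and int.from_bytes() (the rest of this thread uses Python 2). Only non-negative integers are handled.

```python
import struct


def pack_with_length(n):
    """4-byte big-endian byte count followed by the base-256 digits of n."""
    if n < 0:
        raise ValueError("only non-negative integers are handled here")
    length = max(1, (n.bit_length() + 7) // 8)
    digits = n.to_bytes(length, "big")  # assumption: Python 3.2+
    return struct.pack(">I", length) + digits


def unpack_with_length(data, pos=0):
    """Return (value, next_position) for the record starting at data[pos]."""
    (length,) = struct.unpack_from(">I", data, pos)
    start = pos + 4
    end = start + length
    return int.from_bytes(data[start:end], "big"), end
```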
Do you mean something like this:
def num_to_bytes(num):
    bytes = []
    num = abs(num) # Because I am unsure about negatives...
    while num > 0:
        bytes.append(chr(num % 256))
        num >>= 8
    return ''.join(reversed(bytes))

def bytes_to_num(bytes):
    num = 0
    for byte in bytes:
        num <<= 8
        num += ord(byte)
    return num

for n in (1, 16, 256, 257, 1234567890987654321):
    print n,
    print num_to_bytes(n).encode('hex'),
    print bytes_to_num(num_to_bytes(n))
Which returns:
1 01 1
16 10 16
256 0100 256
257 0101 257
1234567890987654321 112210f4b16c1cb1 1234567890987654321
I'm just not sure what to do about negatives... I'm not that familiar with bit twiddling.
EDIT: Another solution (which runs about 30% faster by my tests):
def num_to_bytes(num):
    num = hex(num)[2:].rstrip('L')
    if len(num) % 2:
        return ('0%s' % num).decode('hex')
    return num.decode('hex')

def bytes_to_num(bytes):
    return int(bytes.encode('hex'), 16)
This is a bit hacky, but you could go via the hex string representation, and from there to binary with the hex codec:
>>> a = 2**60
>>> a
1152921504606846976L
>>> hex(a)
'0x1000000000000000L'
>>> hex(a).rstrip("L")[2:].decode('hex')
'\x10\x00\x00\x00\x00\x00\x00\x00' # 8bytes, as expected.
>>> int(_.encode('hex'), 16)
1152921504606846976L
It breaks a little because the hex codec requires an even number of digits, so you'll need to pad for that, and you'll need to set a flag to handle negative numbers. Here's a generic pack / unpack:
def pack(num):
    if num < 0:
        num = (abs(num) << 1) | 1 # Hack - encode sign as lowest bit.
    else:
        num = num << 1
    hexval = hex(num).rstrip("L")[2:]
    if len(hexval) % 2 == 1:
        hexval = '0' + hexval
    return hexval.decode('hex')

def unpack(s):
    val = int(s.encode('hex'), 16)
    sign = -1 if (val & 1) else 1
    return sign * (val >> 1)

for i in [10, 4534, 23467, 93485093485, 2**50, 2**60-1, -1, -20, -2**60]:
    assert unpack(pack(i)) == i
With all the fiddling for padding etc required, I'm not sure it's much better than a hand-rolled solution though.
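For anyone on Python 3.2 or later (an assumption, since everything above is Python 2), the built-in int.to_bytes() and int.from_bytes() remove most of that fiddling, including the sign handling. A minimal sketch, with pack_int / unpack_int as made-up names and big-endian two's complement as my choice of layout:

```python
def pack_int(n):
    """Pack any integer as big-endian two's complement."""
    # One extra bit for the sign, rounded up to whole bytes.
    # This can spend one byte more than strictly needed (e.g. for -128).
    length = (n.bit_length() + 8) // 8
    return n.to_bytes(length, "big", signed=True)


def unpack_int(data):
    """Inverse of pack_int."""
    return int.from_bytes(data, "big", signed=True)


for i in [0, 10, 4534, 2**60 - 1, -1, -20, -2**60, 2**12345]:
    assert unpack_int(pack_int(i)) == i
```

Unlike the sign-in-the-lowest-bit hack, this keeps the usual two's complement layout, so the bytes match what struct would produce for values that happen to fit in the same width.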