tags:

views:

133

answers:

4

I'm writing an application to parse certain network packets. A packet field contains the protocol version number in an octet, so that 4 high bits are the 'major' and low 4 are the 'minor' version. Currently I am parsing them as follows, but am wondering if there is a prettier or more 'pythonic' way of doing it:

    v = ord(data[17])
    major = (v & int('11110000',2) ) >> 4
    minor = v & int('00001111',2)
+2  A: 

You can write binary literals like this0b1111000

For your example I would proabbly use hex though

v = ord(data[17])
major = (v & 0xF0) >> 4
minor = (v & 0x0F)

You might also want to use the struct module to break the packet into its components

gnibbler
0v11110000 does not work as I am stuck with 2.4 python, but using hex makes sense, thanks
Kimvais
Oh yeah, and thanks for the struct tip!
Kimvais
+1  A: 

It would be neater to use literals instead of calling int. You can use binary literals or hex, for example:

major = (v & 0xf0) >> 4
minor = (v & 0x0f)

Binary literals only work for Python 2.6 or later and are of the form 0b11110000. If you are using Python 2.6 or later then you might want to look at the bytearray type as this will let you treat the data as binary and so not have to use the call to ord.

If you are parsing binary data and finding that you are having to do lots of bit wise manipulations then you might like to try a more general solution as there are some third-party module that specialise in this. One is hachoir, and a lower level alternative is bitstring (which I wrote). In this your parsing would become something like:

major, minor = data.readlist('uint:4, uint:4')

which can be easier to manage if you're doing a lot of such reads.

Scott Griffiths
A: 

Just one hint, you'd better apply the mask for the major after shifting, in case it's a negative number and the sign is kept:

major = (v >> 4) & 0x0f
minor = (v & 0x0f)
fortran
Your syntax seems invalid, and I think ord will only ever return a positive number.
Scott Griffiths
I'm glad to see that you're good spotting typos... when playing with bitmasks it's better to make sure that your code is robust against eventualities like an interface change (I'm not expecting ord to change, but it's not unlikely that this snippet could be refactored and having a bug caused by a sign expansion while shifting left is harder to spot than an invalid syntax)
fortran
Scott Griffiths
I meant shift _right_... I'm not paying much attention. It's true that in Python that is less prone to happen (as it automatically upcasts to longints and crops the sign), but it's still a good practice to mask after shifting to keep the code meaningful in other languages, and in case of interfacing with C via CTypes, for example...
fortran
A: 

Well named functions are always a good way to hide ugliness and irrelevant complexity. This way the bit-fiddling is confined to small and easily proven correct functions while the higher level code can refer to the purpose of the fiddling.

def high_nibble(byte):
    """Get 4 high order bits from a byte."""
    return (byte >> 4) & 0xF

def low_nibble(byte):
    """Get 4 low order bits from a byte."""
    return byte & 0xF

def parse_version(version_byte):
    """Get the major-minor version tuple from the version byte."""
    return high_nibble(version_byte), low_nibble(version_byte)

major, minor = parse_version(version_byte)
Ants Aasma