I need help identifying the following number format.

For example, the following number format in MIB:

0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]

It seems that if the MSB of a byte is 1, another byte follows it, and if it is 0, that byte is the end of the number.

So the value 2680 is [001 0100] [111 1000], which, repacked into full bytes, is [0000 1010] [0111 1000].

What is this number format called, and what's a good way to compute it besides bit manipulation and shifting into a larger unsigned integer?

A: 

I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity

This is stored big-endian (most significant 7-bit group first), as opposed to the C# BinaryReader.Read7BitEncodedInt method, which reads the least significant group first; that method is described at http://stackoverflow.com/questions/1550560/encoding-an-integer-in-7-bit-format-of-c-binaryreader-readstring
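
For contrast, here is a minimal sketch of the least-significant-group-first layout, which is my understanding of what Read7BitEncodedInt expects. The names encode_7bit_le and decode_7bit_le are hypothetical; the point is only that the same value comes out in a different byte order.

```python
def encode_7bit_le(n):
    """Encode a non-negative int with the low 7 bits first."""
    if n < 0:
        raise ValueError("cannot encode a negative number")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(0x80 | group)  # high bit set: more bytes follow
        else:
            out.append(group)         # high bit clear: last byte
            return bytes(out)


def decode_7bit_le(data):
    """Decode the first low-group-first 7-bit value in data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            return result
    raise ValueError("truncated value")
```

With this layout 2680 encodes as 0xF8 0x14, whereas the format in the question gives 0x94 0x78.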

I am not aware of any method of decoding other than bit manipulation.

Sample PHP code can be found at http://php.net/manual/en/function.intval.php#62613

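Or, in Python 3, I would do something like:

def encode_7bhm(i):
    # encode a non-negative int as a big-endian 7-bit has-more byte string
    if i < 0:
        raise ValueError("cannot encode a negative number")

    # last byte: low 7 bits, high bit clear (marks the end)
    o = [i & 0x7f]
    i //= 128

    while i > 0:
        # earlier bytes: next 7 bits, high bit set (more bytes follow)
        o.insert(0, 0x80 | (i & 0x7f))
        i //= 128

    return bytes(o)


def decode_7bhm(s):
    # decode the first 7bhm value found in the bytes object s
    o = 0

    for v in s:
        o = 128*o + (v & 0x7f)

        if v & 0x80 == 0:
            # found end of encoded value
            break
    else:
        # out of bytes, and end not found - error!
        raise ValueError("truncated 7bhm value")

    return o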
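
If a buffer holds several such values back to back, a generator can split them as it goes. This is only a sketch under that assumption, written for Python 3 bytes; iter_7bhm is a hypothetical name, not part of any library.

```python
def iter_7bhm(data):
    """Yield each big-endian 7-bit has-more value found in data, in order."""
    value = 0
    pending = False  # True while we are in the middle of a value
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
        pending = True
        if byte & 0x80 == 0:
            # high bit clear: this byte ends the current value
            yield value
            value = 0
            pending = False
    if pending:
        # ran out of bytes while the has-more bit was still set
        raise ValueError("truncated 7bhm value at end of data")
```

For example, list(iter_7bhm(bytes([0x94, 0x78, 0x05]))) gives [2680, 5], matching the 0x94 0x78 = 2680 example from the question.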
Hugh Bothwell