tags:

views:

525

answers:

2

I was facing this unique problem of generating a bit-mask based on the input parameter. For example,

if param = 2, then the mask will be 0x3 (11b) if param = 5, then the mask will be 0x1F (1 1111b)

This I implemented using a for-loop in C, something like

int nMask = 0;
for (int i = 0; i < param; i ++) {

    nMask |= (1 << i);
}

I would like to know if there is a better algorithm ~~~

+16  A: 

One thing to notice about bitmasks like that is that they are always one less than a power of two.

The expression "1 << n" is the easy way to get the n-th power of two.

You don't want Zero to provide a bitmask of "00000001", you want it to provide zero. So you need to subtract one.

mask = (1 << param) - 1;

Edit:

If you want a special case for param > 32:

int sizeInBits = sizeof(mask) * BITS_PER_BYTE; // BITS_PER_BYTE = 8;
mask = (param >= sizeInBits ? -1 : (1 <<  param) - 1);

This method should work for 16, 32, or 64 bit integers, but you may have to explicitly type the '1'.

John Gietzen
Nice idea, subtracting to get all ones :)
AraK
Thanks. I figured it out when I was building binary trees for a single elimination bracket.
John Gietzen
This is the canonical solution, with two caveats. First, you should probably be using `unsigned int` for `mask` and `1U` as the left side of the shift operator, and secondly be aware that the result is unspecified if `param` is equal or greater than the number of bits in `int` (or one less than the number of bits, if you continue to use signed math). If this is a problem in your environment, use a lookup table instead.
caf
@caf: Good points, but a lookup table won't fix things if there aren't enough bits in your environment's unsigned type!
j_random_hacker
A lookup table will work for the case of param *equal* to the number of bits in `unsigned int` (the "all ones" case), though.
caf
Right. Will the memory look-up be faster than the bit-wise arithmetic and subtraction?
John Gietzen
@caf: True, but since that's the only case where the lookup table would be an improvement, I'd probably just special-case it with an `if (param == sizeof (unsigned) * CHAR_BIT) { mask = (unsigned) -1; }` instead, which is guaranteed (in C++ at least) to give you all-bits-on.
j_random_hacker
Also, while the C++ standard strictly says that the case `param == width_of_unsigned_in_bits` produces Undefined Behaviour for left-shift, it would be very surprising to meet an implementation that did not just produce the value 0 in this case. So in practice I wouldn't actually bother with the special case `if`, since the mainline code handles it fine.
j_random_hacker
The reason it says it is undefined is because it is usually translated directly a processor op, and there can be no guarantees made on all processors. I expect at least one processor somewhere produces a different result for left shift. But on x86 and compatible processors, I think you can be safe to say that the main line code will work.
John Gietzen
Have you actually tested it? On my x86, under gcc, it produces zero as the mask for `param = 32`, not all-ones (because the x86 shift actually shifts by param modulo 32). I don't think the lookup table would be significantly slower in most cases.
caf
Ah!, didn't realize that. No I had not tested it.
John Gietzen
@caf: Interesting! Same behaviour on MSVC++9 on x86. I stand corrected. My instinct is still to go with the special-case `if` test, but a lookup table should be roughly as fast.
j_random_hacker
+3  A: 

For those interested, this is the lookup-table alternative discussed in comments to the other answer - the difference being that it works correctly for a param of 32. It's easy enough to extend to the 64 bit unsigned long long version, if you need that, and shouldn't be significantly different in speed (if it's called in a tight inner loop then the static table will stay in at least L2 cache, and if it's not called in a tight inner loop then the performance difference won't be important).

unsigned long mask2(unsigned param)
{
    static const unsigned long masks[] = {
        0x00000000UL, 0x00000001UL, 0x00000003UL, 0x00000007UL,
        0x0000000fUL, 0x0000001fUL, 0x0000003fUL, 0x0000007fUL,
        0x000000ffUL, 0x000001ffUL, 0x000003ffUL, 0x000007ffUL,
        0x00000fffUL, 0x00001fffUL, 0x00003fffUL, 0x00007fffUL,
        0x0000ffffUL, 0x0001ffffUL, 0x0003ffffUL, 0x0007ffffUL,
        0x000fffffUL, 0x001fffffUL, 0x003fffffUL, 0x007fffffUL,
        0x00ffffffUL, 0x01ffffffUL, 0x03ffffffUL, 0x07ffffffUL,
        0x0fffffffUL, 0x1fffffffUL, 0x3fffffffUL, 0x7fffffffUL,
        0xffffffffUL };

    if (param < (sizeof masks / sizeof masks[0]))
        return masks[param];
    else
        return 0xffffffffUL; /* Or whatever else you want to do in this error case */
}

It's worth pointing out that if you need the if() statement (because are worried that someone might call it with param > 32), then this doesn't win you anything over the alternative from the other answer:

unsigned long mask(unsigned param)
{
    if (param < 32)
        return (1UL << param) - 1;
    else
        return -1;
}

The only difference is that the latter version has to special case param >= 32, whereas the former only has to special case param > 32.

caf