views: 1057

answers: 5
I had an interview today where they asked me to write two "C" functions: one to extract a single bit and the other to extract a range of bits from a character. It took me a while, but I came up with these methods.

#include <assert.h>

/* Extract the single bit at position pos (0 = least significant). */
int extractBit(char byte, int pos) {
    assert( (pos >= 0) && (pos < 8) );
    return ( ( byte & (1<<pos) ) >> pos);
}

/* Extract the offset+1 bits starting at startingPos. */
char extractBitRange(char byte, int startingPos, int offset) {
    assert( ( (startingPos + offset) >= 0) && ( (startingPos + offset) < 8) );
    return ( byte >> startingPos ) & ~(0xff << (offset + 1));
}

But the interviewer kept asking me if I could speed up the code further (in terms of CPU cycles) and whether there was any scope for optimization. I was clearly out of sorts, and I am curious: how would you do this?

+14  A: 

In extractBit, if you shift first, you can mask with 1 instead of (1<<pos). Considering that pos is an argument of the function, that saves a computation.

return (byte >> pos) & 1;

In the second function, I would assert that startingPos and offset are each non-negative instead of asserting that their sum is; it makes more sense that way.

Pascal Cuoq
I'd just make them `unsigned int`s.
Pete Kirkham
Yup. For extractBit my first thought was a lookup table, but this will probably be more efficient on modern CPUs. You could also declare the function as inline in C++ or C99, which will save the function call overhead. As Pete says, you could make the args unsigned ints. Or you could cast the test values that way in the asserts and eliminate the tests against zero - e.g. assert( ((unsigned int)(startingPos + offset)) < 8 ); - negative values will turn into very big positive values, and it really just turns into a slightly different machine-language comparison opcode.
Bob Murphy
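
For illustration, here's a minimal sketch of how those suggestions might be combined (inline functions, the cast-to-unsigned range check, and the shift-then-mask body from the answer above); it's just one way to write it, not code from any of the commenters:

#include <assert.h>

static inline int extractBit(char byte, int pos) {
    /* A single unsigned comparison: a negative pos wraps to a huge value
       and fails the < 8 test. */
    assert((unsigned int)pos < 8);
    return (byte >> pos) & 1;
}

static inline char extractBitRange(char byte, int startingPos, int offset) {
    assert((unsigned int)(startingPos + offset) < 8);
    return (byte >> startingPos) & ~(0xff << (offset + 1));
}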
+6  A: 

A lookup table?

For the single bit extraction, you would need a table of 256 entries for each of 8 possible bits - which is a 2 KB table if stored in characters (256 bytes if you compress everything and use bit operations to get the values out - but then you're back to where you started). For the ranges, you couldn't sensibly define tables for all 36 possible ranges of bits. Of course, if you have some other structure than a lookup table indexed by byte value and bit position (or bit range), then there may be alternatives, but you'd have to explain that.
Jonathan Leffler
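
To make those numbers concrete, here's an illustrative sketch of the 256 x 8 table described above (the names bit_table, build_bit_table and extractBitLUT are made up); at one byte per entry it is exactly 2048 bytes:

static unsigned char bit_table[256][8];   /* 256 * 8 = 2048 bytes */

static void build_bit_table(void) {
    for (int byte = 0; byte < 256; ++byte)
        for (int pos = 0; pos < 8; ++pos)
            bit_table[byte][pos] = (unsigned char)((byte >> pos) & 1);
}

static int extractBitLUT(unsigned char byte, int pos) {
    return bit_table[byte][pos];   /* one indexed load instead of shift + mask */
}

Call build_bit_table() once up front; extractBitLUT(0x2A, 3), for example, returns 1.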
+2  A: 

Another optimization you can do in the range-of-bits function:


~(0xff << (offset + 1))
-->
~(0xfe << offset)

As << 1 is nothing more than *2, you can fold that shift into the constant itself (which, since you are operating on single bytes, just means getting rid of the LSB: 0xff becomes 0xfe).

Ravadre
Neat trick. Another way would be to inject `(8-offset)` zeros at the left with something like `(unsigned int)0xff >> (8 - offset)`; there is still an arithmetic operation on `offset`, but it saves the one's-complement operation.
Pascal Cuoq
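
As a quick, purely illustrative sanity check of Ravadre's rewrite, the two masks agree in the low 8 bits (which is all that matters when masking a byte) for every offset in range:

#include <assert.h>
#include <stdio.h>

int main(void) {
    for (int offset = 0; offset < 8; ++offset) {
        int a = ~(0xff << (offset + 1)) & 0xff;   /* original mask */
        int b = ~(0xfe << offset) & 0xff;         /* rewritten mask */
        assert(a == b);
        printf("offset %d: mask 0x%02x\n", offset, a);
    }
    return 0;
}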
A: 

You can speed up the first function by first shifting to the right and then masking the bit:

int extractBit(char byte, int pos) {
   return (byte >> pos) & 0x01;
}

This saves you one operation.

For the second question, I assume that startingPos is the first bit of the chunk you want to extract and offset is how many bits in the chunk you need. Then you could use this:

char extractBitRange(char byte, int startingPos, int offset) {
   return (byte >> startingPos) & ((1 << offset)-1);
}

Of course you must be careful about the ranges, just as you did in your code.

EDIT: if you want extractBitRange(b,i,0) to behave like extractBit(b,i) and extract a single bit at position i, this variant does that:

return (byte >> startingPos) & ((2 << offset) - 1);
Krystian
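
Here's a small, illustrative usage sketch of the two conventions above; the helper names and the unsigned parameters are mine, not Krystian's. The first version keeps `count` bits, while the EDIT variant keeps `offset + 1` bits, matching the asker's original convention.

#include <stdio.h>

/* Krystian's first version: the third argument is the number of bits kept. */
static int rangeByCount(unsigned char byte, int startingPos, int count) {
    return (byte >> startingPos) & ((1 << count) - 1);
}

/* The EDIT variant: keeps offset + 1 bits, so offset == 0 extracts one bit. */
static int rangeByOffset(unsigned char byte, int startingPos, int offset) {
    return (byte >> startingPos) & ((2 << offset) - 1);
}

int main(void) {
    unsigned char b = 0xB4;                       /* 1011 0100 */
    printf("%x\n", rangeByCount(b, 2, 3));        /* bits 2..4 -> 5 */
    printf("%x\n", rangeByOffset(b, 2, 0));       /* single bit 2 -> 1 */
    return 0;
}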
A: 

If you want it to be really quick, you can use a lookup table. I'm guessing that's what the interviewer was going for (as a final answer to "how can I make this faster").

Basically, that means you create in advance a huge table, mapping every possible combination of parameters to the proper result. For example, you'd have:

byte = 0x0, pos = 0, result = 0
byte = 0x0, pos = 1, result = 0
...
byte = 0x1, pos = 0, result = 1

Obviously this would need to be put into valid C data structures (arrays indexed by the byte and pos). Your function would then just return an element of the array, based on whatever indexing scheme you choose.

For the first function, this wouldn't take up too much memory. We're talking about a byte's worth of values (a byte can have 256 different values) times 8 possible values for the starting pos, which makes an array of 2048 entries.

For the second function, this would actually take a lot more space: you'd need to multiply 256 by the number of possible combinations of starting and ending pos (keeping in mind that some combinations are illegal).

I'm guessing the interviewer just wanted you to answer that this would be a way to speed it up, and then apply the above reasoning to estimate how much space it would cost versus the time saved.

Edan Maor
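
To put a number on that estimate, here's a small illustrative sketch that counts the legal (startingPos, offset) pairs under the asker's convention (both non-negative, startingPos + offset < 8) and the size of the resulting one-byte-per-entry table:

#include <stdio.h>

int main(void) {
    int ranges = 0;
    for (int start = 0; start < 8; ++start)
        for (int offset = 0; start + offset < 8; ++offset)
            ++ranges;                              /* counts the legal ranges */

    printf("legal ranges:  %d\n", ranges);         /* 36 */
    printf("table entries: %d\n", 256 * ranges);   /* 9216, one byte each */
    return 0;
}

That gives 36 legal ranges and 9216 table entries, consistent with the 36 ranges mentioned in an earlier comment.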