I can think of some nasty inefficient ways to accomplish this task, but I'm wondering what the best way is.

For example, I want to copy 10 bytes starting at the 3rd bit of a byte and write them to a destination pointer, as an ordinary copy would.

Is there a better way than to copy one shifted byte at a time?

Thanks

A: 

This is the solution I coded and started using.

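typedef unsigned char uchar;
typedef unsigned short ushort;

/* Copies len bytes from pSource to pDest, shifting the whole buffer right
   by shiftOffset bits (0..7). Bits shifted out of a byte carry into the next one. */
void RightShiftMemCopy(const uchar * pSource, uchar * pDest, ushort len, uchar shiftOffset)
{
    ushort i;

    if (len == 0)
        return; /* otherwise len-1 would wrap around */

    /* Work from the last byte back towards the first. */
    pDest += (len - 1);
    pSource += (len - 1);

    for (i = len - 1; i != 0; --i)
    {
        /* Join the previous byte and this byte, then shift the pair right. */
        *pDest = (uchar)((*(pSource - 1) << 8 | *pSource) >> shiftOffset);

        --pDest;
        --pSource;
    }

    /* The first byte has no predecessor, so zeros are shifted in. */
    *pDest = (uchar)(*pSource >> shiftOffset);
}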
Ryu
Don't call `memcpy()` to copy one byte. Just say `destData[i] = val`. Your use of `pSource` here is a poor choice of name. And its use in the `memcpy` call assumes a particular endianness.
RBerteig
+3  A: 

On x86 the smallest unit you can access is a byte. However, you can load 4 bytes at a time and work on 4 bytes at a time instead of one. For greater speed you can use SSE2 shifts; note that pslldq shifts by whole bytes, so bit-granular shifts need instructions such as psllq/psrlq. Of course, make sure your copies are aligned for maximum performance.
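
To illustrate the 4-bytes-at-a-time idea, here is a minimal sketch in portable C rather than SSE2. The names `load_be32`, `store_be32` and `copy_from_bit_offset` are made up for this example, and it assumes the bits are numbered MSB-first within each byte and that the caller can supply one extra readable source byte when the offset is non-zero. It is one possible way to do it, not a tuned implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load 4 bytes as a big-endian 32-bit value (works on any host endianness). */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value as 4 big-endian bytes. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Copy n bytes' worth of bits into dst, starting bit_off bits (0..7) into
   src[0], MSB-first. Assumption: when bit_off != 0, src has n + 1 readable
   bytes, because the last output byte borrows bits from src[n]. */
void copy_from_bit_offset(unsigned char *dst, const unsigned char *src,
                          size_t n, unsigned bit_off)
{
    size_t i = 0;

    if (bit_off == 0) {          /* byte-aligned: plain memcpy is enough */
        memcpy(dst, src, n);
        return;
    }

    /* Bulk: produce 4 output bytes per iteration from two 32-bit words.
       This reads src[i .. i+7], so it needs i + 8 <= n + 1. */
    while (i + 8 <= n + 1) {
        uint32_t hi = load_be32(src + i);
        uint32_t lo = load_be32(src + i + 4);
        store_be32(dst + i, (hi << bit_off) | (lo >> (32 - bit_off)));
        i += 4;
    }

    /* Tail: one output byte at a time, reading src[i] and src[i + 1]. */
    for (; i < n; ++i)
        dst[i] = (unsigned char)((src[i] << bit_off) |
                                 (src[i + 1] >> (8 - bit_off)));
}
```

For the case in the question, a call might look like `copy_from_bit_offset(dst, src, 10, 3)`, if "the 3rd bit" is taken to mean a bit offset of 3. The bulk loop reloads the second word on the next iteration; carrying it over in a variable would save one load per step.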

Alexandru
Has nothing to do with what the asker wanted. He wants to copy data that has sub-byte alignment, you're just telling him how to implement a worse `memcpy` than the one he already has.
SamB
@SamB: Not really. Bitshifting whole words at a time will be a lot faster than doing it a byte at a time.
R..
+4  A: 

The general approach is to read the source buffer as efficiently as possible, and shift it as required on the way to writing the destination buffer.

You don't have to work a byte at a time. You can get the source reads long-aligned for the bulk of the operation by handling up to three bytes individually at the beginning, and handling the end the same way, since you must not read past the stated source buffer length.

From the values read, you shift as required to get the bit alignment desired and assemble finished bytes for writing to the destination. You can also do the same optimization of writes to the widest aligned word size you can.

If you dig around in the source of a compression tool or library that makes extensive use of variable-width tokens (zlib, MPEG, TIFF, and JPEG all leap to mind), you will likely find sample code that treats an input or output buffer as a stream of bits, which should give you some implementation ideas to think about.
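
As a rough idea of what such bit-stream code tends to look like, here is a small sketch of a reader that keeps unread bits in an accumulator. It is not taken from any of the libraries mentioned; `BitReader`, `br_init` and `br_read` are hypothetical names, and the MSB-first bit order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const unsigned char *data; /* source buffer */
    size_t len;                /* buffer length in bytes */
    size_t pos;                /* index of the next byte to load */
    uint64_t acc;              /* loaded bits; the low nbits are unread */
    unsigned nbits;            /* number of unread bits in acc */
} BitReader;

static void br_init(BitReader *br, const unsigned char *data, size_t len)
{
    br->data = data;
    br->len = len;
    br->pos = 0;
    br->acc = 0;
    br->nbits = 0;
}

/* Read count bits (1..32), MSB-first, into *out.
   Returns 0 on success, -1 if the buffer does not hold enough bits. */
static int br_read(BitReader *br, unsigned count, uint32_t *out)
{
    /* Refill one byte at a time until enough bits are available. */
    while (br->nbits < count) {
        if (br->pos >= br->len)
            return -1;         /* never read past the end of the buffer */
        br->acc = (br->acc << 8) | br->data[br->pos++];
        br->nbits += 8;
    }

    /* Take the oldest unread bits and mask off anything already consumed. */
    br->nbits -= count;
    *out = (uint32_t)((br->acc >> br->nbits) & ((UINT64_C(1) << count) - 1));
    return 0;
}
```

With a reader like this, copying from a sub-byte offset becomes: call `br_read` once with the offset to skip the leading bits, then read 8 bits at a time (or up to 32) and store each result in the destination. Refilling a byte at a time keeps the sketch short; real implementations usually refill with wider aligned loads.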

RBerteig