For an FFT function I need to permute (shuffle) the elements of an array into bit-reversed order. That's a common task with FFTs because most power-of-two-sized FFT functions either expect or return their data in bit-reversed order.

E.g., assume the array has 256 elements. I'd like to swap each element with the element at its bit-reversed index. Here are two examples (in binary):

Element 00000001b should be swapped with element 10000000b
Element 00010111b should be swapped with element 11101000b

and so on.

Any idea how to do this fast and, more importantly, in place?

I already have a function that does this swap. It's not hard to write one. Since this is such a common operation in DSP, I have the feeling that there are more clever ways to do it than my very naive loop.

Language in question is C, but any language is fine.

A: 

Using a pre-built lookup table to do the mapping seems to be the obvious solution. I guess it depends how big the arrays you will be dealing with are. But even if a direct mapping is not possible, I'd still go for a lookup table, maybe of byte-size patterns that you can use to build the word-sized pattern for the final index.

anon
I'd go for nibble size and then swap nibbles, since it's a common assembly command.
SurDin
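As a rough sketch of the nibble idea in C (the names `nibble_rev` and `reverse8_nibble` are made up for illustration, not taken from any library): a 16-entry table reverses each 4-bit half, and the two halves then trade places.

```c
#include <stdint.h>

/* Bit-reversed value of every 4-bit nibble, e.g. 0x1 (0001b) -> 0x8 (1000b). */
static const uint8_t nibble_rev[16] = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

/* Reverse a byte: reverse each nibble via the table, then swap the nibbles. */
static uint8_t reverse8_nibble(uint8_t x)
{
    return (uint8_t)((nibble_rev[x & 0x0F] << 4) | nibble_rev[x >> 4]);
}
```

With this, `reverse8_nibble(0x17)` gives `0xE8`, matching the second example in the question. Whether a 16-byte table beats a 256-byte one depends on the target, so it is worth measuring.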
+5  A: 

A quick way to do this is to swap adjacent single bits, then adjacent 2-bit fields, then the two 4-bit halves. For an unsigned 8-bit value x:

x = (x & 0x55) << 1 | (x & 0xAA) >> 1; // swap adjacent bits
x = (x & 0x33) << 2 | (x & 0xCC) >> 2; // swap 2-bit fields
x = (x & 0x0F) << 4 | (x & 0xF0) >> 4; // swap nibbles

While hard to read, if this is something that needs to be optimized you may want to do it this way.
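For array sizes other than 256, the same field-swapping idea can be widened to 32 bits and the result shifted down to the number of index bits actually in use. The following is only a sketch under that assumption; `reverse32` and `reverse_low_bits` are illustrative names, not part of the original answer.

```c
#include <stdint.h>

/* Reverse all 32 bits by swapping ever larger adjacent fields. */
static uint32_t reverse32(uint32_t x)
{
    x = ((x & 0x55555555u) << 1) | ((x & 0xAAAAAAAAu) >> 1); /* single bits   */
    x = ((x & 0x33333333u) << 2) | ((x & 0xCCCCCCCCu) >> 2); /* 2-bit fields  */
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x & 0xF0F0F0F0u) >> 4); /* nibbles       */
    x = ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8); /* bytes         */
    return (x << 16) | (x >> 16);                            /* 16-bit halves */
}

/* Reverse only the low `bits` bits of x (1 <= bits <= 32),
   e.g. bits = 8 for a 256-element array. */
static uint32_t reverse_low_bits(uint32_t x, unsigned bits)
{
    return reverse32(x) >> (32u - bits);
}
```

For a 1024-point FFT you would call `reverse_low_bits(i, 10)`; `bits` must stay in the range 1 to 32, since shifting a 32-bit value by 32 is undefined in C.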

Nathaniel Flath
This is probably a good trick when you are writing in assembly and want to swap data around in a 32/64-bit register. When you write such code in C and let the compiler optimize it, I feel that the compiler will be forced to use two register variables for the bit-shift operations and end up with less optimal code. Meanwhile, this does not seem to be the crux of Nils' question: he wants the swap algorithm for the specified addresses in an array.
nik
Since time optimization is not something asked for here, I think this way of performing a bit reverse is quite good for the case, particularly when space optimization is of interest.
nik
A: 

Element 00000001b should be swapped with element 10000000b

I think you mean "Element 00000001b should be swapped with element 11111110b" in the first line?

Instead of swapping 256 bytes you could cast the array to (long long*) and swap 32 "long long" values instead; that should be much faster on 64-bit machines (or use 64 long values on a 32-bit machine).

Secondly, if you naively run through the array and swap every value with its complement, then you will swap all elements twice, so you will have done nothing at all :-) So you first have to identify which are the complements and leave them out of your loop.

codymanix
No, I really want to swap bit-reversed elements (i.e. changing the bit-reading order from left->right to right->left is really what I want). Good point about not swapping the elements twice, though. Keeping track of this is one of the things that makes my algorithm slow.
Nils Pipenbrinck
+4  A: 

This code uses a lookup table to reverse 64-bit numbers very quickly. For your C-language example, I also included versions for 32-, 16-, and 8-bit numbers (assumes int is 32 bits). In an object-oriented language (C++, C#, etc), I would have just overloaded the function.

I don't have a C-compiler handy at the moment so, hopefully, I didn't miss anything.

unsigned char ReverseBits[] = 
{
  0x00, 0x80, 0x40, 0xC0, 0x20, 0xA0, 0x60, 0xE0, 0x10, 0x90, 0x50, 0xD0, 0x30, 0xB0, 0x70, 0xF0, 
  0x08, 0x88, 0x48, 0xC8, 0x28, 0xA8, 0x68, 0xE8, 0x18, 0x98, 0x58, 0xD8, 0x38, 0xB8, 0x78, 0xF8, 
  0x04, 0x84, 0x44, 0xC4, 0x24, 0xA4, 0x64, 0xE4, 0x14, 0x94, 0x54, 0xD4, 0x34, 0xB4, 0x74, 0xF4, 
  0x0C, 0x8C, 0x4C, 0xCC, 0x2C, 0xAC, 0x6C, 0xEC, 0x1C, 0x9C, 0x5C, 0xDC, 0x3C, 0xBC, 0x7C, 0xFC, 
  0x02, 0x82, 0x42, 0xC2, 0x22, 0xA2, 0x62, 0xE2, 0x12, 0x92, 0x52, 0xD2, 0x32, 0xB2, 0x72, 0xF2, 
  0x0A, 0x8A, 0x4A, 0xCA, 0x2A, 0xAA, 0x6A, 0xEA, 0x1A, 0x9A, 0x5A, 0xDA, 0x3A, 0xBA, 0x7A, 0xFA,
  0x06, 0x86, 0x46, 0xC6, 0x26, 0xA6, 0x66, 0xE6, 0x16, 0x96, 0x56, 0xD6, 0x36, 0xB6, 0x76, 0xF6, 
  0x0E, 0x8E, 0x4E, 0xCE, 0x2E, 0xAE, 0x6E, 0xEE, 0x1E, 0x9E, 0x5E, 0xDE, 0x3E, 0xBE, 0x7E, 0xFE,
  0x01, 0x81, 0x41, 0xC1, 0x21, 0xA1, 0x61, 0xE1, 0x11, 0x91, 0x51, 0xD1, 0x31, 0xB1, 0x71, 0xF1,
  0x09, 0x89, 0x49, 0xC9, 0x29, 0xA9, 0x69, 0xE9, 0x19, 0x99, 0x59, 0xD9, 0x39, 0xB9, 0x79, 0xF9, 
  0x05, 0x85, 0x45, 0xC5, 0x25, 0xA5, 0x65, 0xE5, 0x15, 0x95, 0x55, 0xD5, 0x35, 0xB5, 0x75, 0xF5,
  0x0D, 0x8D, 0x4D, 0xCD, 0x2D, 0xAD, 0x6D, 0xED, 0x1D, 0x9D, 0x5D, 0xDD, 0x3D, 0xBD, 0x7D, 0xFD,
  0x03, 0x83, 0x43, 0xC3, 0x23, 0xA3, 0x63, 0xE3, 0x13, 0x93, 0x53, 0xD3, 0x33, 0xB3, 0x73, 0xF3, 
  0x0B, 0x8B, 0x4B, 0xCB, 0x2B, 0xAB, 0x6B, 0xEB, 0x1B, 0x9B, 0x5B, 0xDB, 0x3B, 0xBB, 0x7B, 0xFB,
  0x07, 0x87, 0x47, 0xC7, 0x27, 0xA7, 0x67, 0xE7, 0x17, 0x97, 0x57, 0xD7, 0x37, 0xB7, 0x77, 0xF7, 
  0x0F, 0x8F, 0x4F, 0xCF, 0x2F, 0xAF, 0x6F, 0xEF, 0x1F, 0x9F, 0x5F, 0xDF, 0x3F, 0xBF, 0x7F, 0xFF
};


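unsigned long long Reverse64Bits(unsigned long long number)
{
    /* unsigned long long is at least 64 bits wide; unsigned long may be only 32.
       Each table entry is cast before shifting because an unsigned char
       promotes to int, and shifting an int by 32 or more bits is undefined. */
    unsigned long long result;

    result =
        ((unsigned long long)ReverseBits[ number        & 0xff] << 56) |
        ((unsigned long long)ReverseBits[(number >>  8) & 0xff] << 48) |
        ((unsigned long long)ReverseBits[(number >> 16) & 0xff] << 40) |
        ((unsigned long long)ReverseBits[(number >> 24) & 0xff] << 32) |
        ((unsigned long long)ReverseBits[(number >> 32) & 0xff] << 24) |
        ((unsigned long long)ReverseBits[(number >> 40) & 0xff] << 16) |
        ((unsigned long long)ReverseBits[(number >> 48) & 0xff] <<  8) |
        ((unsigned long long)ReverseBits[(number >> 56) & 0xff]);

    return result;
}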

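unsigned int Reverse32Bits(unsigned int number)
{
    unsigned int result;

    /* Cast to unsigned before shifting so that << 24 never shifts
       a set bit into the sign bit of a promoted int. */
    result =
        ((unsigned int)ReverseBits[ number        & 0xff] << 24) |
        ((unsigned int)ReverseBits[(number >>  8) & 0xff] << 16) |
        ((unsigned int)ReverseBits[(number >> 16) & 0xff] <<  8) |
        ((unsigned int)ReverseBits[(number >> 24) & 0xff]);

    return result;
}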

unsigned short Reverse16Bits(unsigned short number)
{
    unsigned short result;

    result = 
        (ReverseBits[ number       & 0xff] <<  8) | 
        (ReverseBits[(number >> 8) & 0xff]);

    return result;
}

unsigned char Reverse8Bits(unsigned char number)
{
    unsigned char result;

    result = (ReverseBits[number]);

    return result;
}
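If typing out 256 constants feels error-prone, the same table can probably be filled at startup instead. This is a sketch, not part of the answer above; `reverse_table` and `init_reverse_table` are hypothetical names. It relies on the fact that the reversal of i is the reversal of i/2 shifted right by one, with i's lowest bit moved to the top.

```c
#include <stdint.h>

/* Zero-initialized, so entry 0 is already correct. */
static uint8_t reverse_table[256];

/* Fill reverse_table so that reverse_table[i] is i with its 8 bits reversed. */
static void init_reverse_table(void)
{
    for (unsigned i = 1; i < 256; i++) {
        /* Reuse the already computed entry for i >> 1, then place
           the low bit of i into bit 7. */
        reverse_table[i] =
            (uint8_t)((reverse_table[i >> 1] >> 1) | ((i & 1u) << 7));
    }
}
```

After calling `init_reverse_table()` once, `reverse_table` should hold the same values as the hand-written table.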

Enjoy,

Robert C. Cartaino

Robert Cartaino
+2  A: 

To swap in place with a single pass, iterate once through all elements in increasing index order. Perform a swap only if the index is less than the reversed index. This avoids the double-swap problem and also skips the palindrome cases (elements 00000000b, 10000001b, 10100101b), which reverse to themselves and need no swap.

// Let data[256] be your element array;
// swap() stands for whatever element-swap helper you use
for (i = 0; i < 256; i++)
{
    j = bit_reverse(i);
    if (i < j)
    {
        swap(data[i], data[j]);
    }
}

bit_reverse() can use Nathaniel's bit-operations trick. bit_reverse() will be called 256 times, but swap() will be called only 120 times: the 16 palindromic indices are skipped, and each of the remaining 240 indices is swapped once as half of a pair.
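Put together as compilable C, the loop might look like the sketch below. The `float` element type, the `bits` parameter and the function names are assumptions for illustration, and the simple shift loop in `bit_reverse` can be replaced by any of the faster methods in this thread.

```c
#include <stddef.h>

/* Reverse the low `bits` bits of i, one bit per iteration. */
static size_t bit_reverse(size_t i, unsigned bits)
{
    size_t r = 0;
    while (bits--) {
        r = (r << 1) | (i & 1);
        i >>= 1;
    }
    return r;
}

/* Permute data[0 .. 2^bits - 1] into bit-reversed order, in place. */
static void bit_reverse_permute(float *data, unsigned bits)
{
    size_t n = (size_t)1 << bits;
    for (size_t i = 0; i < n; i++) {
        size_t j = bit_reverse(i, bits);
        if (i < j) {              /* swap each pair once; skip palindromes */
            float tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

For the 256-element case in the question, the call would be `bit_reverse_permute(data, 8)`. Applying the permutation a second time restores the original order.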

nik
+2  A: 

If you think about what's happening to the bitswapped index, it's being counted up in the same way that the non-bitswapped index is being counted up, just with the bits being used in the reverse order from conventional counting.

Rather than bitswapping the index every time through the loop, you can manually implement a '++' equivalent that uses the bits in the reverse order, and use it to drive a doubly indexed for loop. I've verified that gcc at -O3 inlines the increment function, but as to whether it's any faster than bitswapping the number via a lookup every time, that's for the profiler to say.

Here's an illustrative test program.

#include <stdio.h>

void RevBitIncr( int *n, int bit )
{
    do
    {
        bit >>= 1;
        *n ^= bit;
    } while( (*n & bit) == 0 && bit != 1 );
}

int main(void)
{
    int max = 0x100;
    int i, j;

    for( i = 0, j = 0; i != max; ++i, RevBitIncr( &j, max ) )
    {
        if( i < j )
            printf( "%02x <-> %02x\n", i, j );
    }

    return 0;
}
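To go from printing the index pairs to actually permuting an array, the printf can be replaced by a swap. The sketch below assumes an `int` array whose length `n` is a power of two and at least 2; the function names are illustrative only.

```c
#include <stddef.h>

/* Add 1 to *j in bit-reversed order. n is the array length
   (a power of two, n >= 2). */
static void rev_bit_incr(size_t *j, size_t n)
{
    size_t bit = n;
    do {
        bit >>= 1;
        *j ^= bit;   /* flip the current bit, starting from the top one */
    } while ((*j & bit) == 0 && bit != 1);  /* it became 0: carry downward */
}

/* Permute data[0 .. n-1] into bit-reversed order in place,
   without reversing each index separately. */
static void bit_reverse_permute_incr(int *data, size_t n)
{
    size_t i, j;
    for (i = 0, j = 0; i != n; ++i, rev_bit_incr(&j, n)) {
        if (i < j) {              /* swap each pair exactly once */
            int tmp = data[i];
            data[i] = data[j];
            data[j] = tmp;
        }
    }
}
```

Note that `n` must not be 1: the increment loop would then never terminate, because `bit` starts at 0.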
Charles Bailey