views:

1987

answers:

8

I have n elements. For the sake of an example, let's say, 7 elements, 1234567. I know there are 7! = 5040 permutations possible of these 7 elements.

I want a fast algorithm comprising two functions:

f(number) maps a number between 0 and 5039 to a unique permutation, and

f'(permutation) maps the permutation back to the number that it was generated from.

I don't care about the correspondence between number and permutation, providing each permutation has its own unique number.

So, for instance, I might have functions where

f(0) = '1234567'
f'('1234567') = 0

The fastest algorithm that comes to mind is to enumerate all permutations and create a lookup table in both directions, so that, once the tables are created, f(0) would be O(1) and f('1234567') would be a lookup on a string. However, this is memory hungry, particularly when n becomes large.

Can anyone propose another algorithm that would work quickly and without the memory disadvantage?

+3  A: 

Each element can be in one of seven positions. To describe the position of one element, you would need three bits. That means you can store the position of all the elements in a 32bit value. That's far from being efficient, since this representation would even allow all elements to be in the same position, but I believe the bit-masking should be reasonably fast.

However, with more than 8 positions you'll need something more nifty.

sbi
This assumes that the OP doesn't care if the enumeration actually goes from 0 to 5039, right? If that's okay then this seems like an excellent solution.
Troubadour
A: 

What an interesting question!

If all of your elements are numbers, you might want to consider converting them from strings to actual numbers. Then you would be able to sort all of the permutations by putting them in order, and place them in an array. After that, you would be open to any of the various searching algorithms out there.

Shaka
+2  A: 

This happens to be a built-in function in J:

   A. 1 2 3 4 5 6 7
0
   0 A. 1 2 3 4 5 6 7
1 2 3 4 5 6 7

   ?!7
5011
   5011 A. 1 2 3 4 5 6 7
7 6 4 5 1 3 2
   A. 7 6 4 5 1 3 2
5011
ephemient
+29  A: 

To describe a permutation of n elements, you see that for the position that the first element ends up at, you have n possibilities, so you can describe this with a number between 0 and n-1. For the position that the next element ends up at, you have n-1 remaining possibilities, so you can describe this with a number between 0 and n-2.
Et cetera until you have n numbers.

As an example for n = 5, consider the permutation that brings abcde to caebd.

  • a, the first element, ends up at the second position, so we assign it index 1.
  • b ends up at the fourth position, which would be index 3, but it's the third remaining one, so we assign it 2.
  • c ends up at the first remaining position, which is always 0.
  • d ends up at the last remaining position, which (out of only two remaining positions) is 1.
  • e ends up at the only remaining position, indexed at 0.

So we have the index sequence {1, 2, 0, 1, 0}.

Now you know that for instance in a binary number, 'xyz' means z + 2y + 4x. For a decimal number,
it's z + 10y + 100x. Each digit is multiplied by some weight, and the results are summed. The obvious pattern in the weight is of course that the weight is w = b^k, with b the base of the number and k the index of the digit. (I will always count digits from the right and starting at index 0 for the rightmost digit. Likewise when I talk about the 'first' digit I mean the rightmost.)

The reason why the weights for digits follow this pattern is that the highest number that can be represented by the digits from 0 to k must be exactly 1 lower than the lowest number that can be represented by only using digit k+1. In binary, 0111 must be one lower than 1000. In decimal, 099999 must be one lower than 100000.

Encoding to variable-base
The spacing between subsequent numbers being exactly 1 is the important rule. Realising this, we can represent our index sequence by a variable-base number. The base for each digit is the amount of different possibilities for that digit. For decimal each digit has 10 possibilities, for our system the rightmost digit would have 1 possibility and the leftmost will have n possibilities. But since the rightmost digit (the last number in our sequence) is always 0, we leave it out. That means we're left with bases 2 to n. In general, the k'th digit will have base b[k] = k + 2. The highest value allowed for digit k is h[k] = b[k] - 1 = k + 1.

Our rule about the weights w[k] of digits requires that the sum of h[i] * w[i], where i goes from i = 0 to i = k, is equal to 1 * w[k+1]. Stated recurrently, w[k+1] = w[k] + h[k] * w[k] = w[k]*(h[k] + 1). The first weight w[0] should always be 1. Starting from there, we have the following values:

k    h[k] w[k]    

0    1    1  
1    2    2    
2    3    6    
3    4    24   
...  ...  ...
n-1  n    n!  

(The general relation w[k-1] = k! is easily proved by induction.)

The number we get from converting our sequence will then be the sum of s[k] * w[k], with k running from 0 to n-1. Here s[k] is the k'th (rightmost, starting at 0) element of the sequence. As an example, take our {1, 2, 0, 1, 0}, with the rightmost element stripped off as mentioned before: {1, 2, 0, 1}. Our sum is 1 * 1 + 0 * 2 + 2 * 6 + 1 * 24 = 37.

Note that if we take the maximum position for every index, we'd have {4, 3, 2, 1, 0}, and that converts to 119. Since the weights in our number encoding were chosen so that we don't skip any numbers, all numbers 0 to 119 are valid. There are precisely 120 of these, which is n! for n = 5 in our example, precisely the number of different permutations. So you can see our encoded numbers completely specify all possible permutations.

Decoding from variable-base
Decoding is similar to converting to binary or decimal. The common algorithm is this:

int number = 42;
int base = 2;
int[] bits = new int[n];

for (int k = 0; k < bits.Length; k++)
{
    bits[k] = number % base;
    number = number / base;
}

For our variable-base number:

int n = 5;
int number = 37;

int[] sequence = new int[n - 1];
int base = 2;

for (int k = 0; k < sequence.Length; k++)
{
    sequence[k] = number % base;
    number = number / base;

    base++; // b[k+1] = b[k] + 1
}

This correctly decodes our 37 back to {1, 2, 0, 1} (sequence would be {1, 0, 2, 1} in this code example, but whatever ... as long as you index appropriately). We just need to add 0 at the right end (remember the last element always has only one possibility for its new position) to get back our original sequence {1, 2, 0, 1, 0}.

Permuting a list using an index sequence
You can use the below algorithm to permute a list according to a specific index sequence. It's an O(n²) algorithm, unfortunately.

int n = 5;
int[] sequence = new int[] { 1, 2, 0, 1, 0 };
char[] list = new char[] { 'a', 'b', 'c', 'd', 'e' };
char[] permuted = new char[n];
bool[] set = new bool[n];

for (int i = 0; i < n; i++)
{
    int s = sequence[i];
    int remainingPosition = 0;
    int index;

    // Find the s'th position in the permuted list that has not been set yet.
    for (index = 0; index < n; index++)
    {
        if (!set[index])
        {
            if (remainingPosition == s)
                break;

            remainingPosition++;
        }
    }

    permuted[index] = list[i];
    set[index] = true;
}

Common representation of permutations
Normally you would not represent a permutation as unintuitively as we've done, but simply by the absolute position of each element after the permutation is applied. Our example {1, 2, 0, 1, 0} for abcde to caebd is normally represented by {1, 3, 0, 4, 2}. Each index from 0 to 4 (or in general, 0 to n-1) occurs exactly once in this representation.

Applying a permutation in this form is easy:

int[] permutation = new int[] { 1, 3, 0, 4, 2 };

char[] list = new char[] { 'a', 'b', 'c', 'd', 'e' };
char[] permuted = new char[n];

for (int i = 0; i < n; i++)
{
    permuted[permutation[i]] = list[i];
}

Inverting it is very similar:

for (int i = 0; i < n; i++)
{
    list[i] = permuted[permutation[i]];
}

Converting from our representation to the common representation
Note that if we take our algorithm to permute a list using our index sequence, and apply it to the identity permutation {0, 1, 2, ..., n-1}, we get the inverse permutation, represented in the common form. ({2, 0, 4, 1, 3} in our example).

To get the non-inverted premutation, we apply the permutation algorithm I just showed:

int[] identity = new int[] { 0, 1, 2, 3, 4 };
int[] inverted = { 2, 0, 4, 1, 3 };
int[] normal = new int[n];

for (int i = 0; i < n; i++)
{
    normal[identity[i]] = list[i];
}

Or you can just apply the permutation directly, by using the inverse permutation algorithm:

char[] list = new char[] { 'a', 'b', 'c', 'd', 'e' };
char[] permuted = new char[n];

int[] inverted = { 2, 0, 4, 1, 3 };

for (int i = 0; i < n; i++)
{
    permuted[i] = list[inverted[i]];
}

Note that all the algorithms for dealing with permutations in the common form are O(n), while applying a permutation in our form is O(n²). If you need to apply a permutation several times, first convert it to the common representation.

Joren
awesome answer! +1
Paul Hollingsworth
In "Permuting a list using an index sequence", you mention a quadratic algorithm. This is certainly fine because n is probably going to be very small. This can "easily" be reduced to O(nlogn) though, through an order statistics tree (http://pine.cs.yale.edu/pinewiki/OrderStatisticsTree), i.e. a red-black tree which initially will contains the values 0, 1, 2, ..., n-1, and each node contains the number of descendants below it. With this, one can find/remove the kth element in O(logn) time.
Dimitris Andreou
A: 

I was hasty in my previous answer (deleted), I do have the actual answer though. It is provided by a similar concept, the factoradic, and is related to permutations (my answer related to combinations, I apologize for that confusion). I hate to just post wikipedia links, but I writeup I did awhile ago is unintelligible for some reason. So, I can expand on this later if requested.

nlucaroni
A: 

You can encode permutations using a recursive algorithm. If a N-permutation (some ordering of the numbers {0,..,N-1}) is of the form {x, ...} then encode it as x + N * the encoding of the (N-1)-permutation represented by "..." on the numbers {0, N-1} - {x}. Sounds like a mouthful, here's some code:

// perm[0]..perm[n-1] must contain the numbers in {0,..,n-1} in any order.
int permToNumber(int *perm, int n) {
  // base case
  if (n == 1) return 0;

  // fix up perm[1]..perm[n-1] to be a permutation on {0,..,n-2}.
  for (int i = 1; i < n; i++) {
    if (perm[i] > perm[0]) perm[i]--;
  }

  // recursively compute
  return perm[0] + n * permToNumber(perm + 1, n - 1);
}

// number must be >=0, < n!
void numberToPerm(int number, int *perm, int n) {
  if (n == 1) {
    perm[0] = 0;
    return;
  }
  perm[0] = number % n;
  numberToPerm(number / n, perm + 1, n - 1);

  // fix up perm[1] .. perm[n-1]
  for (int i = 1; i < n; i++) {
    if (perm[i] >= perm[0]) perm[i]++;
  }
}

This algorithm is O(n^2). Bonus points if anyone has an O(n) algorithm.

Keith Randall
A: 

There is a book written about this. Sorry, but I do not remember the name of it (you will find it quite probably from wikipedia). but anyway I wrote a python implementation of that enumeration system: http://kks.cabal.fi/Kombinaattori Some of it is in Finnish, but just copy the code and name variables ...

kummahiih
A: 

The complexity can be brought down to n*log(n), see section 10.1.1 ("The Lehmer code (inversion table)", p.232ff) of the fxtbook: http://www.jjj.de/fxt/#fxtbook skip to section 10.1.1.1 ("Computation with large arrays" p.235) for the fast method. The (GPLed, C++) code is on the same web page.