views:

93

answers:

4

I'm looking for an algorithm to generate all permutations with repetition of 4 elements in list(length 2-1000).

Java implementation

The problem is that the algorithm from the link above alocates too much memory for calculation. It creates an array with length of all possible combination. E.g 4^1000 for my example. So i got heap space exception.

Thank you

+2  A: 

Generalized algorithm for lazily-evaluated generation of all permutations (with repetition) of length X for a set of choices Y:

for I = 0 to (Y^X - 1):
    list_of_digits = calculate the digits of I in base Y
    a_set_of_choices = possible_choices[D] for each digit D in list_of_digits
    yield a_set_of_choices 
Amber
+1: same idea as I but general case
kriss
A: 

So you should to use recirsive alghoritm:

http://saj.in/recursive-permutations-generator-in-c/

http://codesnip.net/recursive-permutations-generator-csharp

http://www.codeguru.com/cpp/cpp/algorithms/article.php/c5123

UGEEN
so what with recursivity ? Using space on the stack is not better than stack anywhere else (especially for Java that does not perform Tail-Call Optimization). And this particular question does not require using large amount of memory when implemented iteratively. Also the provided links seems to be about permutations without repetition, that is a different problem.
kriss
+2  A: 

If there is not length limit for repetition of your 4 symbols there is a very simple algorithm that will give you what you want. Just encode your string as a binary number where all 2 bits pattern encode one of the four symbol. To get all possible permutations with repetitions you just have to enumerate "count" all possible numbers. That can be quite long (more than the age of the universe) as a 1000 symbols will be 2000 bits long. Is it really what you want to do ? The heap overflow may not be the only limit...

Below is a trivial C implementation that enumerates all repetitions of length exactly n (n limited to 16000 with 32 bits unsigned) without allocating memory. I leave to the reader the exercice of enumerating all repetitions of at most length n.

#include <stdio.h>

typedef unsigned char cell;
cell a[1000];
int npack = sizeof(cell)*4;

void decode(cell * a, int nbsym)
{
    unsigned i;
    for (i=0; i < nbsym; i++){
        printf("%c", "GATC"[a[i/npack]>>((i%npack)*2)&3]);
    }
    printf("\n");
}

void enumerate(cell * a, int nbsym)
{
    unsigned i, j;
    for (i = 0; i < 1000; i++){
        a[i] = 0;
    }
    while (j <= (nbsym / npack)){
        j = 0;
        decode(a, nbsym);
        while (!++a[j]){
            j++;
        }
        if ((j == (nbsym / npack))
        && ((a[j] >> ((nbsym-1)%npack)*2)&4)){
            break;
        }
    }
}

int main(){
    enumerate(a, 5);
}
kriss
A: 

You know how to count: add 1 to the ones spot, if you go over 9 jump back to 0 and add 1 to the tens, etc..

So, if you have a list of length N with K items in each spot:

int[] permutations = new int[N];
boolean addOne() {  // Returns true when it advances, false _once_ when finished
  int i = 0;
  permutations[i]++;
  while (permutations[i] >= K) {
    permutations[i] = 0;
    i += 1;
    if (i>=N) return false;
    permutations[i]++;
  }
  return true;
}
Rex Kerr