views:

437

answers:

4

Say I have an array of arbitrary size holding single characters. I want to compute all possible combinations of those characters up to an arbitrary length.

So lets say my array is [1, 2, 3]. The user-specified length is 2. Then the possible combinations are [11, 22, 33, 12, 13, 23, 21, 31, 32].

I'm having real trouble finding a suitable algorithm that allows arbitrary lengths and not just permutates the array. Oh and while speed is not absolutely critical, it should be reasonably fast too.

Can you help? Thanks :)

+2  A: 

Just do an add with carry.

Say your array contained 4 symbols and you want ones of length 3.

Start with 000 (i.e. each symbol on your word = alphabet[0])

Then add up:

000 001 002 003 010 011 ...

The algorithm (given these indices) is just to increase the lowest number. If it reaches the number of symbols in your alphabet, increase the previous number (following the same rule) and set the current to 0.

C++ code:

int N_LETTERS = 4;
char alphabet[] = {'a', 'b', 'c', 'd'};

std::vector<std::string> get_all_words(int length)
{
  std::vector<int> index(length, 0);
  std::vector<std::string> words;

  while(true)
  {
    std::string word(length);
    for (int i = 0; i < length; ++i)
      word[i] = alphabet[index[i]];
    words.push_back(word);

    for (int i = length-1; ; --i)
    { 
      if (i < 0) return words;
      index[i]++;
      if (index[i] == N_LETTERS)
        index[i] = 0;
      else
        break;
    }
  }
}

Code is untested, but should do the trick.

Peter Alexander
Sounds interresting and seems to work in my mind. I'll test it out. Thanks for your answer!
milan1612
Accepted your answer as I got it working and it's really a nice approach. Thank you again!
milan1612
+1  A: 

One way to do it would be with a simple counter that you internally interpret as base N, where N is the number of items in the array. You then extract each digit from the base N counter and use it as an index into your array. So if your array is [1,2] and the user specified length is 2, you

Counter = 0, indexes are 0, 0 Counter = 1, indexes are 0, 1 Counter = 2, indexes are 1, 0 Counter = 3, indexes are 1, 1

The trick here will be your base-10 to base-N conversion code, which isn't terribly difficult.

Jim Mischel
+1  A: 

If you know the length before hand, all you need is some for loops. Say, for length = 3:

for ( i = 0; i < N; i++ )
   for ( j = 0; j < N; j++ )
      for ( k = 0; k < N; k++ )
         you now have ( i, j, k ), or a_i, a_j, a_k

Now to generalize it, just do it recursively, each step of the recursion with one of the for loops:

recurse( int[] a, int[] result, int index)
    if ( index == N ) base case, process result
    else
        for ( i = 0; i < N; i++ ) {
           result[index] = a[i]
           recurse( a, result, index + 1 )
        }

Of course, if you simply want all combinations, you can just think of each step as an N-based number, from 1 to k^N - 1, where k is the length.

Basically you would get, in base N (for k = 4):

0000 // take the first element four times
0001 // take the first element three times, then the second element
0002 
...
000(N-1) // take the first element three times, then take the N-th element
1000 // take the second element, then the first element three times
1001 
..
(N-1)(N-1)(N-1)(N-1) // take the last element four times
Larry
I'm aware of the first one - it is not applicable in my case due to arbitrary lengths. Recursion is not an option either, simply because of performance reasons and possible SO. Thanks for your answer though...
milan1612
If you need to list *everything* and you're worried about stack overflow, you should also probably worry about finishing the algorithm before the universe ends. ;) Unless you have a trivial `N = 1` case, even a modest `2^100` will probably take more than you imagine. If you have a specific case in mind, feel free to let me know!
Larry
Agree with Larry here. Stack overflow is the least of your issues. There are *n^k* words here. If *k* becomes any reasonable size, generating all words will take an unfathomable amount of time.
Peter Alexander
+1  A: 

Knuth covers combinations and permutations in some depth in The Art of Computer Programming, vol 1. Here is an implementation of one of his algorithms I wrote some years ago (don't hate on the style, its ancient code):

#include <algorithm>
#include <vector>
#include <functional>
#include <iostream>
using namespace std;

template<class BidirectionalIterator, class Function, class Size>
Function _permute(BidirectionalIterator first, BidirectionalIterator last, Size k, Function f, Size n, Size level)
{
    // This algorithm is adapted from Donald Knuth, 
    //      "The Art of Computer Programming, vol. 1, p. 45, Method 1"
    // Thanks, Donald.
    for( Size x = 0; x < (n-level); ++x )   // rotate every possible value in to this level's slot
    {
        if( (level+1) < k ) 
            // if not at max level, recurse down to twirl higher levels first
            f = _permute(first,last,k,f,n,level+1);
        else
        {
            // we are at highest level, this is a unique permutation
            BidirectionalIterator permEnd = first;
            advance(permEnd, k);
            f(first,permEnd);
        }
        // rotate next element in to this level's position & continue
        BidirectionalIterator rotbegin(first);
        advance(rotbegin,level);
        BidirectionalIterator rotmid(rotbegin);
        rotmid++;
        rotate(rotbegin,rotmid,last);
    }
    return f;
}

template<class BidirectionalIterator, class Function, class Size>
Function for_each_permutation(BidirectionalIterator first, BidirectionalIterator last, Size k, Function fn)
{
    return _permute<BidirectionalIterator,Function,Size>(first, last, k, fn, distance(first,last), 0);
}   





template<class Elem>
struct DumpPermutation : public std::binary_function<bool, Elem* , Elem*>
{
    bool operator()(Elem* begin, Elem* end) const
    {
        cout << "[";
        copy(begin, end, ostream_iterator<Elem>(cout, " "));
        cout << "]" << endl;
        return true;
    }
};

int main()
{

    int ary[] = {1, 2, 3};
    const size_t arySize = sizeof(ary)/sizeof(ary[0]);

    for_each_permutation(&ary[0], &ary[arySize], 2, DumpPermutation<int>());

    return 0;
}

Output of this program is:

[1 2 ]
[1 3 ]
[2 3 ]
[2 1 ]
[3 1 ]
[3 2 ]

If you want your combinations to include repeated elements like [11] [22] and [33], you can generate your list of combinations using the algorithm above, and then append to the generated list new elements, by doing something like this:

for( size_t i = 0; i < arySize; ++i )
{
    cout << "[";
    for( int j = 0; j < k; ++j )
        cout << ary[i] << " ";
    cout << "]" << endl;
}

...and the program output now becomes:

[1 2 ]
[1 3 ]
[2 3 ]
[2 1 ]
[3 1 ]
[3 2 ]
[1 1 ]
[2 2 ]
[3 3 ]
John Dibling
This won't work for anything other than 2-length words. The missing elements from the permutations are not just the ones only containing one symbol, but any that contain multiple of the same symbol. e.g. 121 isn't a permutation of {1,2,3}, and it isn't uniform either, so your algorithm won't generate it.
Peter Alexander
Why the downvote on this?
John Dibling
The question asker seems to like to downvote things. I don't know, I would like to ask that about my solution too. I up voted yours.
Larry
@Larry: Well, I think I see why it was downvoted. Poita_ was wrong in saying that my code doesn't handle arbitrary length strings, but correct in saying that repeated elements aren't dealt with properly. Still, the asker was looking for an algorithm, and I pointed them to TAoCP, which has many. Which in my opinion makes my response perfectly valid.
John Dibling
>The question asker seems to like to downvote things.<Not really. It wasn't me here, though I downvoted your answer Larry. After I realised that you were right, I already weren't allowed to remove my vote anymore.
milan1612