views:

523

answers:

8

I have several data that looks like this:

Vector1_elements = T,C,A
Vector2_elements = C,G,A
Vector3_elements = C,G,T
..... up to ...
VectorK_elements = ...

#Note also that the member of each vector is always 3.

What I want to do is to create all combination of elements in Vector1 through out VectorK. Hence in the end we hope to get this output (using Vector1,2,3):

TCC
TCG
TCT
TGC
TGG
TGT
TAC
TAG
TAT
CCC
CCG
CCT
CGC
CGG
CGT
CAC
CAG
CAT
ACC
ACG
ACT
AGC
AGG
AGT
AAC
AAG
AAT

The problem I am having now is that the following code of mine does that by hardcoding the loops. Since number of Vectors can be varied, we need a flexible way to get the same result. Is there any?

This code of mine can only handle up to 3 Vectors (hardcoded):

#include <iostream>
#include <vector>
#include <fstream>
#include <sstream>
using namespace std;


int main  ( int arg_count, char *arg_vec[] ) {

    vector <string> Vec1;
          Vec1.push_back("T");
          Vec1.push_back("C");
          Vec1.push_back("A");

    vector <string> Vec2;
          Vec2.push_back("C");
          Vec2.push_back("G");
          Vec2.push_back("A");

    vector <string> Vec3;
          Vec3.push_back("C");
          Vec3.push_back("G");
          Vec3.push_back("T");



     for (int i=0; i<Vec1.size(); i++) {
        for (int j=0; j<Vec2.size(); j++) {
            for (int k=0; k<Vec1.size(); k++) {
                cout << Vec1[i] << Vec2[i] << Vec3[k] << endl;
            }
        }
     }



    return 0;
}
A: 

The simplest way to approach this is to use recursion. The function will have one loop in it and will call itself, merging itself with the output of the recursive call. Of course, recursion can be converted to iteration if you're worried about stack space, but at least as a starting point, the recursive solution will probably be easiest for you.

pumpkin
+1  A: 

Combining three vectors is essentially the same as first combining two vectors, and then combining the third one with the result.

So it all boils down to writing a function that can combine two vectors.

std::vector< std::string > combine(std::vector< std::string > const & inLhs, std::vector< std::string > const & inRhs) {
    std::vector< std::string > result;
    for (int i=0; i < inLhs.size(); ++i) {
        for (int j=0; j < inRhs.size(); ++j) {
            result.push_back(inLhs[i] + inRhs[j]);
        }
    }
    return result;
}

And then something like:

std::vector< std::string > result = combine(Vec1, Vec2);
result = combine(result, Vec3);

and so on for every vector you need combined.

Note that it's more the "C++ way" to use input and output iterators i.s.o. passing vectors around, and much more efficient. In the above version the vector gets copied over and over...

I simply used vectors to stay closer to your original code and, hopefully, make more sense to you.

Pieter
+4  A: 

This will do the trick:

void printAll(const vector<vector<string> > &allVecs, size_t vecIndex, string strSoFar)
{
    if (vecIndex >= allVecs.size())
    {
        cout << strSoFar << endl;
        return;
    }
    for (size_t i=0; i<allVecs[vecIndex].size(); i++)
        printAll(allVecs, vecIndex+1, strSoFar+allVecs[vecIndex][i]);
}

Call with:

printAll(allVecs, 0, "");
interjay
@interjay: Thanks. How can I modify your code so that the function return a string instead of printing it?
neversaint
You can push the strings into an additional vector<string> instead of printing. This result vector can either be a global variable, or be passed as a reference to this function.
interjay
+3  A: 

The basic difficulty with recursion here is that you need to keep track of the entire list of indices (or else construct the string incrementally, as another question points out).

An expedient way to handle this problem without constructing additional objects inside the loops is to hand your recursive function a vector of indices, of the same length as the vector of vectors:

void printcombos(const vector<vector<string> >&vec,vector<int>&index,int depth) {
  if(depth==index.length()) {
    for(int i=0; i<depth; ++i) {
      cout<<vec[i][index[i]];
    }
    cout<<endl;
  } else {
    const vector<string> &myvec= vec[depth];
    int mylength= myvec.length();
    for(int i=0; i<mylength; ++i) {
      index[depth]=i;
      printcombos(vec,index,depth+1);
    }
  }
}
comingstorm
+4  A: 

A C++0x solution. Provided, of course, your compiled supports it (currently GCC 4.5 and VS2010, I think).

The following compiles and works with GCC 4.5 using -std=c++0x switch. The use of variadic templates makes it possible to combine arbitrary number of containers. I am sure you can come up with a more idiomatic solution.

#include <vector>       
#include <string>
#include <sstream>
#include <iostream>
#include <algorithm>

typedef std::vector<std::string> myvec;

// Base case.
void combine2(const std::string &row) {
    std::cout << row << std::endl;
}

// Recursive variadic template core function.
template<class T0, class ...T>
void combine2(const std::string &row, const T0& cont0, T...cont_rest) {
    for (auto i = cont0.begin(); i != cont0.end(); ++i) {
        std::stringstream ss;
        ss << row << *i;
        combine2(ss.str(), cont_rest...);
    }
}

// The actual function to call.
template<class ...T>
void combine(T...containers) {
    combine2("", containers...);
}

int main() {
    myvec v1 = {"T", "C", "A"}, v2 = {"C", "G", "A"}, v3 = {"C", "G", "T"};

    combine(v1);
    combine(v1, v2);
    combine(v1, v2, v3);

    // Or even...
    std::vector<std::string> v4 = {"T", "C", "A"};
    std::vector<char> v5 = {'C', 'G', 'A'};
    std::vector<int> v6 = {1 ,2 ,3};

    combine(v4);
    combine(v4, v5);
    combine(v4, v5, v6);

    return 0;
}
Alex B
A: 

Use next_permutation function implemented in std of stl

Vivek
+1  A: 

Since you seem to want each output be the length of individual vectors, and you seem to know that each vector is 3 elements wide from

#Note also that the member of each vector is always 3.

using recursion for a general solution seems a bit overkill here.

You may use something like that :

typedef boost::array<std::string, 3> StrVec;
// basically your hardcoded version corrected (Vec2[j] not [i])
void printCombinations(const StrVec &Vec1,
                       const StrVec &Vec2,
                       const StrVec &Vec3) {
    for (int i=0; i<Vec1.size(); i++) {
        for (int j=0; j<Vec2.size(); j++) {
            for (int k=0; k<Vec3.size(); k++) {
                std::cout << Vec1[i] << Vec2[j] << Vec3[k] << std::endl;
            }
        }
    }
}

void foo() {
    typedef std::vector<StrVec> StrVecLvl2;
    StrVecLvl2 vecs;

    // do whatever with it ...

    // iterate with index instead of iterator only to shorten the code
    for (int i = 0; i < vecs.size(); ++i) {
        for (int j = i+1; j < vecs.size(); ++j) {
            for (int k = j+1; k < vecs.size(); ++k) {
                printCombinations(vecs[i], vecs[j], vecs[k]);
            }
        }
    }
}
Aszarsha
+5  A: 

You can implement this like an odometer, which leads to the following (works for different-sized vectors):

Say you have K vectors in an array v: v[0], v[1], ... v[K-1]

Keep an array of iterators it (size K) into your vectors, starting with it[i] = v[i].begin(). Keep incrementing it[K-1] in a loop. When any iterator hits the end() of the corresponding vector, you wrap it around to begin() and increment the previous iterator also (so when it[K-1] wraps around, you increment it[K-2]). These increments may "cascade" so you should do them in a loop backwards. When it[0] wraps around, you're done (so your loop condition could be something like while (it[0] != v[0].end())

Putting all that together, the loop that does the work (after setting up the iterators) should be something like:

while (it[0] != v[0].end()) {
  // process the pointed-to elements

  // the following increments the "odometer" by 1
  ++it[K-1];
  for (int i = K-1; (i > 0) && (it[i] == v[i].end()); --i) {
    it[i] = v[i].begin();
    ++it[i-1];
    }
  }

If you're interested in complexity, the number of iterator increments that get performed is easy to calculate. For simplicity here I'll assume each vector is the same length N. The total number of combinations is NK. The last iterator gets incremented each time, so that is NK, and moving back through the iterators this count gets divided by N each time, so we have NK + NK-1 + ... N1; this sum equals N(NK - 1)/(N-1) = O(NK). This also means that the amortized cost per-combination is O(1).

Anyway, in short, treat it like an odometer spinning its digit wheels.

Sumudu Fernando
I'm glad to see the non recursive solution.
BCS