views:

154

answers:

1

Hi,

I have been trying to wrap my head around this for a while now but have not been able to come up with a good solution. Here goes:

Given a number of sets:

set1: A, T
set2: C
set3: A, C, G
set4: T
set5: G

I want to generate all possible sequences from a list of sets. In this example the length of the sequence is 5, but it can be any length up to around 20. For position 1 the possible candidates are 'A' and 'T' respectively, for position 2 the only option is 'C' and so on.

The answer for the example above would be:

ACATG, ACCTG, ACGTG, TCATG, TCCTG, TCGTG

I am doing this in ruby and I have the different sets as arrays within a master array:

[[A, T], [C], [A, C, G], [T], [G]]

At first I thought a recursive solution would be best but I was unable figure out how to set it up properly.

My second idea was to create another array of the same size with an index for each set. So 00000 would correspond to the first sequence above 'ACATG' and 10200 would correspond to 'TCGTG'. Beginning with 00000 I would increase the last index by one and modulo it with the length of the set in question (2 for set1, 1 for set2 above) and if the counter wrapped around I would zero it and increase the previous one by one.

But the more I thought about this solution it seemed too complex for this very small problem. There must be a more straight-forward solution that I am missing. Could anyone help me out?

/Nick

+3  A: 

The Array class in Ruby 1.8.7 has an Array#product method, which returns the cartesian product of the arrays in question.

irb(main):001:0> ['A', 'T'].product(['C'], ['A', 'C', 'G'], ['T'], ['G'])
=> [["A", "C", "A", "T", "G"], ["A", "C", "C", "T", "G"], ["A", "C", "G", "T", "G"],     ["T", "C", "A", "T", "G"], ["T", "C", "C", "T", "G"], ["T", "C", "G", "T", "G"]]
sluukkonen
If you put four spaces before the code, it gets formatted as code.
Martinho Fernandes
Yeah, I just noticed. Didn't bother to RTFM. :)
sluukkonen
I found an alternative package which does the same thing too: http://rubyforge.org/projects/cartesian
reprazent74