views:

82

answers:

1

I recently found a "DNA Computing" Algorithm (not genetic programming or genetic algorithms) that attempts to find the Hamiltonian Path in a graph, but I'm a little confused by the pseudo code... note that the notation is a little messed up because I copied it from a PDF paper on DNA computing:

Input: for each node v and edge (u; v),
  Tv contains Sv (and Sv)
  T0uv contains Suv (and Suv)
Mix(fTi; T0 uvg,T)
Remove(T,T0,fSfg)
Remove(T0,T0,fStg)
Move length 20n+10 strings from T0 to T00
if Detect(T0')
then return ``Yes''
else return ``No''

For pseudo code with better notation and background information please see the paper. Could somebody translate it into better pseudo code? I want to try this algorithm, but I don't get what they're doing with Mix -> Remove -> Remove then Move (20n+10)... any hints?

P.S. This algorithm is O(n^2) so I'm suspecting it may be a version similar to an already existing algorithm.

+1  A: 

As surmised in some of the comments, the algorithm is intended to be carried out on test tubes rather than silicon.

Input: for each node v and edge (u; v),
  Tv contains Sv (and Sv)
  T0uv contains Suv (and Suv)

So we initialize with a test tube for each node in the graph, containing a DNA fragment that represents that node, and a test tube for each edge in the graph. The DNA sequences representing nodes are randomly generated. We don't want them to be able to anneal (stick) to each other. The DNA sequences representing the edges are designed so that they overlap with those of the nodes that they connect to. Which means that the node DNA will stick to edge DNA if that node is connected to the edge. (This is vital point that makes this whole technique work). The DNA sequences need to be amplified (both to have enough DNA, and so that the DNA concentration is proportional to the number of edges between the same two nodes).

Mix(fTi; T0 uvg,T)
Remove(T,T0,fSfg)
Remove(T0,T0,fStg)

Here we put all of the DNA sequences in a tube, and run the Polymerase Chain Reaction. In the PCR, we use primers which represent the start and end nodes for our Hamiltonian path. The result will be strings of DNA which begin with the Start node sequence, and then go edge-node-edge until the finish node is reached.

Move length 20n+10 strings from T0 to T00
if Detect(T0')
then return ``Yes''
else return ``No''

We then run the DNA on a gel, and pick out the DNA sequence with the right length. This should be a Hamiltonian path. One thing that isn't completely clear for me is how they prevent duplicate visits to a vertex, but think that comes naturally out of having the correct concentrations, as described above.

The reason that this is interesting is that the chemistry is essentially running these computations in parallel. All possible paths are checked -- since all compatible combinations of DNA fragments are created -- but the PCR and gel select only the one representing the Hamiltonian Path. And this isn't something you can do on a computer without exponential time.

Rob Lachlan
@Rob, so is the performance O(n^2) when performed with chemicals or is it only O(n^2) when implemented with code? I would think the performance is the same even if it's running in parallel...
Lirik
It's only O(n^2) when done in test tubes. (The limiting factor is the requirement to prepare a sample of each DNA fragment for nodes and edges.) In parallel, in this case means exponentially parallel. That's the magic of the chemistry -- we can, in effect, run 2^n computations in parallel. That's something you can't do with a computer. I should mention that the DNA approach has limits too -- we can only run PCR on up to around 10000 base pairs, I believe.
Rob Lachlan