ansaurus

Question

Number of simple mutations to change one string to another?

Answer 1

+1 A:

If you have a reasonably sized (small) dictionary, a breadth first tree search might work.

Start with all the words your word can mutate into, then all the words those can mutate into (except the original), then go down to the third level, and so on until you find the word you are looking for.

You could eliminate divergent words (ones that move further away from the target), but doing so might cause the search to fail in cases where the shortest path has to pass through some divergent state.

Bill K 2010-05-13 23:40:29
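
A minimal sketch of the breadth-first search described in this answer, in Python. It assumes that a "simple mutation" means inserting, deleting, or substituting a single lowercase letter, and that every intermediate word (including the target) must appear in the dictionary. The names `neighbors` and `bfs_steps` are made up for illustration.

```python
from collections import deque
import string


def neighbors(word, dictionary):
    """Yield dictionary words exactly one single-letter edit away from word."""
    seen = set()
    letters = string.ascii_lowercase  # assumption: lowercase ASCII words only
    for i in range(len(word) + 1):
        # Insert a letter at position i.
        candidates = [word[:i] + c + word[i:] for c in letters]
        if i < len(word):
            # Delete the letter at position i.
            candidates.append(word[:i] + word[i + 1:])
            # Substitute the letter at position i.
            candidates += [word[:i] + c + word[i + 1:] for c in letters]
        for cand in candidates:
            if cand != word and cand in dictionary and cand not in seen:
                seen.add(cand)
                yield cand


def bfs_steps(start, goal, dictionary):
    """Return the fewest mutations from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        word, depth = queue.popleft()
        for nxt in neighbors(word, dictionary):
            if nxt == goal:
                return depth + 1
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, depth + 1))
    return None


words = {"cat", "cot", "cog", "dog", "dot"}
print(bfs_steps("cat", "dog", words))  # 3, e.g. cat -> cot -> cog -> dog
```

Because the search expands level by level, the first time the target is reached is guaranteed to be via a shortest path, at the cost of keeping every visited word in memory.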

Well, I have my search algorithm implemented (A*), and it accounts for divergent words pretty well, the same way it can find the best path around a mountain by first moving away from the mountain and going around, instead of always picking the closest point. It has a neat priority system, but all of it relies on a reliable minimum-distance heuristic. In pathfinding, that's a straight line, ignoring all obstacles. What I'm after is the linguistic equivalent.

Justin L. 2010-05-14 07:13:56

In that case I don't know of any way except trying every path and finding the shortest. Given the two words and taking your first step (including your algorithm), how many words would you expect to have to check branching off the first word? If it's just 10 or so, you could probably just do a breadth-first search of the entire tree. If it's much more, you might have to go depth-first until you hit a depth of 3 or so, then do a breadth-first search from that node, just to stay within memory constraints. I think chess programs do this, but they are also good at throwing away bad paths.

Bill K 2010-05-14 16:29:20

Answer 2

+3 A:

You want the minimum edit distance (or Levenshtein distance):

The Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character. It is named after Vladimir Levenshtein, who considered this distance in 1965.

And one algorithm to determine the editing sequence is on the same page here.

MSN 2010-05-13 23:40:33
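
For reference, a minimal sketch of the standard dynamic-programming computation of the Levenshtein distance as defined in the quote above (unit cost for each insertion, deletion, or substitution). The function name `levenshtein` is chosen here for illustration, and this version returns only the distance, not the editing sequence.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; only two rows are kept at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The running time is proportional to `len(a) * len(b)`, which is cheap for dictionary-length words.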

That may not apply, since he is using English-only words.

Bill K 2010-05-13 23:43:20

Actually, this is exactly what I'm looking for; I'm looking for a shortest-distance heuristic that doesn't bother with the dictionary. Thanks =)

Justin L. 2010-05-14 07:10:49

Bear in mind that if you're trying to find the shortest path via valid words, the Levenshtein distance only provides a lower bound. The option with the lowest Levenshtein distance could actually be further from the destination than one with a higher distance, because every intermediate step has to be a valid word.

Nick Johnson 2010-05-16 00:58:48

I'm trying to implement an A* pathfinding algorithm to find the shortest path; the implementation requires a lower-bound heuristic to assist in calculations.

Justin L. 2010-05-20 09:18:44
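
A rough sketch of how the Levenshtein distance could serve as the lower-bound heuristic in an A* search over valid words. This is not the commenter's implementation. It assumes each step is one single-character edit costing 1, that the dictionary is small enough to scan on every expansion, and that the target is itself in the dictionary. The names `levenshtein` and `a_star_steps` are hypothetical.

```python
import heapq


def levenshtein(a, b):
    """Standard two-row dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def a_star_steps(start, goal, dictionary):
    """Return the fewest valid-word mutations from start to goal, or None."""
    # Heap entries are (f, g, word) with f = g + h and h = levenshtein(word, goal).
    open_heap = [(levenshtein(start, goal), 0, start)]
    best_g = {start: 0}
    while open_heap:
        _, g, word = heapq.heappop(open_heap)
        if word == goal:
            return g
        if g > best_g.get(word, float("inf")):
            continue  # stale entry: a cheaper route to this word was found
        for nxt in dictionary:
            if levenshtein(word, nxt) != 1:
                continue  # not a single-edit neighbor
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    open_heap, (new_g + levenshtein(nxt, goal), new_g, nxt))
    return None


words = {"cat", "cot", "cog", "dog", "dot", "cab"}
print(a_star_steps("cat", "dog", words))  # 3
```

The heuristic never overestimates here, because a path through valid words is also a sequence of single-character edits, so it cannot be shorter than the unrestricted edit distance.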

Answer 3

+1 A:

An excellent reference on "Edit distance" is section 6.3 of the Algorithms textbook by S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, a draft of which is available freely here.

Dijkstra 2010-05-15 22:34:03

Thanks for the link to the textbook; it will come in quite handy =)

Justin L. 2010-06-09 04:53:44
