tags:

views:

108

answers:

3

i have one question suppose there is given two String

String s1="  MARTHA  ";
 String s2=" MARHTA":

here we echange position of T and H i am interested to write code which count how many change is necessary to transform from one String to another String

+4  A: 

There are several edit distance algorithms, the given Wikipeida link has links to a few.

Donnie
None of those take only swaps into consideration.
IVlad
A: 

Your problem is not so easy, since before counting the swaps you need to ensure that every swap reduces the "distance" (in equality) between these two strings. Then actually you look for the count but you should look for the smallest count (or at least I suppose), otherwise there exists infinite ways to swap a string to obtain another one.

You should first check which charaters are already in place, then for every character that is not look if there is a couple that can be swapped so that the next distance between strings is reduced. Then iterate over until you finish the process.

If you don't want to effectively do it but just count the number of swaps use a bit array in which you have 1 for every well-placed character and 0 otherwise. You will finish when every bit is 1.

Jack
And this ensures the minimum number of swaps how? You could end up in an infinite loop if you just blindly swap elements, or at least in a dead end that doesn't finish converting the string.
IVlad
The iteration constraint is to reduce the distance between the strings. How can you end up in a infinite loop if you ensure that every step reduces the distance? It can stuck, assuring that the two strings are not "swap equal" but it does guarantee not to loop just without doing anything.The approach is called _greedy_ that ensures that, if local optimum is kept (by reducing distance ad every iteration), then global optimum is a direct consequence.
Jack
Then I'm assuming we're talking about swaps of two items `i` and `j` where there is no constraints like `i = j+1` or viceversa. Also because the OP didn't say adjacent swaps but just swaps..
Jack
@Jack - yes, it's called greedy, and you must prove that your greedy works, because greedy algorithms usually only work for a very narrow and specific type of problems. You won't end up in an infinite loop, but you might get stuck when the two are actually "swap equal" and if not you might not find the best solution. Basically, you need to prove that your algorithm is correct.
IVlad
As I pointed out in my response, if we assume that there are no repeating digits, then this can be solved in linear time. I would like to hear your opinions on how I can improve it to cover the case of non-unique letters as well.
Eyal Schneider
A: 

Assuming that the distance counts only swaps, here is an idea based on permutations, that runs in linear time.

The first step of the algorithm is ensuring that the two strings are really equivalent in their character contents. This can be done in linear time using a hash table (or a fixed array that covers all the alphabet). If they are not, then s2 can't be considered a permutation of s1, and the "swap count" is irrelevant.

The second step counts the minimum number of swaps required to transform s2 to s1. This can be done by inspecting the permutation p that corresponds to the transformation from s1 to s2. For example, if s1="abcde" and s2="badce", then p=(2,1,4,3,5), meaning that position 1 contains element #2, position 2 contains element #1, etc. This permutation can be broke up into permutation cycles in linear time. The cycles in the example are (2,1) (4,3) and (5). The minimum swap count is the total count of the swaps required per cycle. A cycle of length k requires k-1 swaps in order to "fix it". Therefore, The number of swaps is N-C, where N is the string length and C is the number of cycles. In our example, the result is 2 (swap 1,2 and then 3,4).

Now, there are two problems here, and I think I'm too tired to solve them right now :)

1) My solution assumes that no character is repeated, which is not always the case. Some adjustment is needed to calculate the swap count correctly.

2) My formula #MinSwaps=N-C needs a proof... I didn't find it in the web.

Eyal Schneider