Dear all,

I have an array of strings.

Some of the strings are similar (for example, person is similar to twolegperson, and animal is similar to animalgold).

I want to find the strings that are repeated more than once (here, person and animal).

Thank you very much, faty

A: 

I have this problem: I need to find which strings are repeated more than once. In the example (person is similar to twolegperson, animal is similar to animalgold), both person and twolegperson are wanted, and likewise both animal and animalgold.

fateme.hosseini
A: 

Naive pseudocode algorithm:

int minMatchLen = 3;   // The minimum length of string match required
string stringArray[] = {"person", "twolegperson", "animal", "animalgold"};
for (i = 0; i < stringArray.length; i++) {
    int strLen = stringArray[i].length;
    // Use <= so that substrings ending at the last character are included
    for (substrIndex = 0; substrIndex <= strLen - minMatchLen; substrIndex++) {
        for (substrLen = minMatchLen; substrLen <= strLen - substrIndex; substrLen++) {
            string subString = stringArray[i].substr(substrIndex, substrLen);
            bool matchFound = false;
            // Search only the strings after i, so each pair is compared once
            for (j = i + 1; j < stringArray.length; j++) {
                if (stringArray[j].contains(subString)) {
                    print("String '" + subString + "' found in '" + stringArray[j] + "'");
                    matchFound = true;
                }
            }
            if (matchFound) print("String '" + subString + "' found in '" + stringArray[i] + "'");
        }
    }
}

This basically goes through each string in the array, extracts all possible substrings of at least a specified minimum length, and then searches the strings in the remainder of the array for those substrings. I'm sure there are more elegant and efficient solutions, but this will get the job done. It'll probably be slow for a large array, though.
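For anyone who wants something runnable, here is a rough Python sketch of the same brute-force idea. The function names `shared_substrings` and `similar_strings` and the `min_match_len` parameter are my own choices for illustration, and instead of printing each match it collects them in a dictionary, which is a small departure from the pseudocode.

```python
from collections import defaultdict


def shared_substrings(strings, min_match_len=3):
    """Map each substring of at least min_match_len characters to the set
    of input strings that contain it, keeping only substrings found in
    two or more of the strings."""
    found = defaultdict(set)
    for i, source in enumerate(strings):
        # Every start position that still leaves room for min_match_len chars.
        for start in range(len(source) - min_match_len + 1):
            # Every end position, up to and including the end of the string.
            for end in range(start + min_match_len, len(source) + 1):
                piece = source[start:end]
                # Only look at later strings, so each pair is compared once.
                for other in strings[i + 1:]:
                    if piece in other:
                        found[piece].update((source, other))
    return dict(found)


def similar_strings(strings, min_match_len=3):
    """Return, sorted, every string that shares a substring with another."""
    matches = shared_substrings(strings, min_match_len)
    return sorted(set().union(*matches.values()))


words = ["person", "twolegperson", "animal", "animalgold"]
print(similar_strings(words))
# prints ['animal', 'animalgold', 'person', 'twolegperson']
```

Like the pseudocode, this compares every substring against every later string, so expect it to slow down quickly as the array or the strings get larger.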

Andrew Cooper
+1  A: 

You need a generalized suffix tree. For implementations, see this question.

Andrei