views:

155

answers:

7

Hi i am trying to calculate the complexity of the following algorithm

private static List<int> GetIndexes(string strippedText, string searchText)
    {
        List<int> count = new List<int>();
        int index = 0;
        while (strippedText.Length >= index && index != -1)
        {
            index = strippedText.IndexOf(searchText.Trim(), index,
                                         StringComparison.OrdinalIgnoreCase);
            if (index != -1)
            {
                count.Add(index);
                index++;
            }
            else continue;
        }
        return count;
    }

I know that the loop has a complexity of O(n) if count increments by 1 on each iteration but, the increments depends from the indexes found. on each iteration it could increment 1 or strippedText.lenght().

Any idea?

Thanks

+4  A: 

Still, the worst case is O(n).

jldupont
Have a look at the Big O page on Wikipedia: http://en.wikipedia.org/wiki/Big_O_notation
jldupont
O(-) will be the worst scenarioand only in the average case will take in consideration others factors ???
From the Wikipedia page: "A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function;"
jldupont
A: 

I would still say that it is O(n) (or at least At worst O(n), unless your index can be made less than -1 say -10, which would reset it further)

But yes O(n).

astander
+1  A: 

O(n) where n is the length of the strippedText string.

Worst case, every character will be equal to the searchText and will result in n itterations, but even if that isn't the case (which I am assuming it won't be) your average case will be some factor, c, of n where c is greater than zero but less than 1, so the number of loops will be cn, but a constant factor of n will still be represented as O(n).

James Conigliaro
Could you explain me > your average case will be some factor, c, of n where c is greater than zero but less than 1, so the number of loops will be cn, but a constant factor of n will still be represented as O(n).i do not understand itthanks
Big O notation deals with growth rates and is used to guage how the runtime of the algorithm changes as the size of your data set changes. By definition O-notation provides an asymptotic upper bounds to within a constant factor. So O(cn) = O(n) regardless of whether c is 0.00001 or c is 100000000. It still grows at the same rate with respect to n, thus when describing the runtime of an algorithm, constant factors are typically thrown out.
James Conigliaro
A: 

Actually it's O(strippedText.Length*searchText.Length) since index of makes a comparison between characters in searchText and strippedText.

RA
+3  A: 

It is still O(n) - this is because it grows asymptotically at the same rate as O(n)

Big O notation is used for the upper-bound of algorithmic growth - that is, it's a function that the algorithm is guaranteed to grow at the same rate as, or slower than.

In the average case your algorithm will grow at rate O(n/m) where m is some estimate of the how dense the indexes are in your strings (0 = no indexes, 1 = index at every character). Assuming that's roughly constant over n you can ignore the m and still say the algorithm is O(n).

The fact that, in the real world, your algorithm will almost certainly be faster than O(n) doesn't stop it begina n O(n) function.


Take a look at this page, particularly:

The symbol O is used to describe an asymptotic upper bound for the magnitude of a function in terms of another, simpler function. This means that for x > k, when x tends to infinity, the value f(x) will always be inferior to C *g(x) (with C a constant).

Mark Pim
m is dependent on `searchText`, so it's not constant. Consider search text with many repetitions (like 'ababab').
Abgan
Yes, but as long as it's not dependent on the *length* of `searchText` then my point stands
Mark Pim
That is, if `m` is basically orthogonal to `n` then `O(n/m) = O(n)`
Mark Pim
+1 for clear explanation of O notation.
Yann Schwartz
A: 

I'd say that it's O(n/m) where strippedText.Length == n and m is the length of a prefix of searchText that is a suffix of searchText (if no such prefix exist it's the length of searchText).

Examples:

searchText = 'ababab' --> m == 2

searchText = 'aaaaaa' --> m == 1

searchText = 'abcabc' --> m == 3

searchText = 'abcdefg' --> m == 7

Abgan
A: 

It all depends on how the strings IndexOf() method is implemented. Basically you just search through the whole string with IndexOf() - how efficient that is depends on the efficiency of that method.

There are several string search algorithms of different complexities ranging from O(m+n) to O(m*n) (m and n being the length of the two strings).

sth