views:

138

answers:

3

I have this method:

public static int what(String str, char start, char end)
{
    int count=0;
    for(int i=0;i<str.length(); i++) {
        if(str.charAt(i) == start)
        {
            for(int j=i+1;j<str.length(); j++)
            {
                if(str.charAt(j) == end)
                    count++;
            }
        }
    }
    return count;
}

What I need to find is:

1) What is it doing? Answer: counting the total number of end occurrences after EACH (or is it? Not specified in the assignment, point 3 depends on this) start.

2) What is its complexity? Answer: the first loops iterates over the string completely, so it's at least O(n), the second loop executes only if start char is found and even then partially (index at which start was found + 1). Although, big O is all about worst case no? So in the worst case, start is the 1st char & the inner iteration iterates over the string n-1 times, the -1 is a constant so it's n. But, the inner loop won't be executed every outer iteration pass, statistically, but since big O is about worst case, is it correct to say the complexity of it is O(n^2)? Ignoring any constants and the fact that in 99.99% of times the inner loop won't execute every outer loop pass.

3) Rewrite it so that complexity is lower.
What I'm not sure of is whether start occurs at most once or more, if once at most, then method can be rewritten using one loop (having a flag indicating whether start has been encountered and from there on incrementing count at each end occurrence), yielding a complexity of O(n).

In case though, that start can appear multiple times, which most likely it is, because assignment is of a Java course and I don't think they would make such ambiguity.
Solving, in this case, is not possible using one loop... WAIT! Yes it is..!
Just have a variable, say, inc to be incremented each time start is encountered & used to increment count each time end is encountered after the 1st start was found:

inc = 0, count = 0
if (current char == start) inc++
if (inc > 0 && current char == end) count += inc

This would also yield a complexity of O(n)? Because there is only 1 loop.

Yes I realize I wrote a lot hehe, but what I also realized is that I understand a lot better by forming my thoughts into words...

+1  A: 
Matthew Flaschen
Actually, I think you need to look ahead one position for the end character as in the original it doesn't increment the count for the start character when start == end. Consider the case when the string is "s" and start = "s" and end = "s". If you don't look ahead, your get a count of 1 - you erroneously count the start character because it is the same as the end character.
tvanfosson
Good point. That can be fixed by switching the order of the ifs.
Matthew Flaschen
As I understood it, you begin counting occurrences of end only after the 1st start has been spotted, and then incrementing the amount added to count for every other start, or am I wrong? This is the reason I put the check inc > 0, because if inc is 0, that means no start has been spotted yet.
timeNomad
@time, yes, but if inc is 0, there is no harm in adding it.
Matthew Flaschen
That's very stupid of me.. =]
timeNomad
A: 

1) Yes.
2) Yes, the complexity is at best O(n) and at worst O(n^2).
3) Almost right. You need to check for the end character before the start character, otherwise the result is not always correct. Example in C#:

public static int what(string str, char start, char end) {
  int count = 0, mul = 0;
  foreach (char c in str) {
    if (c == end) count += mul;
    if (c == start) mul++;
  }
  return count;
}
Guffa
A: 
  • As already pointed out by Matthew, the complexity of original method is O(n2)
  • Your second approach is not correct, original method seems to be counting all substrings starting with start and ending with end, which can also can contain either start or end.
  • It is possible to do this in O(n), just find all starts, all ends, and then iterate over two lists moving pointers appropriately.
vartec