views:

129

answers:

3

Suppose I have this program, I want to compare 2 input lists. Assume array A and array B. How do I determine the best case and worst case of the function?

Here is my code in [php]:

foreach($array_1 as $k){
    if(!in_array($k, $array_2)){
        array_push($array_2, $k);
    }
}   

What is the best case and worst case of the for loop? Please include some explaination, thank you :)

EDITED:

Since my goal is to compare 2 lists that have at lists 1 element in common. I think my above code is wrong. Here is the updated of my code

foreach($array_1 as $k){
    if(in_array($k, $array_2)){
        array_push($array_3, $k);
    }
}

And I guess it would be:

Best case: O(n)

Worst case: O(N*M)

+1  A: 

Let's do a quick analysis then:

foreach($array_1 as $k)

means that the operation within will be repeated for each element of the array. Let denote the size of the array by N.

The operation within:

if (!in_array($k, $array_2)) {
  array_push($array_2, $k);
}

There are 2 operations here:

  • in_array
  • array_push

array_push is likely to be constant, thus O(1), while in_array is more likely a linear search in array_2 which will take either 1 operation (found as the first element) up to the length of array_2 operations.

Note that in_array represent the only variable here:

  • best case: in_array returns at the first comparison --> all elements of array_1 are the same, and either array_2 was empty or they are equal to its first element. Complexity is O(N) since we have N elements in array_1
  • worst case: each time we examine each element of array_2 --> all elements of array_1 are distinct and they are distinct from the previous elements of array_2. If M is the length of array_2 when it is inputed, then the complexity is along the line of O(N * (N+M) ), (N+M)/2 being the mean time for searching in array_2 as it's growing from M to M+N elements and the constant 2 being discarded in the O notation

Hope this helps.

Matthieu M.
So, the best case is n, worst case big o O(N * (N+M))?
college_stud
Yes, that's exactly it. If `array_2` is empty to begin with, the worst cast is simplified to `O(N^2)`.
Matthieu M.
Oh i see.. Thank you Matthiu, Dr Math :) I just cant figure out the worst case hihi..
college_stud
@Dr Math: I think my code is wrong. I re-wrote the code. Based on your explanation, i think the best and worst case still the same huh? Am I correct?
college_stud
@Matthiue.. can help me clarify? I am trying hard to understand this concept. Please?
college_stud
Actually it's slightly better in the worst case (I edited your post to format the code and adjusted the worst case). Since `array_2` does not grow any longer, the cost of looking up in `array_2` is now fix in each iteration of the loop: `O(M)`. Thus the worst case is now `O(N*M)`.
Matthieu M.
It's been a little while since I've worked with big O notation, but IIRC, `O(N * (N + M)) == O(N^2)` -- that is, you don't say `O(N*(N+M))`, it doesn't exist.
nickf
The big O notation express complexity for asymptotic behavior depending on input characteristics. If you have two uncorrelated inputs, then it seems logical that big O will be expressed in terms of those two inputs.
Matthieu M.
A: 

Generally with such a problem I just look at the algorithm as Dr. Evil and ask, "How can I make this take the most time possible?"

T.E.D.
Do you have better algorithm for this? Okay, my goal is compare 2 lists that have at least 1 element in common. Any idea Dr Evil :) :)
college_stud
Well, its tough to say without knowing more information than you provided. From what I see, Dr. Evil would try to feed you input where every single input is in `array_2`. If Dr. Evil had a chance to know how `in_array` and `array_push` are implemented, he could do even more evil to you.
T.E.D.
A: 

Big O notation is all about approximations. It makes it easy to compare algorithms.

If you imagine your array of elements, a search might be order N (you must look at each item to find the item you want), it might be order Log(N) if you have an ordered collection or it could even be order 1 depending on your collection type.

The important thing here is to look at your algorithm and determine what the key operations are that are repeated.

Foreach is clearly an order N operation, by definition you must operate on each element in your list. O(N)

Next is your if InArray 2. This sounds like a search over an array, which would most likely be unordered so it would be order N (linear search). So your complexity would now be O(N * M). (for each n elements in array 1, perform a search of order N complexity over array 2).

Finally you have an array push. I don't know your environment but this could be order 1 or order N if the array needs to be reallocated and copied in order to grow. Lets assume order 1 to keep it simple. Therefore your complexity in Big O is O(N*M).

So now best case is for each element to find it's counterpart on the first try and perform the array push, which would be O(N * 1 * 1) = O(N).

Worst case is that the each element cannot be found in the second list forcing the full search of all elements in array 2. Therefore complexity is O(N * M).

Your teachers want to understand your thinking so show them your assumptions made. I highly recommend that you read the exact question and information you have been given before relying on the assumptions given here, you may have been told the language/platform which would tell you the exact penalty and algorithms used in each case. Hope that helps :)

Spence