views:

227

answers:

4

There is a picture book with 100 pages. If dice are rolled randomly to select one of the pages and subsequently rerolled in order to search for a certain picture in the book -- how do I determine the best, worst and average case complexity of this problem?

Proposed answer:

best case: picture is found on the first dice roll

worst case: picture is found on 100th dice roll or picture does not exist

average case: picture is found after 50 dice rolls (= 100 / 2)

Assumption: incorrect pictures are searched at most one time

+2  A: 

To analyze this, think about what the best, worst and average cases actually are. You need to answer three questions to find those three cases:

  1. What is the fewest number of rolls to find the desired page?
  2. What is the largest number of rolls to find the desired page?
  3. What is the average number of rolls to find the desired page?

Once you find the first two, the third should be less tricky. If you need asymptotic notation as opposed to just the number of rolls, think about how the answers to each question change if you change the number of pages in the book (e.g. 200 pages vs 100 pages vs 50 pages).

Eric Perko
+2  A: 

The worst case is not the page found after 100 dice rolls. That would be is your dice always returned different numbers. The worst case is that you never find the page (the way you stated the problem).

The average case is not average of the best and worst cases, fortunately.

The average case is:

  1 * (probability of finding page on the first dice roll)
+ 2 * (probability of finding page on the second dice roll)
+ ...

And yes, the sum is infinite, since in thinking about the worst case we determined that you may have an arbitrarily large number of dice rolls. It doesn't mean that it can't be computed (it could mean that, but it doesn't have to).

The probability of finding the page on the first try is 1/100. What's the probability of finding it on the second dice roll?

Pascal Cuoq
Fortunately, with constant `k` and `r`, the sum over positive integers `i` of `(i * k*(r^i))` converges for `|r| < 1`. So any similar series can be computed. Proof uses the fact that for sufficiently large i, `|((i+1)/i)*r| < 1`.
Steve Jessop
+4  A: 

Given your description of the problem, I don't think your assumption (that incorrect pictures are only "searched" once) sounds right. If you don't make that assumption, then the answer is as shown below. You'll see the answers are somewhat different from what you proposed.

  1. It is possible that you could be successful on the first attempt. Therefore, the first answer is 1.
  2. If you were unlucky, you could keep rolling the wrong number forever. So the second answer is infinity.
  3. The third question is less obvious.

What is the average number of rolls? You need to be familiar with the Geometric Distribution: the number of trials needed to get a single success.

  • Define p as the probability for a successful trial; p=0.01.
  • Let Pr(x = k) be the probability that the first successful trial being the k th. Then we're going to have to have (k-1) failures and one success. So Pr(x=k) = (1-p)^(k-1) * p. Verify that this is the "probability mass function" on the wiki page (left column).
  • The mean of the geometric distribution is 1/p, which is therefore 100. This is the average number of rolls required to find the specific picture.

(Note: We need to consider 1 as the lowest possible value, rather than 0, so use the left hand column of the table on the Wikipedia page.)

John
+1  A: 

You're almost there, but (1 + 2 + ... + 100)/100 isn't 50.

It might help to observe that your random selection method is equivalent to randomly shuffling the whole deck and then searching it in order for your target. Each position is equally likely, so the average is straightforward to calculate. Except of course that you aren't doing all that work up front, just as much as is needed to generate each random number and access the corresponding element.

Note that if your book were stored as a linked list, then the cost of moving from each randomly-selected page to the next selection depends on how far apart they are, which will complicate the analysis quite a lot. You don't actually state that you have constant-time access, and it's possibly debatable whether a "real book" provides that or not.

For that matter, there's more than one way to choose random numbers without repeats, and not all of them have the same running time.

So, you'd need more detail in order to analyse the algorithm in terms of anything other than "number of pages visited".

Steve Jessop