This is a good one because it's so counter-intuitive:

Imagine an urn filled with balls, two-thirds of which are of one color and one-third of which are of another. One individual has drawn 5 balls from the urn and found that 4 are red and 1 is white. Another individual has drawn 20 balls and found that 12 are red and 8 are white. Which of the two individuals should feel more confident that the urn contains two-thirds red balls and one-third white balls, rather than vice-versa? What odds should each individual give?

I know the right answer, but maybe I don't quite get the odds calculation. Can anyone explain?

+3  A: 

I assume that the a priori probability of each hypothesis is 1/2, and moreover that both individuals reinsert each ball after extracting it (so the extractions are independent of each other).

The correct answer is that the second observer should be more confident than the first. My previous answer was wrong because of a trivial computational error; many thanks and +1 to Adam Rosenfield for his correction.

Let 2/3R 1/3W denote the event "the urn contains 2/3 of red balls and 1/3 white balls", and let 4R,1W denote the event "4 red balls and 1 white ball get extracted". Then, using Bayes's rule,

P[2/3R 1/3W | 4R,1W] = P[4R,1W | 2/3R 1/3W] P[2/3R 1/3W] / P[4R,1W] = (2/3)^4 (1/3)^1 (1/2) / P[4R,1W]

Now, since 2/3R 1/3W and 1/3R 2/3W are complementary by hypothesis,

P[4R,1W] = P[4R,1W | 2/3R 1/3W] P[2/3R 1/3W] + P[4R,1W | 1/3R 2/3W] P[1/3R 2/3W] = (2/3)^4 (1/3)^1 (1/2) + (1/3)^4 (2/3)^1 (1/2)

Thus,

P[2/3R 1/3W | 4R,1W] = (2/3)^4 (1/3)^1 (1/2) / { (2/3)^4 (1/3)^1 (1/2) + (1/3)^4 (2/3)^1 (1/2) } = 2^4 / (2^4 + 2) = 8/9

The same calculation for P[2/3R 1/3W | 12R,8W] (i.e. having (2/3)^12 (1/3)^8 instead of (2/3)^4 (1/3)^1) now yields 16/17, hence the confidence of the second observer is greater than that of the first.
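As a sanity check on the arithmetic, here is a minimal Python sketch of the same calculation with exact fractions. The function name `posterior_red` and its parameters are hypothetical, chosen only for illustration, and it assumes the 1/2 prior and draws with replacement stated above.

```python
from fractions import Fraction

def posterior_red(red, white, prior=Fraction(1, 2)):
    """Posterior probability that the urn is 2/3 red, given the draws.

    Assumes independent draws (with replacement) and two hypotheses only:
    "2/3 red, 1/3 white" versus "1/3 red, 2/3 white".
    """
    two_thirds, one_third = Fraction(2, 3), Fraction(1, 3)
    # P[data | 2/3R 1/3W], ignoring the binomial coefficient (it cancels).
    like_a = two_thirds**red * one_third**white
    # P[data | 1/3R 2/3W]
    like_not_a = one_third**red * two_thirds**white
    return like_a * prior / (like_a * prior + like_not_a * (1 - prior))

print(posterior_red(4, 1))    # 8/9
print(posterior_red(12, 8))   # 16/17
```

Working in `Fraction` rather than floats keeps every intermediate value exact, so a slip like turning 2^1 into 1 shows up immediately.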

Federico Ramponi
re: the reinsertion -- not necessary if the # of balls is large (probably an equally valid assumption)
Jason S
shouldn't P[4R, 1W | 2/3R 1/3W] = (2/3)^4 * (1/3)^1 * (5 choose 4)? Also, I'm not sure how you came up with a 50% a priori distribution
FryGuy
@FryGuy the 50% (or any other known number!) a priori is a necessary precondition for making a decision... If I tell you a priori "100% sure that there are 2/3 red balls" then the problem is trivial, both people can be equally confident... too much data is missing here, I think
Daniel Daranas
Check your arithmetic - your reasoning is sound, but if you plug in your numbers you should get 8/9 for the first observer and 16/17 for the second observer.
Adam Rosenfield
@Adam Rosenfield: AAARGH! there is a 2^1 that magically becomes 1. Correcting in a minute. Thank you very much!
Federico Ramponi
+2  A: 

P[2/3R 1/3W | 4R, 1W] = (2/3)^4 * (1/3)^1 * (1/2) / { (2/3)^4 * (1/3)^1 * (1/2) + (1/3)^4 * (2/3)^1 * (1/2) } = 2^4 / (2^4 + 1) = 16/17

er,

= ⅔^4*⅓ / (⅔^4*⅓ + ⅓^4*⅔)
= 16/243 / (16/243 + 2/243)
= 16/18

P(⅔R⅓W | 12R8W) does indeed however = 16/17, so the 12R8W can be more confident.

bobince
if that is the case, then how is this problem counter-intuitive? more sampling = more confidence, especially when your sample agrees with what you expect
yx
btw, my comment was more directed at the "This is a good one because it's so counter-intuitive:" line the topic creator said.
yx
I don't see how anyone should "intuite" _anything_ from the statement of the problem. One has taken more balls, the other has a stronger red percentage, so both have arguments in their favour of being more confident. You have to calculate and find the result, you can't guess anything.
Daniel Daranas
Yeah, I dunno, unless there's another sneaky arithmetic error caused by my gin intake. I would have guessed 12R8W to be more likely, although I'd not have been at all sure about it...
bobince
@Daneil Daranas: Your comments on the "prime factor of 3*10^11" question were hilarious. Unfortunately, this problem requires *no* calculation and is easy if you know the theory. You're right it's a poor programming question, but it isn't "too long and tedious" and you *can* intuit the answer.
A. Rex
@A. Rex Who is right then, Federico (same probability) or bobince (more one than the other)? And what is the "reasoning" without any calculation?
Daniel Daranas
@Daniel Daranas (pardon my misspelling last time): bobince and Adam Rosenfield are correct because their arithmetic is correct. Please see my explanation for reasoning without calculation, as well as calculation without mistakes.
A. Rex
@bobince: (2/3)^4 * (1/3)^1 * (1/2) / { (2/3)^4 * (1/3)^1 * (1/2) + (1/3)^4 * (2/3)^1 * (1/2) }, you can simplify 1/2 and (1/3)^5 in both numerator and denominator, you are left with 2^4/(2^4 + 2) = 8/9. The (2^4 + 1) was a mistake while I was typing my answer, sorry for that.
Federico Ramponi
Indeed, correct - I left the denominator as it was just to make clear that 16/17>16/18.
bobince
@yx: I suppose that the counter-intuition stems from the fact that the first observer has a sample which is "more biased" toward the 2/3-red-hypothesis (he has a larger fraction of red balls). But remember that the second has a **larger sample**, and both these facts must be taken into account.
Federico Ramponi
What I said. There are two conflicting "reasons for confidence", one for each person, and you can't guess _anything_. Just do the math, either in the normal way or with @A. Rex's interesting shorthand method.
Daniel Daranas
+11  A: 

Eliezer Yudkowsky has a (really, really long, but good) explanation of Bayes' Theorem. About 70% down, there's a paragraph beginning "In front of you is a bookbag" which explains the core of this problem.

The punchline is that all that matters is the difference between how many red and white balls have been drawn. Thus, contrary to what others have been saying, you don't have to do any calculations. (This assumes either of two reasonable conditions: (a) the balls are drawn with replacement, or (b) the urn has a lot of balls. In either case the number of balls in the urn doesn't matter.) Here's the argument:

Recall Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B). (A note on terminology: P(A) is the prior and P(A|B) is the posterior. B is some observation you made, and the terminology reflects your confidence before and after your observation.) This form of the theorem is fine, and @bobince and @Adam Rosenfield correctly applied it. However, using this form directly makes you susceptible to arithmetic errors and it doesn't really convey the heart of Bayes' theorem. Adam mentioned in his post (and I mention above) that all that matters is the difference between how many red and white balls have been drawn, because "everything else cancels out in the equations". How can we see this without doing any calculations?

We can use the concepts of odds ratio and likelihood ratio. What is an odds ratio? Well, instead of thinking about P(A) and P(¬A), we will think about their ratio P(A) : P(¬A). Either is recoverable from the other, but the arithmetic works out nicer with odds ratios because we don't have to normalize. Furthermore, it's easier to "get" Bayes' theorem in its alternate form.

What do I mean we don't have to normalize, and what is the alternate form? Well, let's compute. Bayes' theorem says that the posterior odds are

P(A|B) : P(¬A|B) = (P(B|A) * P(A) / P(B)) : (P(B|¬A) * P(¬A) / P(B)).

The P(B) is a normalizing factor to make the probabilities sum to one; however, we're working with ratios, where 2 : 1 and 4 : 2 odds are the same thing, so the P(B) cancels. We're left with an easy expression which happens to factor:

P(A|B) : P(¬A|B) = (P(B|A) * P(A)) : (P(B|¬A) * P(¬A)) = (P(B|A) : P(B|¬A)) * (P(A) : P(¬A))

We've already heard of the second term there; it's the prior odds ratio. What is P(B|A) : P(B|¬A)? That's called the likelihood ratio. So our final expression is

**posterior odds = likelihood ratio * prior odds.**

How do we apply it in this situation? Well, suppose we have some prior odds x : y for the contents of the urn, with x representing 2/3rds red and y representing 2/3rds white. Suppose we draw a single red ball. The likelihood ratio is P(drew red ball | urn is 2/3rds red) : P(drew red ball | urn is 2/3rds white) = (2/3) : (1/3) = 2 : 1. So the posterior odds are 2x : y; had we drawn a white ball, the posterior odds would be x : 2y by similar reasoning. Now we do this for every ball in sequence; if the draws are independent, then we just multiply all the odds ratios. So we get that if we start with an odds ratio of x : y and draw r red balls and w white balls, we get a final odds ratio of

(x : y) * (2 : 1)^r * (1 : 2)^w = (x * 2^r) : (y * 2^w) = (x : y) * (2^(r-w) : 1).

so we see that all that matters is the difference between r and w. It also lets us easily solve the problem. For the first question ("who should be more confident?"), the prior odds don't matter, as long as they're not 1 : 0 or 0 : 1 and both people have identical priors. Indeed, if their identical prior was x : y, the first person's posterior would be (2^3 * x) : y, while the second person's posterior would be (2^4 * x) : y, so the second person is more sure.

Suppose moreover that the prior odds were uniform, that is 1 : 1. Then the first person's posterior would be 8 : 1, while the second person's would be 16 : 1. We can easily translate these into probabilities of 8/9 and 16/17, confirming the other calculations.
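The odds calculation is short enough to sketch in a few lines of Python. The names `posterior_odds` and `odds_to_probability` are hypothetical, and the sketch assumes independent draws, so each red ball multiplies the odds by 2 and each white ball divides them by 2.

```python
from fractions import Fraction

def posterior_odds(red, white, prior_odds=Fraction(1)):
    """Odds (2/3 red : 2/3 white) after the draws, as a single ratio.

    Uses posterior odds = prior odds * 2^(red - white).
    """
    return prior_odds * Fraction(2) ** (red - white)

def odds_to_probability(odds):
    """Convert odds a : 1 into the probability a / (a + 1)."""
    return odds / (1 + odds)

first = posterior_odds(4, 1)     # 8, i.e. 8 : 1
second = posterior_odds(12, 8)   # 16, i.e. 16 : 1
print(odds_to_probability(first), odds_to_probability(second))  # 8/9 16/17

# With a different shared prior, say 1 : 3, the ordering is unchanged.
print(posterior_odds(4, 1, Fraction(1, 3)) < posterior_odds(12, 8, Fraction(1, 3)))  # True
```

Only `red - white` enters the formula, which is the whole point: the sample size never appears.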

The point here is that if you get the bolded equation above, then this problem is really easy. But as importantly, you can be sure you didn't mess up any arithmetic, because you have to do so little.

So this is a bad programming question, but it is a good test of the bolded equation. Just for practice, let's apply it to two more problems:

I randomly choose one of two coins, a fair coin or a fake, double-headed coin, each with 50% probability. I flip it three times and it comes up heads all three times. What's the probability it's the real coin?

The prior odds are real : fake = 1 : 1, as stated in the problem. The probability that I would have seen three heads with the real coin is 1 / 8, but it's 1 with the fake coin, so the likelihood ratio is 1 : 8. So the posterior odds are prior * likelihood = (1 : 1) * (1 : 8) = 1 : 8. Thus the probability it's the real coin is 1 / 9.
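A small Python sketch of this coin calculation, generalized to any number of heads in a row. The function name `p_real_after_heads` is hypothetical, and the 1 : 1 default prior is the one given in the problem.

```python
from fractions import Fraction

def p_real_after_heads(n_heads, prior_odds=Fraction(1)):
    """Probability the coin is the real one after n_heads heads in a row."""
    # Likelihood ratio real : fake for a single head is (1/2) : 1.
    likelihood_ratio = Fraction(1, 2) ** n_heads
    posterior_odds = prior_odds * likelihood_ratio   # real : fake
    return posterior_odds / (1 + posterior_odds)

print(p_real_after_heads(3))  # 1/9
```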

This problem also brings up an important caveat: there is a possibly different likelihood ratio for every possible observation. This is because the likelihood ratio for B is P(B|A) : P(B|¬A), which is not necessarily related to the likelihood ratio for ¬B, which is P(¬B|A) : P(¬B|¬A). Unfortunately, in all the examples above, they've been inverses of each other, but here, they're not.

Indeed, suppose I flip the coin once and get tails. What's the probability it's the real coin? Obviously one. How does Bayes' theorem check out? Well, the likelihood ratio for this observation is the probability of seeing this outcome with the real coin versus the fake coin, which is 1/2 : 0 = 1 : 0. That is, seeing a single tails kills the probability of the coin's being fake, which checks out with our intuition.
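Odds of 1 : 0 cannot be stored as a single ratio, so one way to sketch this case in Python is to keep both sides of the odds as a pair. The helper names `update` and `probability_of_first` are hypothetical, and the pair layout (real, fake) is just a convention for this example.

```python
from fractions import Fraction

def update(odds, likelihoods):
    """Multiply odds (real, fake) by (P(obs | real), P(obs | fake))."""
    return (odds[0] * likelihoods[0], odds[1] * likelihoods[1])

def probability_of_first(odds):
    """Probability of the first hypothesis given odds (first, second)."""
    return odds[0] / (odds[0] + odds[1])

prior = (Fraction(1), Fraction(1))       # real : fake = 1 : 1
heads = (Fraction(1, 2), Fraction(1))    # likelihood ratio 1 : 2
tails = (Fraction(1, 2), Fraction(0))    # likelihood ratio 1 : 0

print(probability_of_first(update(prior, heads)))  # 1/3
print(probability_of_first(update(prior, tails)))  # 1
```

The two likelihood pairs make the caveat concrete: heads gives 1 : 2 and tails gives 1 : 0, which are not inverses of each other.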

Here's the problem I mentioned from Eliezer's page:

In front of you is a bookbag containing 1,000 poker chips. I started out with two such bookbags, one containing 700 red and 300 blue chips, the other containing 300 red and 700 blue. I flipped a fair coin to determine which bookbag to use, so your prior probability that the bookbag in front of you is the red bookbag is 50%. Now, you sample randomly, with replacement after each chip. In 12 samples, you get 8 reds and 4 blues. What is the probability that this is the predominantly red bag? (You don't need to be exact - a rough estimate is good enough.)

The prior odds are red : blue = 1 : 1. The likelihood ratios are 7 : 3 and 3 : 7, so the posterior odds are (7 : 3)^8 * (3 : 7)^4 = 7^4 : 3^4. At this point we just estimate 7 : 3 as, say, 2 : 1, and get 2^4 : 1 = 16 : 1. Since 7 : 3 is actually greater than 2 : 1, the true odds are even greater, so the probability is definitely bigger than 16/17, or roughly 94%; the right answer is around 96.7%. Compare this with most people's answers, which are in the 70--80% range.
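For anyone who wants the exact figure rather than the estimate, here is a short Python sketch. The variable names are hypothetical; the numbers (7 : 3 likelihood ratio, 8 reds, 4 blues, 1 : 1 prior) come straight from the problem statement.

```python
from fractions import Fraction

prior_odds = Fraction(1)        # red bag : blue bag = 1 : 1
lr_red = Fraction(7, 3)         # likelihood ratio for drawing a red chip
lr_blue = Fraction(3, 7)        # likelihood ratio for drawing a blue chip

# 8 reds and 4 blues: (7 : 3)^8 * (3 : 7)^4 = 7^4 : 3^4
posterior_odds = prior_odds * lr_red**8 * lr_blue**4
p_red_bag = posterior_odds / (1 + posterior_odds)

print(posterior_odds)           # 2401/81
print(p_red_bag)                # 2401/2482
print(round(float(p_red_bag), 3))  # 0.967
```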

I hope you agree that problems become really easy, and intuitive, when viewed in this light.

A. Rex
PS. I think for the "who should feel more confident" part, it doesn't actually matter if you're drawing with replacement. It does, of course, matter for the probability calculations.
A. Rex
I had to read it a couple times, but I think I get it... :)
Daniel Daranas
Great! Thanks for reading, Daniel.
A. Rex
+9  A: 

Let A be the event that 2/3 of the balls are red, and then ¬A is the event that 2/3 of the balls are white. Let B be the event that the first observer sees 4 red balls out of 5, and let C be the event that the second observer sees 12 red balls out of 20.

Applying some simple combinatorics, we get that

  • P(B|A) = (5 choose 4) (2/3)^4 (1/3)^1 = 80/243
  • P(B|¬A) = (5 choose 4) (1/3)^4 (2/3)^1 = 10/243

Therefore, from Bayes' Law, observer 1 has a confidence level of 80/(80+10) = 8/9 that A is true.

For the second observer:

  • P(C|A) = (20 choose 12) (2/3)^12 (1/3)^8 = 125970 * 2^12/3^20
  • P(C|¬A) = (20 choose 12) (1/3)^12 (2/3)^8 = 125970 * 2^8/3^20

So again from Bayes' Law, observer 2 has a confidence level of 2^12/(2^12 + 2^8) = 16/17 that A is true.

Therefore, observer two has a higher confidence level that 2/3 of the balls are red. The key is to understand how Bayes' Law works. In fact, all that matters is the difference in the number of red and white balls observed. Everything else (specifically the total number of balls drawn) cancels out in the equations.
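To see the cancellation concretely, here is a minimal Python sketch that keeps the binomial coefficients in. The function names `binomial_likelihood` and `confidence` are hypothetical, and the sketch assumes equal priors on A and ¬A and draws with replacement. It needs Python 3.8 or later for `math.comb`.

```python
from fractions import Fraction
from math import comb

def binomial_likelihood(n, k, p):
    """P(exactly k red in n independent draws) when each draw is red with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def confidence(n, red):
    """Posterior probability of A (urn is 2/3 red), assuming equal priors."""
    given_a = binomial_likelihood(n, red, Fraction(2, 3))
    given_not_a = binomial_likelihood(n, red, Fraction(1, 3))
    return given_a / (given_a + given_not_a)

print(binomial_likelihood(5, 4, Fraction(2, 3)))  # 80/243
print(confidence(5, 4))    # 8/9
print(confidence(20, 12))  # 16/17
```

The `comb(n, k)` factor appears in both `given_a` and `given_not_a`, so it drops out of the final ratio, which is why leaving it out gives the same posterior.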

Adam Rosenfield
Adam, if you haven't seen this calculation done with odds and likelihood ratios, take a look at my post. I hope you enjoy it.
A. Rex