I have a question which I think involves "conditional entropy" in the field of information theory. I am trying to wrap my head around it, but could use some help. Consider an example in which we have four houses. In the first house there are eight people, four people live in the second house, and there are two people in third house, and two people in the fourth house. So, four houses and sixteen people. If I simply choose one of these people at random, then that choice is a selection from among sixteen people, yielding an information entropy of 4 bits for that choice.
But now consider a two-step selection in which first I choose one house at random, and then I choose one of the people in the selected house. So the first step, that of picking one house from the four houses available, generates two bits of information entropy. But now, in the 25% of the time that I pick the first house, the second step adds another three bits in the choosing of one person from among the eight people in the first house. In another 25% of the cases, I need only another two bits to select one person from the four that live in the second house. And finally, in fully half of the cases, I need only a single bit to pick one person from the pair that lives in either the third or the fourth house.
Somehow, it seems to me, that the weighted average of bit-counts for the two-step approach should generate the same four bit total that the single-step method requires. But I can not get the figures to add up, so clearly there is more to the math than I am considering. I was expecting that you should simply be able to add up the probabilities like so:
(picking a house) + (picking a person in that house) ==
log(4) + [(1/4)*log(8) + (1/4)*log(4) + (1/4)*log(2) + (1/4)*log(2)]
But this produces a result of 3.75 bits, and not the 4 bits that I am expecting. Here is a bit of python that I used to evaluate this.
from math import log
def log2(x):
return log(x,2)
x = log2(4) + ((1.0/4)*log2(8) + (1.0/4)*log2(4) + (1.0/4)*log2(2) + (1.0/4)*log2(2))
print x
So, something is missing from my figures. Can anyone point me in the right direction?
Many thanks, -- jeffh