views:

157

answers:

3

Let's say I have 20 players [names A .. T] in a tournament. The rules of the tournament state that each player plays every other player twice [A vs B, B vs A, A vs C .. etc]. With 20 players, there will be a total of 380 matches.

In each match, there are three possible outcomes - player 1 wins, player 2 wins, or draw. There's a betting exchange which, ahead of each match, quotes the probabilites of each outcome occuring; so you might have 40% player 1 wins, 30% player 2 wins, 30% draw [probabilities sum to 100%]; I store these probabilities ahead of each match.

Fast forward one quarter of the way through the tournament. I have collected probabilities for 95 games, with 285 still to go. What I want to know is -

Can the probability data from the 95 games be used to predict probabilities for the remaining 285 ?

For example, if I know A vs B and B vs C, can I use them to infer A vs C ?

And if so, how do I do it ?

+3  A: 

Let me introduce you to my good friend Bayes... http://en.wikipedia.org/wiki/Bayesian%5Finference

Edit: Part 1) Bayes will only work for non-independent trials. If winning one game somehow increases your probability of winning the next, you can carry on! Otherwise this isn't very helpful at all.

Edit: Part 2) Regardless, the base is the following Bayes' Formula.

P(A|B) = P(B|A) P(A)
         -----------
             P(B)

Which is read, "The probability of A given B is equal to Prob. B given A times Prob of A all over Prob. of B". To illustrate this, the car salesman with 3 doors problem is often given.

You have 3 doors and behind one door there's a brand new car. The other two doors have absolutely nothing. The host then asks you to pick a door. Remember, there is door 'A', 'B' and 'C'. Therefore, you have a 1/3 probability of being correct.

The host, being a generous guy, opens one of the other doors. He now gives you the option of either sticking with the same door or opening the other door.

I realized that explaining this in a Stackoverflow reply would take forever and just googled it. This is the Monty Hall problem: http://en.wikipedia.org/wiki/Monty%5FHall%5Fproblem. http://en.wikipedia.org/wiki/Monty%5FHall%5Fproblem#Bayesian%5Fanalysis for the Bayes section.

Edit: Part 3) You may want to look up 'Bayesian Networks' if you decide this sort of approach can work (but on a much grander scheme)

Malaxeur
Can you explain how it would apply to this particular problem ?
Justin
There are a lot of examples on the page, but the premise is that given certain probabilities you can 'infer' other probabilities given certain conditions. It's very vague, but it works. I'd reply in the comment however take a look at the edit for more information
Malaxeur
Re: Edit 1, What if he were modeling each individual match instead of the tournament as a whole?
Joe Holloway
A: 

You'll probably be able to make some semi-decent predictions for most games. For example, if you have chess players A, B, and C, where A beats B and B beats C, A will probably beat C too. However, there are some cases where that won't work out at all. To give a simple counterexample, if it's a rock-paper-scissors contest, and A always picks rock, B picks scissors, and C picks paper, obviously you're not going to get the same type of correlation.

Your best bet is to test things out with a small subset if you're able, maybe using some preexisting data if you can find some. Read in 1/4 of the cases, make your predictions based on that set, and see how well the predictions work out.

Randy
+2  A: 

You may or may not be able to predict game outcomes, depending on the games. I believe what you're looking at is still an active research area, but there are reasonable solutions out there. Basically you're hoping that you can rank the players, such that a player of a higher rank will generally beat a player of a lower rank. Different models tweak this a bit, e.g. with the probability of winning being a function of the difference in rank.

One approach is to use simulated annealing to find these ranks. Pick some function for the game outcome as a function of the ranks of the players, and let the fitness of a given rank assignment be the probability of the observed outcome given the chosen ranks. Repeat with different ranks, as per simulated annealing.

redtuna
Right, this is precisely what I had been thinking. One correction: I'm not predicting the game outcomes, I'm predicting the market's expectation of the outcomes
Justin