probability

Representing probability in C++

I'm trying to represent a simple set of 3 probabilities in C++. For example: a = 0.1, b = 0.2, c = 0.7. (As far as I know, probabilities must add up to 1.) My problem is that when I try to represent 0.7 in C++ as a float I end up with 0.69999999, which won't help when I am doing my calculations later. The same goes for 0.8, which becomes 0.80000001. Is...
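
The rounding described in this question is not specific to C++: 0.7 has no exact binary representation in any IEEE 754 floating-point type. The sketch below is in Python rather than C++, and the helper name `as_float32` is hypothetical. It shows one common way to live with the rounding (compare with a tolerance) and one way to avoid it (store the probabilities as exact fractions).

```python
import math
import struct
from fractions import Fraction


def as_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float (like a C++ float)."""
    return struct.unpack("f", struct.pack("f", x))[0]


probs = [0.1, 0.2, 0.7]

# The stored 32-bit value is only the closest representable number to 0.7.
stored = as_float32(0.7)
print(f"{stored:.8f}")

# Compare with a tolerance instead of testing for exact equality.
total_is_one = math.isclose(sum(probs), 1.0, rel_tol=1e-9)
print(total_is_one)

# Exact alternative: keep the probabilities as rational numbers.
exact = [Fraction(1, 10), Fraction(2, 10), Fraction(7, 10)]
exact_total = sum(exact)
print(exact_total == 1)
```

In C++ the same ideas would apply: prefer `double` over `float` for more precision, and compare against a small tolerance rather than with `==`.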

Nth Combination

Is there a direct way of getting the Nth combination of an ordered set of all combinations of nCr? Example: I have four elements: [6, 4, 2, 1]. All the possible combinations by taking three at a time would be: [[6, 4, 2], [6, 4, 1], [6, 2, 1], [4, 2, 1]]. Is there an algorithm that would give me e.g. the 3rd answer, [6, 2, 1], in the o...
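
One way to get the Nth combination directly is to "unrank" the index: at each position, count how many combinations start with the current element and either take it or skip past that whole block. The sketch below assumes a 0-based index and the ordering in which the example lists its combinations (the same order `itertools.combinations` produces); the function name `nth_combination` is hypothetical.

```python
from math import comb


def nth_combination(items, r, index):
    """Return the combination at 0-based `index` without generating the others."""
    n = len(items)
    if not 0 <= index < comb(n, r):
        raise IndexError("combination index out of range")
    result = []
    start = 0
    while r > 0:
        # Number of combinations whose next element is items[start].
        count = comb(n - start - 1, r - 1)
        if index < count:
            result.append(items[start])
            r -= 1
        else:
            # Skip the whole block of combinations that use items[start] here.
            index -= count
        start += 1
    return result


# The "3rd answer" from the example is index 2 when counting from zero.
print(nth_combination([6, 4, 2, 1], 3, 2))
```

The loop runs at most once per input element, so the cost is linear in the number of elements (plus the binomial coefficient computations) rather than in the number of combinations.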

Logic / Probability Question: Picking from a bag

I'm coding a board game where there is a bag of possible pieces. Each turn, players remove randomly selected pieces from the bag according to certain rules. For my implementation, it may be easier to divide up the bag initially into pools for one or more players. These pools would be randomly selected, but now different players would be...

Probability of SHA1 collisions

Given a set of 100 different strings of equal length, how can you quantify the probability that a SHA1 digest collision for the strings is unlikely... ? ...
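
A rough way to quantify this is the birthday approximation, p ≈ 1 - exp(-k(k-1)/(2N)), where k is the number of strings and N = 2^160 is the number of possible SHA1 digests. The sketch below assumes the digests behave like independent uniform random values, which ignores deliberately constructed collisions; the function name is hypothetical.

```python
import math


def collision_probability(num_items: int, digest_bits: int) -> float:
    """Birthday approximation for at least one collision among random digests."""
    space = 2 ** digest_bits
    exponent = -num_items * (num_items - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


p = collision_probability(100, 160)
print(p)
```

Under that assumption the result for 100 strings is far below any practically observable probability.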

probability of deck of cards

for(i=1;i<=n;i++) { pick a random index j between 1 and n inclusive; swap card[i] and card[j]; } For the above code, I am trying to show that the probability of the original card[k] winding up in slot n is 1/n. I guess it's (n-1)/n * 1/(n-1) = 1/n, but can you help me prove this? ...
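
The claim can be checked exactly for small n by enumerating every possible sequence of random choices. The intuition is that the last iteration swaps slot n with a uniformly chosen slot, so whichever card lands there is equally likely to be any card. The sketch below uses 0-based indices instead of the 1-based indices in the pseudocode, and the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def last_slot_distribution(n):
    """Exact probability that each original card ends in the last slot."""
    counts = Counter()
    # Every sequence of n choices of j is equally likely.
    for choices in product(range(n), repeat=n):
        cards = list(range(n))
        for i, j in enumerate(choices):
            cards[i], cards[j] = cards[j], cards[i]
        counts[cards[-1]] += 1
    total = n ** n
    return {card: Fraction(count, total) for card, count in counts.items()}


print(last_slot_distribution(4))
```

This is a check, not a proof, but it agrees with 1/n for every card at each small n tried.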

Estimating/forecasting download completion time

We've all poked fun at the 'X minutes remaining' dialog which seems to be too simplistic, but how can we improve it? Effectively, the input is the set of download speeds up to the current time, and we need to use this to estimate the completion time, perhaps with an indication of certainty, like '20-25 mins remaining' using some Y% con...

How do I compute a PMF and CDF for a binomial distribution in MATLAB?

I need to calculate the probability mass function, and cumulative distribution function, of the binomial distribution. I would like to use MATLAB to do this (raw MATLAB, no toolboxes). I can calculate these myself, but was hoping to use a predefined function and can't find any. Is there something out there? function x = homebrew_binomia...

Implementation of a simple algorithm (to calculate probability)

EDIT: I've got it, thanks for all the help everyone! + Cleaned up post a little bit. Also, this article was very helpful: http://www.codinghorror.com/blog/archives/001204.html?r=1183 Hi all, I've been asked (as part of homework) to design a Java program that does the following: Basically there are 3 cards: Black coloured on bot...

Team matchups for Dota Bot

I have a ghost++ bot that hosts games of Dota (a Warcraft 3 map that is played 5 players versus 5 players) and I'm trying to come up with good formulas to balance the players going into a match based on their records (I have game history for several thousand games). I'm familiar with some of the concepts required to match up players, li...

Efficiently estimating the number of unique elements in a large list

This problem is a little similar to that solved by reservoir sampling, but not the same. I think it's also a rather interesting problem. I have a large dataset (typically hundreds of millions of elements), and I want to estimate the number of unique elements in this dataset. There may be anywhere from a few to millions of unique eleme...

Group detection in data sets

Assume a group of data points, such as one plotted here (this graph isn't specific to my problem, but just used as a suitable example): Inspecting the scatter graph visually, it's fairly obvious the data points form two 'groups', with some random points that do not obviously belong to either. I'm looking for an algorithm that would ...

Why am I getting dups with random.shuffle in Python?

For a list of 10 ints, there are 10! possible orders or permutations. Why does random.shuffle give duplicates after only 5000 tries? >>> L = range(10) >>> rL = list() >>> for i in range(5000): ... random.shuffle(L) ... rL.append(L[:]) ... >>> rL = [tuple(e) for e in rL] >>> len(set(rL)) 4997 >>> for i,t in enumerate(rL): ... ...
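
This is the birthday problem: duplicates become likely long before the number of draws approaches the number of possible outcomes. The sketch below assumes every permutation is equally likely and computes the expected number of matching pairs and the chance of at least one duplicate; the function names are hypothetical.

```python
import math


def expected_duplicate_pairs(draws, outcomes):
    """Expected number of pairs of draws that land on the same outcome."""
    return math.comb(draws, 2) / outcomes


def prob_at_least_one_duplicate(draws, outcomes):
    """Exact probability of at least one repeat among uniform random draws."""
    log_no_duplicate = sum(math.log1p(-i / outcomes) for i in range(draws))
    return -math.expm1(log_no_duplicate)


permutations = math.factorial(10)
print(expected_duplicate_pairs(5000, permutations))
print(prob_at_least_one_duplicate(5000, permutations))
```

With 5000 shuffles of 10 items, a handful of repeats is the expected outcome, which is consistent with the 4997 unique results shown in the question.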

How can I generate random samples from bivariate normal and student T distributions in C++?

Hi, what is the best approach to generate random samples from bivariate normal and student T distributions? In both cases sigma is one, mean 0 - so the only parameter I am really interested in is correlation (and degrees of freedom for student t). I need to have the solution in C++, so I can't unfortunately use already implemented funct...

Select random k elements from a list whose elements have weights

Selecting without any weights (equal probabilities) is beautifully described here. I was wondering if there is a way to convert this approach to a weighted one. I am also interested in other approaches. Update: Sampling without replacement ...
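
One known approach to weighted sampling without replacement is the key-based scheme usually attributed to Efraimidis and Spirakis: give each element the key u^(1/w), where u is uniform on [0, 1) and w is the element's weight, then keep the k elements with the largest keys. The sketch below is not necessarily the method the linked answer describes. It assumes non-negative weights, and the function name is hypothetical.

```python
import heapq
import random


def weighted_sample_without_replacement(items, weights, k, rng=random):
    """Pick up to k distinct items, favouring those with larger weights."""
    keyed = (
        (rng.random() ** (1.0 / weight), item)
        for item, weight in zip(items, weights)
        if weight > 0  # zero-weight items can never be selected
    )
    # Keep the k largest keys; compare on the key only, never on the items.
    best = heapq.nlargest(k, keyed, key=lambda pair: pair[0])
    return [item for _, item in best]


random.seed(0)
print(weighted_sample_without_replacement(list("abcde"), [1, 2, 3, 4, 0], 3))
```

Because it needs only one pass and a heap of size k, the same idea also works on a stream, much like reservoir sampling.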

How can I run a loop against 2 random elements from a list at a time?

Let's say I have a list in Python with several strings in it. I do not know the size. How can I run a loop to do an operation on 2 random elements of this list? What if I wanted to favour a certain subset of the strings in this randomization, to be selected more often, but still make it possible for them to not be chosen? ...

Determining if the difference between two error values is significant

I'm evaluating a number of different algorithms whose job is to predict the probability of an event occurring. I am testing the algorithms on large-ish datasets. I measure their effectiveness using "Root Mean Squared Error", which is the square root of the mean of the squared errors. The error is the difference between the predicte...

Generating a probability distribution

Given an array of size n, I want to generate random probabilities for each index such that Sigma(a[0]..a[n-1]) = 1.

One possible result might be:

0     1     2     3     4
0.15  0.2   0.18  0.22  0.25

Another perfectly legal result can be:

0     1     2     3     4
0.01  0.01  0.96  0.01  0.01

How can I generate these easily and quick...
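
One simple way to do this is to draw n independent positive values and divide each by their sum. If the draws are exponential, the result is uniformly distributed over all valid probability vectors, which is an assumption about what "random" should mean here. The function name below is hypothetical.

```python
import random


def random_distribution(n, rng=random):
    """Return n non-negative values that sum to 1 (up to rounding)."""
    # Exponential draws, once normalised, are uniform over the simplex.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [value / total for value in draws]


random.seed(42)
print(random_distribution(5))
```

Normalising plain uniform draws also sums to 1, but it favours vectors whose entries are similar in size, so results like the second example in the question become rarer.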

How do I evaluate the effectiveness of an algorithm that predicts probabilities?

I need to evaluate the effectiveness of algorithms which predict the probability of something occurring. My current approach is to use "root mean squared error", i.e. the square root of the mean of the errors squared, where the error is 1.0-prediction if the event occurred, or prediction if the event did not occur. The algorithms have n...
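
The error measure described in this question can be written down directly. The sketch below assumes predictions are probabilities in [0, 1] and outcomes are booleans; the function name `rmse` is hypothetical.

```python
import math


def rmse(predictions, outcomes):
    """Root mean squared error of predicted probabilities against outcomes."""
    errors = [
        (1.0 - prediction) if occurred else prediction
        for prediction, occurred in zip(predictions, outcomes)
    ]
    return math.sqrt(sum(error * error for error in errors) / len(errors))


# Always predicting 0.5 gives an error of 0.5 whatever actually happens.
print(rmse([0.5, 0.5], [True, False]))
```

A perfect predictor scores 0 and a predictor that is always confidently wrong scores 1, so lower values are better.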

Probability theory and project planning

Hello, everyone, I'm managing a project that has to be estimated according to rough requirements and specifications. Because of that, the estimates for the specific features and tasks are sets of discrete values, instead of just one discrete value (for example, between 10 and 20, instead of exactly 17). I'm curious, if I want to get a...

Calculating conditional probabilities from joint pmfs in numpy, too slow. Ideas? (python-numpy)

I have a joint probability mass function array, with shape, for example (1,2,3,4,5,6) and I want to calculate the probability table, conditional on a value for some of the dimensions (export the cpts), for decision-making purposes. The code I came up with at the moment is the following (the input is the dictionary "vdict" of the f...