statistics

Efficient algorithm for detecting different elements in a collection

Imagine you have a set of five elements (A-E) with some numeric values of a measured property (several observations for each element): A = {100, 110, 120, 130} B = {110, 100, 110, 120, 90} C = { 90, 110, 120, 100} D = {120, 100, 120, 110, 110, 120} E = {110, 120, 120, 110, 120} First, I have to detect if there are significant differen...

Cumulative Normal Distribution Function in C/C++

I was wondering if there were statistics functions built into math libraries that are part of the standard C++ libraries like cmath. If not, can you guys recommend a good stats library that would have a cumulative normal distribution function? Thanks in advance. More specifically, I am looking to use/create a cumulative distribution fun...
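A minimal sketch of one common approach, not necessarily what the asker settled on: the normal CDF can be written with the complementary error function, Phi(x) = 0.5 * erfc(-(x - mean) / (stdev * sqrt(2))). The example is in Python for brevity; C++11's <cmath> provides std::erfc, so the same formula should carry over. The name normal_cdf is hypothetical.

```python
import math


def normal_cdf(x, mean=0.0, stdev=1.0):
    """Cumulative distribution function of a normal distribution.

    Hypothetical helper: uses erfc rather than erf to keep precision
    in the lower tail.
    """
    if stdev <= 0:
        raise ValueError("stdev must be positive")
    z = (x - mean) / (stdev * math.sqrt(2.0))
    return 0.5 * math.erfc(-z)


# Half of the probability mass lies below the mean.
print(normal_cdf(0.0))  # 0.5
```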

How to do: Correlation with "blocks" (or - "repeated measures" ?!) ?

Hello dear R people, I have the following setup to analyse: We have about 150 subjects, and for each subject we performed a pair of tests (under different conditions) 18 times. The 18 different conditions of the test are complementary, in such a way that if we were to average over the tests (for each subject), we would get no correl...

What is log-likelihood?

What is Log-likelihood? An example would be great. ...
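Since the question asks for an example, here is a small sketch under stated assumptions: the log-likelihood is the logarithm of the probability of the observed data, viewed as a function of the model parameter. The coin-flip data and the function name bernoulli_log_likelihood below are made up for illustration.

```python
import math


def bernoulli_log_likelihood(p, observations):
    """Log-likelihood of heads-probability p given 0/1 observations.

    Hypothetical helper; p must lie strictly between 0 and 1.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in (0, 1)")
    return sum(math.log(p) if x == 1 else math.log(1.0 - p)
               for x in observations)


# Made-up data: 6 heads (1) and 2 tails (0).
flips = [1, 1, 0, 1, 0, 1, 1, 1]

# Compare candidate values of p; the larger log-likelihood fits better.
for p in (0.25, 0.5, 0.75):
    print(p, bernoulli_log_likelihood(p, flips))
```

For this data the log-likelihood is largest at p = 0.75, the observed fraction of heads, which is the idea behind maximum-likelihood estimation.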

Generating order statistics grouped by order total

Hopefully I can explain this correctly. I have a table of line orders (each line order consists of quantity of item and the price, there are other fields but I left those out.) table 'orderitems': orderid | quantity | price 1 | 1 | 1.5000 1 | 2 | 3.22 2 | 1 | 9.99 3 | 4 | 0.44 3 ...

Stats: Popularity of CSS

Is there anything on this matter on the web? Because I can't seem to find anything. Edit: Sorry about being unspecific. Normally I hate these guys, too. I'm looking for stats on how CSS became popular around 2000-2004. ...

How to generate a Gaussian distribution using a MySQL user-defined function

I like to use MySQL to do quantitative analysis and statistics. I would like to make a MySQL user-defined function of the form: sample_gaussian(mean, stdev) that returns a single randomized value sampled from a Gaussian distribution whose mean and standard deviation are the user-entered arguments. MySQL already has a function rand() tha...
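One standard way to turn uniform random numbers into Gaussian ones is the Box-Muller transform. The sketch below shows the arithmetic in Python rather than as a MySQL function; it assumes only a uniform generator comparable to rand(), and the rng parameter is an illustrative addition so that a seeded generator can be passed in.

```python
import math
import random


def sample_gaussian(mean, stdev, rng=random):
    """Draw one Gaussian sample via the Box-Muller transform.

    rng is any object with a random() method returning a float in [0, 1).
    """
    u1 = 1.0 - rng.random()  # shift to (0, 1] so that log(u1) is defined
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + stdev * z


# Usage with a seeded generator for reproducibility.
seeded = random.Random(42)
print(sample_gaussian(100.0, 15.0, seeded))
```

Only the uniform generator, a logarithm, a square root, and a cosine are needed, so the same two lines of arithmetic could in principle be expressed with SQL math functions.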

Should I remove banned content from my database?

Hi, I'm facing a decision about what to do with data flagged by users. The banned data could be an image, a wiki, a user, or something else that needs to be managed, like a message board. I'd like to work with user stats in many cases, to find users with bad behaviour, users with a lot of activity, users with the best photos and so on with all I c...

Measuring process statistics in Linux

I am building programming contest software. A user's program is received by our judging system and is evaluated by compiling it and running it via a fork() and exec(). The parent process waits for the child (submission's process) to exit, and then cleans it up. To give useful information about the program's run, I want to measure the CP...

Athletic Result Rating System

I'm trying to develop an alternative rating system for athletic results. We're all aware of the traditional first past the post rating system for races. Think of the 100m final in the Olympics. First gets gold, second silver, etc. This system only benefits the top three. In my system, there is a series/league of races, where all eight ...

How to measure or represent change in a value between time periods?

I need to provide some indication of the way a value is changing between time periods. This value could move in any of the following ways: 0 -> some positive value; some positive value -> 0; positive value -> larger positive value; positive value -> smaller positive value. I have initially looked at providing a % change value, however thi...

Most efficient way to count occurrences?

I'm looking to calculate entropy and mutual information a huge number of times in performance-critical code. As an intermediate step, I need to count the number of occurrences of each value. For example: uint[] myArray = [1,1,2,1,4,5,2]; uint[] occurrences = countOccurrences(myArray); // Occurrences == [3, 2, 1, 1] or some permutation...
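As a baseline to compare faster approaches against, here is a hash-based counting sketch in Python using collections.Counter, followed by the entropy computed from those counts. It is not tuned for the performance-critical setting the question describes, and the names count_occurrences and entropy_bits are illustrative.

```python
import math
from collections import Counter


def count_occurrences(values):
    """Map each distinct value to the number of times it appears."""
    return Counter(values)


def entropy_bits(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = count_occurrences(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


my_array = [1, 1, 2, 1, 4, 5, 2]
occurrences = count_occurrences(my_array)
print(sorted(occurrences.values(), reverse=True))  # [3, 2, 1, 1]
print(entropy_bits(my_array))
```

When the values are small non-negative integers, a preallocated array indexed by value usually avoids the hashing overhead entirely.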

Finding 99% coverage in Matlab

I have a matrix in Matlab and I need to find the 99% value for each column, that is, the value such that 99% of the population has a larger value. Is there a function in Matlab for this? ...

Looking for a Histogram Binning algorithm for decimal data

I need to generate bins for the purposes of calculating a histogram. Language is C#. Basically I need to take in an array of decimal numbers and generate a histogram plot out of those. Haven't been able to find a decent library to do this outright so now I'm just looking for either a library or an algorithm to help me do the binning...

How can I get aov to show me the F-statistic and p-value?

The following script #!/usr/bin/Rscript --vanilla x <- c(4.5,6.4,7.2,6.7,8.8,7.8,9.6,7.0,5.9,6.8,5.7,5.2) fertilizer<- factor(c('A','A','A','A','B','B','B','B','C','C','C','C')) crop <- factor(c('I','II','III','IV','I','II','III','IV','I','II','III','IV')) av <- aov(x~fertilizer*crop) summary(av) yields Df Sum Sq Me...

Swing / Java2D statistics and visualisation libraries

Hi everyone, I'm looking for a multifaceted Java2D / Swing visualisation library with which I can render different statistics. Specifically, I'm looking for timeline plotting (with a configurable scrolling and compressing timeline and the ability to chart events at certain points along the timeline), line charts, pie charts, and so on, b...

Storing statistics of multiple data types in SQL Server 2008

I am creating a statistics module in SQL Server 2008 that allows users to save data in any number of formats (date, int, decimal, percent, etc...). Currently I am using a single table to store these values as type varchar, with an extra field to denote the datatype that it should be. When I display the value, I use that datatype field ...

Generating "too perfect" random numbers

A good RNG ought to pass several statistical tests of randomness. For example, uniform real values in the range 0 to 1 can be binned into a histogram with roughly equal counts in each bin, give or take some due to statistical fluctuations. These counts obey some distribution, I don't recall offhand if it's Poisson or binomial or what, ...

How to formally prove that the geometric distribution is the discrete analogue of the exponential one?

How to formally prove that the geometric distribution is the discrete analogue of the exponential one? ...

How to get statistics on all accounts in WHM?

I effectively want to be able to get a list of all my clients in WHM, and compare their hit stats (stored in awstats in cpanel) in order to rank them by popularity. Anybody know how to do this? ...