views:

185

answers:

1

In a MySQL (5.1) database table there is data that represents:

  • how long a user takes to perform a task and
  • how many items the user handled during the task.

Would MySQL support correlating the data or do I need to use PHP/C# to calcuate?

Where would I find a good formula to calculate correlation (it's been a long time since I last did this)?

+2  A: 

Here's a rough implementation of the sample correlation coefficient as described in:

Wikipedia - Correlation and Dependence

create table sample( x float not null, y float not null );
insert into sample values (1, 10), (2, 4), (3, 5), (6,17);

select @ax := avg(x), 
       @ay := avg(y), 
       @div := (stddev_samp(x) * stddev_samp(y))
from sample;

select sum( ( x - @ax ) * (y - @ay) ) / ((count(x) -1) * @div) from sample;
+---------------------------------------------------------+
| sum( ( x - @ax ) * (y - @ay) ) / ((count(x) -1) * @div) |
+---------------------------------------------------------+
|                                       0.700885077729073 |
+---------------------------------------------------------+
Martin
Thanks Martin.Works great - I got a correlation of .39 - a little weak but on the right track.
John M