Input : random vector X=xi, i=1..n.
vector of means for X=meanxi, i=1..n
Output : covariance matrix Sigma (n*n).
Computation :
1) find all cov(xi,xj)= 1/n * (xi-meanxi) * (xj-meanxj), i,j=1..n
2) Sigma(i,j)=cov(xi,xj), symmetric matrix.
Is this algorithm correct and has no side-effects?
views:
99answers:
1
+1
A:
Each xi
should be a vector (random variable) with it's own variance and mean.
Covariance matrix is symmetric, so you just need to compute one half of it (and copy the rest) and has variance of xi at main diagonal.
S = ...// your symmetric matrix n*n
for(int i=0; i<n;i++)
S(i,i) = var(xi);
for(j = i+1; j<n; j++)
S(i,j) = cov(xi, xj);
S(j,i) = S(i,j);
end
end
where variance (var) of xi:
v = 0;
for(int i = 0; i<xi.Count; i++)
v += (xi(i) - mean(xi))^2;
end
v = v / xi.Count;
and covariance (cov)
cov(xi, xj) = r(xi,xj) * sqrt(var(xi)) * sqrt(var(xj))
where r(xi, xj)
is Pearson product-moment correlation coefficient
EDIT
or, since cov(X, Y) = E(X*Y) - E(X)*E(Y)
cov(xi, xj) = mean(xi.*xj) - mean(xi)*mean(xj);
where .*
is Matlab-like element-wise multiplication.
So if x = [x1, x2]
, y = [y1, y2]
then z = x.*y = [x1*y1, x2*y2]
;
Gacek
2010-07-22 09:23:26
The diagonal contains the variances.
Henrik
2010-07-22 09:34:30
Yes, you are right. Corrected
Gacek
2010-07-22 09:36:42
Why do you define covariance in terms of correlation? Usually it's done the other way round.
Henrik
2010-07-22 09:52:33
@Gacek: I agree with Henrik. What you have written is correct, however, the correlation is defined in terms of covariance, so it is redundant to calculate the correlation and then obtain covariance from it.
Il-Bhima
2010-07-22 11:01:32
Since i've brought vector of means back (was computed previously, i missed that), variance is computed easily as sum(x(i)-mean(i)/n). Now what about covariance itself - are there any ways to compute it simpler? Btw, in your formula sj is mis-typed xj i guess.
Singularity
2010-07-22 11:59:03
Right. I simplified the formula. I was not sure if that is correct, so I posted the solution with Pearson at the first moment. But I checked it in Matlab and it seems to work fine.
Gacek
2010-07-22 14:47:48
Ok, even if in my case of x=y={x1,x2,..} this formula would work, what should we do with the mean of product in brackets? mean(xi) is available, but not mean(xi*xj)... All i have in the beginning is X={1,2,3....} - vector of single dimension random variables (numbers), and vector EX={0.1,0.2,0.3,..} composed by means for each xi.
Singularity
2010-07-22 16:25:14
p.s. Dug deeper into theory, found a small remark - X is "centered vector random process with mathematical expectation of zero".
Singularity
2010-07-22 16:40:49