ansaurus

Question

Iterating over a big matrix containing 3000 rows and calculating the correlation.

Answer 1

A:

Not tested, but something like this should work, I guess:

EDIT: corrected code to avoid huge matrix

correl <- NULL
for (i in 1:nrow(datamatrix))
    {
    # correlate row i with every row; only one result column is held at a time
    correl <- apply(datamatrix, 1, function(x){cor(unlist(datamatrix[i,]), x)})
    write.table(correl, paste("col", i, ".txt", sep=""))
    }

nico 2010-07-30 16:07:42

Hm I fear that doesn't fly. Original Poster claimed `datamatrix` was too big for memory.

Dirk Eddelbuettel 2010-07-30 17:02:17

@Dirk Eddelbuettel: Hmm, that's true. I assumed he was talking about the output matrix, but the input matrix is huge too; I didn't think about that. Wasn't there a package to handle huge matrices in memory, or am I wrong?

nico 2010-07-30 20:43:58

Thanks! I had a problem with the SUSE machine where I want to use this. I will try the code and get back soon.

Ivan 2010-10-06 12:42:07

Answer 2

A:

Thanks Nico! Almost got there after I corrected small bugs. Here I attach my script:

datamatrix=read.table("ref.txt",sep="\t",header=T,row.names=1)
correl <- NULL
for (i in 1:nrow(datamatrix)) {
    correl <- apply(datamatrix, 1, function(x){cor(t(datamatrix[,i]))})
    write.table(correl, paste(row.names(datamatrix)[i], ".txt", sep=""))
}

But I am afraid the function(x) part is a problem. It seems it should be something like t(datamatrix[i,j]), which would calculate the correlation of any two rows.

Actually I need to iterate through the matrix: first cor(row01, row02) to get the correlation between row01 and row02; then cor(row01, row03) to get the correlation between row01 and row03; and so on up to the correlation between row01 and row30000. Now I have the first column, for row01:

Row01 1.000
Row02 0.012
Row03 0.023
Row04 0.820
Row05 0.165
Row06 0.230
Row07 0.376
Row08 0.870

and I save it to the file row01.txt.

Similarly, for Row02 I get:

Row01 0.012
Row02 1.000
Row03 0.023
Row04 0.820
Row05 0.165
Row06 0.230
Row07 0.376
Row08 0.870

and save it to the file row02.txt.
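The row-against-all-rows step described above can be sketched on a toy matrix. This is only a minimal sketch, assuming the data are numeric and already held as a matrix; `m` and `row_cor` are made-up names for illustration, not part of the original script. It relies on `cor(x, y)` correlating the columns of `x` with `y`, so the matrix is transposed to turn each original row into a column.

```r
# Toy data: 4 rows (the things to correlate), 6 observations each.
set.seed(1)
m <- matrix(rnorm(24), nrow = 4,
            dimnames = list(sprintf("Row%02d", 1:4), NULL))

# Hypothetical helper: correlation of row i with every row of m.
# cor() works column-wise, so t(m) turns rows into columns.
row_cor <- function(m, i) {
  cor(t(m), m[i, ])[, 1]
}

r1 <- row_cor(m, 1)  # named vector: Row01, Row02, ...
r2 <- row_cor(m, 2)
```

The entry for a row against itself is 1, and the Row02 entry of `r1` equals the Row01 entry of `r2`, matching the symmetric pattern in the listings above.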

In total I will get 30000 files. It is stupid, but it gets around the memory limit and makes it easy to look up the correlations for a specific row.
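The one-file-per-row scheme could look roughly like the sketch below. It assumes the input itself fits in memory as a numeric matrix with row names, which may not hold for every data set; `write_row_correlations` and `out_dir` are hypothetical names. Only one column of the result exists at any time, whereas a full 30000 x 30000 matrix of doubles would need 30000^2 * 8 bytes, about 7.2 GB.

```r
# Hypothetical helper: for each row of m, write its correlation with
# every row of m to <out_dir>/<rowname>.txt.
write_row_correlations <- function(m, out_dir) {
  stopifnot(is.matrix(m), is.numeric(m), !is.null(rownames(m)))
  tm <- t(m)  # transpose once; cor() correlates columns
  for (i in seq_len(nrow(m))) {
    correl <- cor(tm, m[i, ])           # nrow(m) x 1 matrix
    colnames(correl) <- rownames(m)[i]  # header names the reference row
    write.table(correl,
                file = file.path(out_dir, paste0(rownames(m)[i], ".txt")),
                sep = "\t", quote = FALSE, col.names = NA)
  }
  invisible(nrow(m))
}

# Usage on toy data, writing into a temporary directory.
set.seed(42)
m <- matrix(rnorm(40), nrow = 5,
            dimnames = list(sprintf("Row%02d", 1:5), NULL))
out_dir <- file.path(tempdir(), "row_cor")
dir.create(out_dir, showWarnings = FALSE)
write_row_correlations(m, out_dir)

# Reading one file back gives the correlations for that row.
row02 <- read.table(file.path(out_dir, "Row02.txt"),
                    sep = "\t", header = TRUE, row.names = 1)
```

Each row's correlations are recomputed from scratch here, so every pair is calculated twice; that trades CPU time for the small memory footprint.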

Ivan 2010-07-30 20:52:55
