This question came up today on the manipulatr mailing list.
http://groups.google.com/group/manipulatr/browse_thread/thread/fbab76945f7cba3f
I am rephrasing it here.
Given a distance matrix (calculated with dist), apply a function to the rows of the distance matrix.
Code:
library(plyr)
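N <- 100
a <- data.frame(b = 1:N, c = runif(N))
# dist() stores only the lower triangle; diag/upper only affect printing
d <- dist(a, diag = TRUE, upper = TRUE)
# as.matrix() expands d to the full N x N matrix before summing each row
sumd <- adply(as.matrix(d), 1, sum)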
The problem is that to apply the function by row you have to store the whole matrix (N x N values) instead of just the lower triangular part (N(N-1)/2 values), so it uses too much memory for large matrices. It fails on my computer for matrices of dimension ~10000.
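One direction that might help, as a minimal sketch only: the documentation for dist describes how the lower triangle is laid out as a plain vector, so one row at a time can be read from it without calling as.matrix. The helper name row_of_dist below is hypothetical (it is not part of stats or plyr), and the loop is written in plain R, so it may be slow for large N. It also still needs the dist object itself in memory.

```r
# Hypothetical helper: distances from point i to all other points,
# read straight from the compact 'dist' vector (no full matrix built).
# Per ?dist, for i < j the entry (i, j) sits at position
#   n*(i-1) - i*(i-1)/2 + j - i
row_of_dist <- function(d, i) {
  n <- as.numeric(attr(d, "Size"))   # numeric to avoid integer overflow
  # pairs (j, i) with j < i
  j_lo <- as.numeric(seq_len(i - 1))
  idx_lo <- n * (j_lo - 1) - j_lo * (j_lo - 1) / 2 + i - j_lo
  # pairs (i, j) with j > i
  j_hi <- as.numeric(seq_len(n - i)) + i
  idx_hi <- n * (i - 1) - i * (i - 1) / 2 + j_hi - i
  d[c(idx_lo, idx_hi)]               # the diagonal zero is left out
}

set.seed(1)
N <- 100
a <- data.frame(b = 1:N, c = runif(N))
d <- dist(a)

# Apply the function (here sum) row by row, one row in memory at a time
sumd <- vapply(seq_len(N), function(i) sum(row_of_dist(d, i)), numeric(1))
```

For a small N the result can be compared against rowSums(as.matrix(d)). Any other function of a row could replace sum, as long as it does not need the zero on the diagonal.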
Any ideas?