
Most looping code looks like this:

retVal <- NULL
for (i in seq_along(vec)) {
  for (j in seq_along(vec)) {
    result <- f(vec[i], vec[j])   # some function of vec[i] and vec[j]
    retVal <- rbind(retVal, result)
  }
}

Since this is so common, is there a systematic way of translating this idiom?

Can this be extended to most loops?

+1  A: 

The first goal should be to get working code. You are there. Then try some simple optimizations, e.g.

retVal <- matrix(NA, ni, nj)     # assuming your result is scalar
for (i in 1:ni)
    for (j in 1:nj)
        retVal[i, j] <- f(vec[i], vec[j])    # some function of yours

will already run much faster as you do not reallocate memory for each i,j combination.

As for the looping, you can start by replacing the inner loop with something from the apply family. I am not aware of anything fully general enough to answer your question -- it depends on what arguments your function takes and what type of return object it produces.
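For instance, a minimal sketch of replacing the inner loop with sapply, assuming a placeholder function f() that returns a scalar and a vector vec (both names are illustrative, not from the post):

retVal <- matrix(NA, length(vec), length(vec))   # preallocate as above
for (i in seq_along(vec))
    retVal[i, ] <- sapply(seq_along(vec), function(j) f(vec[i], vec[j]))   # inner loop replaced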

Dirk Eddelbuettel
Gotcha. One generalizable thing I get from your response is that preallocating a matrix is a good idea when you have two subscripts running over a vector. Another idea might be crossing the original vector with itself to make a matrix and then using matrix operations to get a result matrix (see the sketch below).
Dan Goldstein
Yes, there _may_ be better ways with direct matrix (i.e. two-dimensional) indexing, just as you can, say, exponentiate all elements of a matrix at once. But the pre-allocation effect is already huge -- I have an example with timing in my 'intro to high-performance computing with R' tutorials.
Dirk Eddelbuettel
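A minimal sketch of the idea from these comments, assuming a placeholder function f(x, y) that is vectorised over both arguments and a vector vec (both are illustrative assumptions):

# outer() crosses vec with itself and applies f element-wise,
# producing the whole result matrix without an explicit loop
retVal <- outer(vec, vec, f)
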
+3  A: 

The plyr package provides a set of general tools for replacing looping constructs when you're working with a big data structure, by breaking it into pieces, processing each piece independently, and then joining the results back together.
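For the pairwise idiom in the question, a minimal sketch with plyr, assuming a placeholder function f(x, y) that returns a scalar and a vector vec (both illustrative):

library(plyr)

# split: build the grid of (i, j) index pairs
grid <- expand.grid(i = seq_along(vec), j = seq_along(vec))
# apply and combine: call f on each pair and join the results
# into a data frame with columns i, j and the computed value
retVal <- mdply(grid, function(i, j) f(vec[i], vec[j]))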

hadley