views:

255

answers:

5

What is the fastest way in R to compute a recursive sequence defined as

x[1] <- x1 
x[n] <- f(x[n-1])

I am assuming that the vector x of proper length is preallocated. Is there a smarter way than just looping?

Variant: extend this to vectors:

 x[,1] <- x1 
 x[,n] <- f(x[,n-1])
+1  A: 

Well if you need the entire sequence how fast it can be? assuming that the function is O(1), you cannot do better than O(n), and looping through will give you just that.

mfeingold
Yeah, there is usually a little constant ahead of O()...
gappy
And even worse, you can sometimes get O(n^2) in R if you do things like replace the whole object every time you modify part of it. I think the original question was about avoiding that problem...
Harlan
+1  A: 

In general, the syntax x$y <- f(z) will have to reallocate x every time, which would be very slow if x is a large object. But, it turns out that R has some tricks so that the list replacement function [[<- doesn't reallocate the whole list every time. So I think you can reasonably efficiently do:

x[[1]] <- x1 
for (m in seq(2, n))
    x[[m]] <- f(x[[m-1]])

The only wasteful aspect here is that you have to generate an array of length n-1 for the for loop, which isn't ideal, but it's probably not a giant issue. You could replace it by a while loop if you preferred. The usual vectorization tricks (lapply, etc.) won't work here...

(The double brackets give you a list element, which is what you probably want, rather than a singleton list.)

For more details, see Chambers (2008). Software for Data Analysis. p. 473-474.

Harlan
+1  A: 

You could consider writing it in C / C++ / Fortran and use the handy inline package to deal with the compiling, linking and loading for you.

Of course, your function f() may be a real constraint if that one needs to remain an R function. There is a callback-from-C++-to-R example in Rcpp but this requires a bit more work than just using inline.

Dirk Eddelbuettel
Dirk, can one pass an R function pointer to C? I thought not.
gappy
Yes, I just looked at it again in the Rcpp sources. It's a little messy in the docs which I need to reorganise -- look at the end of the example in 'example(RcppExample)' once Rcpp is loaded, and the corresponding code in src/RcppExample.cpp as well as Rcpp.h and Rcpp.cpp. Not trivial, but doable. Would be nice as a timing exercise.
Dirk Eddelbuettel
+2  A: 

In terms of the question of whether this can be fully "vectorized" in any way, I think the answer is probably "no". The fundamental idea behind array programming is that operations apply to an entire set of values at the same time. Similarly for questions of "embarassingly parallel" computation. In this case, since your recursive algorithm depends on each prior state, there would be no way to gain speed from parallel processing: it must be run serially.

That being said, the usual advice for speeding up your program applies. For instance, do as much of the calculation outside of your recursive function as possible. Sort everything. Predefine your array lengths so they don't have to grow during the looping. Etc. See this question for a similar discussion. There is also a pseudocode example in Tim Hesterberg's article on efficient S-Plus Programming.

Shane
+3  A: 

Solve the recurrence relationship ;)

hadley