ansaurus

Question

Avoiding a loop in R

Answer 1

+4 A:

Changing a loop to apply doesn't speed up your code.
You should vectorize the delta_S function or use parallel processing (if you have a multi-core processor).
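
As a rough sketch of what vectorizing means here: the calculation below is a made-up stand-in (a distance between each row and a reference vector), not the asker's delta_S, and the names m and ref are hypothetical. It only shows the general pattern of replacing a row-by-row loop with whole-matrix operations.

```r
# Hypothetical stand-in for a per-row calculation; NOT the asker's delta_S.
# Each row of `m` is compared with a fixed reference vector `ref`.
set.seed(1)
m   <- matrix(runif(20), nrow = 5)   # 5 rows, 4 columns
ref <- c(0.1, 0.2, 0.3, 0.4)

# Loop version: one row at a time.
out_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  out_loop[i] <- sqrt(sum((m[i, ] - ref)^2))
}

# Vectorized version: sweep() subtracts ref[j] from column j for all rows
# at once, and rowSums() collapses each row without an R-level loop.
out_vec <- sqrt(rowSums(sweep(m, 2, ref)^2))

# Both versions should agree up to floating-point tolerance.
all.equal(out_loop, out_vec)
```

Whether delta_S can be rewritten this way depends on its actual formula, which this sketch does not assume.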

Marek 2010-07-23 09:18:37

I agree, try to vectorize your function! Oh, and try to indent your code by using four spaces / return.

ran2 2010-07-23 11:51:53

Answer 2

+3 A:

There are a few SO posts you might want to read on loops/vectorisation in R.

I agree with Marek - try vectorising your function. Another option is to rewrite the time-consuming parts in C or FORTRAN, then load them as shared objects.

nullglob 2010-07-23 09:36:36

Answer 3

+2 A:

You want more speed...

The suggestions to vectorize your function delta_S are all well intentioned and would be great if it could be done. I'm not sure it can, although it's a little hard for me to tell. It seems to me you need to combine columns of a data frame and rows of a matrix in your final outcome. This is going to be time consuming unless you can solve the row or column problems first. I'll get to that in a minute...

Creating gri only requires the following (after the ran variables are defined):

gri <- expand.grid(ran1,ran2,ran3,ran4,ran5)
gri[,6] <- NA
gri <- as.matrix(gri)

That's a lot of lines of code removed right there.

You have several vectors that are essentially constants in your code. They are taken from the database but used as vectors repeatedly (data_base$bck, data_base$Ir, data_base$s1, etc). Each needs to be solved once for that entire for loop. The Qr variable only needs to be solved once for all of Rrecon, not for each line. The denominator only needs to be solved once for all of Rrecon, not for each line... etc. Break the problem down doing all of that first. Then apply to your rows of Rrecon.
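
A minimal sketch of that "solve the constants first" idea follows. The column names bck, Ir and s1 mirror the question, but the data values and the formula (a weighted sum over a fixed denominator) are invented for illustration and are not the real delta_S.

```r
# Hypothetical data; the column names mirror the question, the formula does not.
data_base <- data.frame(bck = c(0.2, 0.4, 0.6),
                        Ir  = c(1.0, 0.9, 0.8),
                        s1  = c(0.3, 0.5, 0.2))
Rrecon <- matrix(c(0.1, 0.2, 0.3,
                   0.4, 0.5, 0.6), nrow = 2, byrow = TRUE)

# Loop-invariant pieces: computed once, before touching any row of Rrecon.
weights     <- data_base$Ir * data_base$s1      # identical for every row
denominator <- sum(data_base$bck * weights)     # identical for every row

# Only the row-dependent part is left inside apply().
result <- apply(Rrecon, 1, function(x) sum(x * weights) / denominator)

# For this particular toy formula the apply() can go too: a matrix product
# computes every row's weighted sum in one call.
result_mat <- as.vector(Rrecon %*% weights) / denominator
```

The point is only the ordering: anything that does not depend on the current row of Rrecon is computed before the row-wise step, not inside it.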

While it is sometimes true that apply doesn't save you much over a for loop there are various apply family commands that are very much faster than others. And they almost all, almost always, do save some time. Some of them save loads of time. You'll also be surprised to discover that applying small functions in a vector like syntax (thus implying many small for loops) is faster than applying a large function in a C like syntax.

Oh, and the short answer to getting rid of your final for loop (one of many that should be removed) is...

gri[,6]<- apply (Rrecon, 1, function(x){
    delta_S(Ro=as.vector(x)
    ,Rr=data_base$bck, Ir=data_base$Ir, S1=data_base$s1
    ,S2=data_base$s2, S3=data_base$s3, S4=data_base$s4
    ,chromaty="tetra")
    })

That may not get you a big speedup by itself. It would be much faster if you were just passing the numerator and denominator. But that would require separate apply family loops beforehand to solve each little sub-calculation in your code (solve delta_f values, then solve numerator, etc).

You might also want to read The R Inferno.

John 2010-07-23 19:16:38

Answer 4

A:

You have a very simple calculation performed many, many times. I don't think this is vectorizable (although maybe if you posted the source formula, someone could do it in ten lines, but it's really hard to reverse engineer from your 100).

Two general suggestions, that I think are good habits:

(1) for every iteration of this loop, which occurs 10^6 times, you're storing a whole set of constants (the e_, ran_, Length, and gri). Try calculating those outside the loop.

(2) you also do a lot of a <- 2*b, c <- 3*d, e <- a/c calculations. Replace these three equations with one e <- (2*b) / (3*d). You will be surprised how much time it saves when done 10^6 times.
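
A small sketch of collapsing intermediate assignments into one expression, using made-up values for b and d (the names num and den are just illustrative). Note the parentheses: in R, `*` and `/` have equal precedence and are evaluated left to right, so dropping them changes the result.

```r
b <- 6
d <- 4

# Three separate steps, each storing an intermediate value.
num     <- 2 * b
den     <- 3 * d
e_steps <- num / den

# One expression with explicit grouping: same value, no intermediates.
e_one <- (2 * b) / (3 * d)

# Without parentheses this is evaluated as ((2 * b) / 3) * d,
# which is a different quantity.
e_unparenthesized <- 2 * b / 3 * d

identical(e_steps, e_one)
```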

There's a lot of good stuff in there that I'm sure helped prototyping and readability, but if you're doing 10^6 iterations, you really need to trim it down.

richardh 2010-07-23 19:51:18
