ansaurus

Question

Answer 1

+1 A:

The inner loop could be vectorized:

cluster[i,4] <- paths[max(which(data[i]==paths[,1])),2]

But check Musa's comment; I think you intended something else.
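
The question's code is not reproduced on this page, so the following is only a sketch under an assumed shape: `data` is a vector of 10 keys, `paths` is a 100 x 2 matrix of (key, value) rows, and `cluster` is a 10 x 4 matrix whose fourth column receives the value. Under those assumptions, the double loop and the `max(which(...))` version can be compared like this. The `length(hits) > 0` guard is an addition, since `max()` of an empty vector returns `-Inf` with a warning.

```r
set.seed(1)
data  <- 1:10                                      # assumed: 10 keys
paths <- cbind(sample(1:15, 100, replace = TRUE),  # assumed: key column
               seq_len(100))                       # assumed: value column

# Assumed original form: double loop, the last matching row wins
loop_sol <- matrix(NA_real_, nrow = 10, ncol = 4)
for (i in seq_along(data)) {
  for (j in seq_len(nrow(paths))) {
    if (data[i] == paths[j, 1]) loop_sol[i, 4] <- paths[j, 2]
  }
}

# Inner loop replaced by which()/max(): take the last matching row directly
vect_sol <- matrix(NA_real_, nrow = 10, ncol = 4)
for (i in seq_along(data)) {
  hits <- which(data[i] == paths[, 1])
  if (length(hits) > 0) vect_sol[i, 4] <- paths[max(hits), 2]
}

identical(loop_sol, vect_sol)
```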

The second (outer) loop could be vectorized too, by replicating vectors, but:

if i only goes up to 100, the speed-up won't be large
it will need more RAM

[edit] If I understood your comment correctly, can't you just use logical indexing?

indx <- data==paths[, 1]
cluster[indx, 4] <- paths[indx, 2]
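
A small sketch of how those two lines behave, on made-up values. Since `==` compares element by element, it assumes that `data` and `paths[, 1]` have the same length and are aligned row by row; that is an assumption here, not something the question confirms.

```r
data    <- c(1, 2, 3, 4)                 # hypothetical keys
paths   <- cbind(c(1, 9, 3, 9),          # hypothetical key column, same length as data
                 c(10, 20, 30, 40))      # hypothetical value column
cluster <- matrix(NA_real_, nrow = 4, ncol = 4)

indx <- data == paths[, 1]               # TRUE where row i of paths has the key data[i]
cluster[indx, 4] <- paths[indx, 2]       # copy the value only for those rows

cluster[, 4]
```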

Marek 2010-06-02 12:23:30

Answer 2

+1 A:

I think that both loops can be vectorized using the following:

cluster[na.omit(match(paths[1:100,1],data[1:10])),4] = paths[!is.na(match(paths[1:100,1],data[1:10])),2]
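
As a rough illustration of what that expression does, here is a sketch on small made-up inputs (the sizes and values are assumptions, not the asker's data). `match(x, table)` returns, for each element of `x`, the position of its first match in `table`, or `NA` when there is none, so each row of `paths` is looked up in `data`.

```r
data    <- c(5, 7, 9)                    # hypothetical keys
paths   <- cbind(c(7, 1, 5, 2),          # hypothetical key column
                 c(70, 10, 50, 20))      # hypothetical value column
cluster <- matrix(NA_real_, nrow = 3, ncol = 4)

pos <- match(paths[, 1], data)           # target row of cluster per path row, NA if no match
cluster[na.omit(pos), 4] <- paths[!is.na(pos), 2]

cluster[, 4]
```

Computing `pos` once also avoids calling `match()` twice, as the one-liner does.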

gd047 2010-06-03 08:20:45

I wonder how the performance of your vectorized solution compares to the looping alternative.

Guido 2010-06-04 06:59:35

@Guido In this particular case it's hard to say, because the results from the original loop and gd047's solution differ, but in general the difference between a loop and vectorized code can be huge. Check my answer to http://stackoverflow.com/questions/2908822/speed-up-the-loop-operation-in-r, where you can go from hours to less than a second.

Marek 2010-06-04 19:40:10

@Marek Using randomized test matrices I got equal cluster matrices with both methods. I checked the results using `all.equal(loop_sol,vect_sol)`. Which test matrices did you use that gave you different results?

gd047 2010-06-04 20:25:58

@gd047 Check this http://sites.google.com/site/fsh9rss8heh/ (too long for comment), I use R-2.10.1

Marek 2010-06-04 22:15:28

@Marek Thanks. You are right. In my examples there was never more than one match between data[i] and paths[j,1]. In the general case where there is more than one, the loop keeps the one that is checked last. I am not sure which one wins in the vectorized version. Do you have any idea?

gd047 2010-06-05 06:17:57

@gd047 As stated in `help("match")`, `match` returns the **positions of (first) matches**, so you could write your own version using `rev`: `match_last <- function(x,y) length(y)-match(x,rev(y))+1`
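
A quick sketch of how such a helper behaves next to plain `match()`, using a hypothetical lookup vector `y`:

```r
# Helper from the comment above: position of the last match of x in y
match_last <- function(x, y) length(y) - match(x, rev(y)) + 1

y <- c("a", "b", "a", "c", "b")          # hypothetical lookup table

match(c("a", "b", "z"), y)               # first matches: 1 2 NA
match_last(c("a", "b", "z"), y)          # last matches:  3 5 NA
```

Reversing `y` turns the last occurrence into the first one, and `length(y) - j + 1` maps a position `j` in the reversed vector back to the original position.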

Marek 2010-06-05 12:24:41

@Marek Nice idea but there's still a difference.

gd047 2010-06-05 14:28:41


Avoid the use of loops (for) with R
