ansaurus

Question

Answer 1

+3 A:

Clearly I should have worked on this for another hour before I posted my question. It's so obvious in retrospect. :)

To use R's vector logic I took out the loop and replaced it with this:

st   <- sample(c(12, 17, 24), 10000, prob = c(20, 30, 50), replace = TRUE)
p1   <- sample(c(12, 17, 24), 10000, prob = c(20, 30, 50), replace = TRUE)
p2   <- sample(c(12, 17, 24), 10000, prob = c(20, 30, 50), replace = TRUE)
year <- rep(1991:2000, 1000)

I can now draw 100,000 samples almost instantaneously. I knew that vectors were faster, but dang. I presume 100,000 iterations of the loop would have taken over an hour, while the vector approach takes <1 second. Just for kicks I made the vectors a million long. It took ~2 seconds to complete. Since I must test to failure, I tried 10 million but ran out of memory on my 2 GB laptop. I switched over to my Vista 64 desktop with 6 GB of RAM and created vectors of length 10 million in 17 seconds. 100 million made things fall apart, as one of the vectors was over 763 MB, which resulted in an allocation issue with R.
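For anyone who wants to reproduce the comparison, here is a rough sketch. The loop version is only an assumption about what the original loop looked like (growing the vector by one element per pass), and n is kept at 10,000 so it finishes quickly. Timings will vary by machine.

```r
# Sketch: loop-and-concatenate versus a single vectorized call.
# vals and wts are the values and weights used in the answer above.
set.seed(1)
n    <- 10000
vals <- c(12, 17, 24)
wts  <- c(20, 30, 50)   # sample() rescales these weights to probabilities

# Loop version (assumed shape of the original code): grow by one each pass.
loop_time <- system.time({
  st_loop <- c()
  for (i in seq_len(n)) {
    st_loop <- c(st_loop, sample(vals, 1, prob = wts))
  }
})

# Vectorized version: one call draws all n values.
vec_time <- system.time({
  st_vec <- sample(vals, n, prob = wts, replace = TRUE)
})

print(loop_time["elapsed"])
print(vec_time["elapsed"])
```

The 763 MB figure is consistent with a numeric vector of 100 million doubles at 8 bytes each: 8e8 / 1024^2 is about 762.9.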

Vectors in R are amazingly fast to me. I guess that's why I am an economist and not a computer scientist.

JD Long 2009-01-13 18:00:54

They look cool. I had never seen the R language before.

Joe Philllips 2009-01-13 18:06:37

JD: Investigate do.call, sapply, lapply, and tapply. These were turning points in R for me. Anonymous functions are useful too.
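A minimal sketch of what those four functions do, using simulated data shaped like the st and year vectors from the answer. The particular summaries computed here (mean per year, share of 24s) are only illustrations, not something from the original question.

```r
# Simulated data in the same shape as the answer's vectors.
set.seed(1)
st   <- sample(c(12, 17, 24), 10000, prob = c(20, 30, 50), replace = TRUE)
year <- rep(1991:2000, 1000)

# tapply: apply a function to st within each year group.
mean_by_year <- tapply(st, year, mean)

# lapply: always returns a list; here, one sorted vector of st values per year.
by_year <- lapply(split(st, year), sort)

# sapply: like lapply, but simplifies the result to a vector when it can.
# The anonymous function computes the share of 24s in each year.
share_24 <- sapply(split(st, year), function(x) mean(x == 24))

# do.call: call a function with a list supplying its arguments.
# Here the ten per-year vectors are row-bound into a 10 x 1000 matrix.
m <- do.call(rbind, by_year)

print(mean_by_year)
print(dim(m))
```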

Vince 2009-09-09 18:11:47

Answer 2

+2 A:

To answer your question about why the loop of 10000 took much longer than your loop of 1000:

I think the primary suspect is the concatenation that happens on every pass through the loop. As the data gets longer, R is probably copying every element of the vector into a new vector that is one longer. Copying a small (500 elements on average) data set 1000 times is fast. Copying a larger (5000 elements on average) data set 10000 times is slower.
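The arithmetic behind that can be sketched as follows, assuming (as the paragraph above does) that each append copies the whole existing vector. The helper name copied is made up for this illustration.

```r
# Total elements copied over n appends, if append number i copies
# the i - 1 elements already in the vector (an assumed cost model).
copied <- function(n) sum(as.numeric(seq_len(n) - 1))

small <- copied(1000)    # 499500 elements
large <- copied(10000)   # 49995000 elements

# Ten times as many iterations means roughly a hundred times the copying.
print(large / small)
```

Under that model the work grows with the square of the loop count, which would explain why the 10000 loop took far more than ten times as long as the 1000 loop.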

David Locke 2009-01-13 22:09:25

That is exactly it. Thank you for pointing that out.

JD Long 2009-01-13 23:14:57

Today I figured out a faster way to add elements to a vector: append. So the year vector would now look like years <- append(years, year, after=length(years))

JD Long 2009-01-14 20:43:18

That's unlikely to be much faster - you need to preallocate.
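A minimal sketch of what preallocating might look like for the years vector, assuming the loop is kept at all. The length n and the 1991 to 2000 cycle mirror the answer's rep(1991:2000, 1000), and the loop body is only an illustration.

```r
n <- 10000

# Allocate the full vector once, instead of growing it inside the loop.
years <- integer(n)

# Fill in place: each assignment writes one slot and copies nothing.
for (i in seq_len(n)) {
  years[i] <- 1991L + (i - 1L) %% 10L
}

# Same result as the fully vectorized version from the answer.
print(identical(years, rep(1991:2000, n / 10)))
```

append(years, year, after = length(years)) still builds a new, longer vector on every call, so it has the same copying cost as c(years, year).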

hadley 2009-07-31 19:41:58


Thinking in Vectors with R
