I know that R works most efficiently with vectors and looping should be avoided. I am having a hard time teaching myself to actually write code this way. I would like some ideas on how to 'vectorize' my code. Here's an example of creating 10 years of sample data for 10,000 non unique combinations of state, plan1 and plan2:
st<-NULL
p1<-NULL
p2<-NULL
year<-NULL
i<-0
starttime <- Sys.time()
while (i<10000) {
for (years in seq(1991,2000)) {
st<-c(st,sample(c(12,17,24),1,prob=c(20,30,50)))
p1<-c(p1,sample(c(12,17,24),1,prob=c(20,30,50)))
p2<-c(p2,sample(c(12,17,24),1,prob=c(20,30,50)))
year <-c(year,years)
}
i<-i+1
}
Sys.time() - starttime
This takes about 8 minutes to run on my laptop. I end up with 4 vectors, each with 100,000 values, as expected. How can I do this faster using vector functions?
As a side note, if I limit the above code to 1000 loops on i it only takes 2 seconds, but 10,000 takes 8 minutes. Any idea why?