views: 408
answers: 4

Whenever I run large-scale Monte Carlo simulations in S-Plus, I always end up growing a beard while I wait for them to complete.

What are the best tricks for running Monte Carlo simulations in R? Any good examples of running processes in a distributed fashion?

+8  A: 
  • Using multiple cores/machines should be simple if you're just running parallel independent replications, but be aware of common deficiencies of random number generators: if each spawned process seeds its own RNG from the current time, the processes may produce correlated random numbers, which leads to invalid results (see e.g. this paper, and the sketch after this list).

  • You might want to use variance reduction to cut the number of replications needed for a given precision, i.e. to shrink the required sample size (a simple antithetic-variates sketch also follows below). More advanced variance reduction techniques can be found in many textbooks, e.g. in this one.
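
A couple of hedged sketches of the points above. For independent RNG streams, assuming a version of R that ships the parallel package, clusterSetRNGStream() gives each worker its own L'Ecuyer-CMRG stream instead of a wall-clock seed; one_rep here is just a placeholder for your actual replication:

library(parallel)

one_rep <- function(i) mean(rnorm(100))   # placeholder for your real simulation
cl <- makeCluster(4)                      # one worker per core
clusterSetRNGStream(cl, iseed = 42)       # independent L'Ecuyer-CMRG streams per worker
res <- parSapply(cl, 1:10000, one_rep)    # parallel independent replications
stopCluster(cl)
mean(res)

And for variance reduction, a minimal antithetic-variates example with a made-up integrand f (for a standard normal Z, E[exp(Z)] = exp(0.5), so the estimate is easy to check):

nsims <- 10000
z <- rnorm(nsims / 2)
f <- function(z) exp(z)               # hypothetical integrand
est <- mean((f(z) + f(-z)) / 2)       # pair z with -z; the two halves are negatively correlated
est                                   # compare against mean(f(rnorm(nsims)))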

__roland__
+6  A: 

Go read Dirk Eddelbuettel's talks and parallel computing survey.
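
Those materials cover snow/multicore-style splitting of independent replications across cores, among other things. A rough sketch of the idea on a Unix-alike, assuming the parallel package is available (the replication function is again a placeholder):

library(parallel)

# fork-based parallelism; mc.set.seed gives each child its own RNG state
res <- mclapply(1:10000, function(i) mean(rnorm(100)),
                mc.cores = 4, mc.set.seed = TRUE)
mean(unlist(res))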

Jouni K. Seppänen
+3  A: 

Latin Hypercube Sampling is easy to apply and can noticeably improve the precision of the results for a given sample size. Basically you take a Latin hypercube sample from a uniform distribution (e.g., using randomLHS() in the package lhs) and transform it to your desired distribution using, e.g., qnorm(uniformsample).
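
A minimal sketch of that, assuming the lhs package is installed (the sample size and target margins here are arbitrary):

library(lhs)

u <- randomLHS(1000, 3)                  # 1000 points with 3 stratified uniform margins on [0,1]
x1 <- qnorm(u[, 1])                      # standard normal margin
x2 <- qexp(u[, 2], rate = 2)             # exponential margin
x3 <- qunif(u[, 3], min = -1, max = 1)   # uniform on [-1, 1]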

+3  A: 

Preallocate your vectors!

> nsims <- 10000
> n <- 100
> 
> system.time({
     res <- NULL
     for (i in 1:nsims) {
         res <- c(res,mean(rnorm(n)))
     }
 })
   user  system elapsed 
  0.761   0.015   0.783 
> 
> system.time({
     res <- rep(NA, nsims)
     for (i in 1:nsims) {
         res[i] <- mean(rnorm(n))
     }
 })
   user  system elapsed 
  0.485   0.001   0.488 
>
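When the per-replication work is this simple, a further step (beyond the timing comparison above) is to drop the loop entirely and vectorise; a sketch reusing the same n and nsims:

res <- colMeans(matrix(rnorm(n * nsims), nrow = n))   # one column per replication
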
Eduardo Leoni
I used this just yesterday in a model and it decreased my run time by >15%. Certainly worth a line of code.
JD Long
That's a great little trick.
Dan