ansaurus

Question

R: How do I best simulate an arbitrary univariate random variate using its probability function?

Answer 1

+2 A:

Use the cumulative distribution function (CDF): http://en.wikipedia.org/wiki/Cumulative%5Fdistribution%5Ffunction

Then use its inverse. For a picture of a CDF, see http://en.wikipedia.org/wiki/Normal%5Fdistribution

That means: pick a random number uniformly from [0,1], treat it as a value of the CDF, and look up the value of the variable at which the CDF equals that number.

The inverse of the CDF is also called the quantile function.
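
As a minimal sketch of those steps, assume the target is an exponential distribution with rate 2. It is picked here only because its CDF inverts in closed form, and the names rate, q_exp and samples are illustrative, not part of the question:

```r
set.seed(1)                              # reproducible draws
rate <- 2                                # assumed example parameter
# CDF: F(x) = 1 - exp(-rate * x); solving F(x) = u for x gives the quantile function
q_exp <- function(u) -log(1 - u) / rate
u <- runif(10000)                        # uniform draws on [0, 1]
samples <- q_exp(u)                      # push them through the inverse CDF
mean(samples)                            # should be close to 1 / rate
```

For distributions that R already knows, the same idea is a one-liner such as qnorm(runif(n)).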

ralu 2009-10-20 12:22:22

Answer 2

+1 A:

You could use Metropolis-Hastings to get samples from the density.

Jonathan Chang 2009-10-20 14:15:24

Answer 3

+5 A:

Here is a (slow) implementation of the inverse cdf method when you are only given a density.

den <- dnorm  # replace with your own density

# calculates the cdf by numerical integration
cdf <- function(x) integrate(den, -Inf, x)[[1]]

# inverts the cdf: expands a bracket around starting.value until it
# contains the root of cdf(y) = x, then solves for it with uniroot
inverse.cdf <- function(x, cdf, starting.value = 0) {
 lower <- starting.value
 while (cdf(lower) >= x)  # step left until cdf(lower) < x
  lower <- lower - (lower - starting.value)^2 - 1
 upper <- starting.value
 while (cdf(upper) <= x)  # step right until cdf(upper) > x
  upper <- upper + (upper - starting.value)^2 + 1
 uniroot(function(y) cdf(y) - x, c(lower, upper))$root
}

# generates 1000 random variables of distribution 'den'
vars <- sapply(runif(1000), inverse.cdf, cdf = cdf)
hist(vars)
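
If the root-finding version is too slow, one possible shortcut is to tabulate the CDF once on a grid and invert it by interpolation. This is only a sketch: the grid range (-8 to 8) and resolution are assumptions that suit dnorm and would need adjusting for another density, and the names grid, cdf_grid and inv_cdf are made up for illustration.

```r
set.seed(1)
den <- dnorm                              # replace with your own density
grid <- seq(-8, 8, length.out = 4001)     # assumed range holding nearly all the mass
d <- den(grid)
# cumulative trapezoid rule gives the CDF at each grid point
cdf_grid <- c(0, cumsum((d[-1] + d[-length(d)]) / 2 * diff(grid)))
cdf_grid <- cdf_grid / cdf_grid[length(cdf_grid)]  # normalise so the last value is 1
# interpolate x as a function of the CDF value: an approximate quantile function
inv_cdf <- approxfun(cdf_grid, grid, ties = "ordered")
vars <- inv_cdf(runif(1000))
summary(vars)
```

The accuracy is limited by the grid, so the tails are only as good as the assumed range allows.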

Ian Fellows 2009-10-20 15:53:52

Answer 4

+3 A:

To clarify the "use Metropolis-Hastings" answer above:

Suppose ddist() is your probability density function.

Then something like:

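n <- 10000
cand.sd <- 0.1                # sd of the normal candidate distribution
init <- 0
vals <- numeric(n)
vals[1] <- init
oldprob <- ddist(init)        # density at the current value (must be > 0)
for (i in 2:n) {
    newval <- rnorm(1, mean = vals[i-1], sd = cand.sd)
    newprob <- ddist(newval)
    if (newprob > oldprob || runif(1) < newprob/oldprob) {
        vals[i] <- newval
        oldprob <- newprob    # update only when the candidate is accepted
    } else vals[i] <- vals[i-1]
}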

Notes:

completely untested
efficiency depends on candidate distribution (i.e. value of cand.sd). For maximum efficiency, tune cand.sd to an acceptance rate of 25-40%
results will be autocorrelated ... (although I guess you could always sample() the results to scramble them)
may need to discard a "burn-in", if your starting value is weird
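
The notes above can be checked on an actual chain. The sketch below runs a sampler of the same kind and then measures the acceptance rate, drops a burn-in and looks at the autocorrelation. The target (dnorm), the value cand.sd = 3, the burn-in of 1000 and the thinning step of 10 are all assumptions chosen for illustration.

```r
set.seed(1)
ddist <- dnorm                  # assumed stand-in for your density
n <- 10000
cand.sd <- 3                    # assumed candidate sd; tune it for your own target
vals <- numeric(n)              # chain starts at 0
oldprob <- ddist(vals[1])
accepted <- logical(n)
for (i in 2:n) {
    newval <- rnorm(1, mean = vals[i - 1], sd = cand.sd)
    newprob <- ddist(newval)
    accepted[i] <- runif(1) < newprob / oldprob   # Metropolis acceptance test
    if (accepted[i]) {
        vals[i] <- newval
        oldprob <- newprob
    } else vals[i] <- vals[i - 1]
}
acc.rate <- mean(accepted[-1])          # adjust cand.sd until this is roughly 25-40%
kept <- vals[-(1:1000)]                 # discard an assumed burn-in of 1000 draws
lag1 <- acf(kept, plot = FALSE)$acf[2]  # lag-1 autocorrelation of the kept draws
thinned <- kept[seq(1, length(kept), by = 10)]  # thinning reduces autocorrelation
```

Note that sample() only reorders the draws; thinning is what lowers the dependence between neighbouring values.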

The classical approach to this problem is rejection sampling (see e.g. Press et al., Numerical Recipes).
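
A rough sketch of rejection sampling, under stated assumptions: the target is a Beta(2, 5) density on [0, 1], the proposal is uniform on [0, 1], and M = 2.5 is a bound on the density (its maximum is about 2.46). The names ddist, M and rejection_sample are illustrative. For your own density you need a proposal you can sample from and a constant M such that the density never exceeds M times the proposal density.

```r
set.seed(1)
ddist <- function(x) dbeta(x, 2, 5)      # assumed example target on [0, 1]
M <- 2.5                                 # assumed bound: ddist(x) <= M * dunif(x)
rejection_sample <- function(n) {
    out <- numeric(0)
    while (length(out) < n) {
        cand <- runif(n)                 # candidates from the uniform proposal
        u <- runif(n)
        # keep each candidate with probability ddist(cand) / M
        out <- c(out, cand[u * M <= ddist(cand)])
    }
    out[seq_len(n)]                      # return exactly n accepted draws
}
vars <- rejection_sample(1000)
mean(vars)                               # should be close to 2 / 7
```

Unlike Metropolis-Hastings, the accepted draws are independent, but a loose bound M wastes many candidates.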

Ben Bolker 2009-10-24 18:56:37
