Most functions for generating lognormally distributed random numbers take the mean and standard deviation of the associated normal distribution as parameters.
My problem is that I only know the mean and the coefficient of variation of the lognormal distribution. It is reasonably straightforward to derive the parameters I need for the standard functions from what I have:
If mu and sigma are the mean and standard deviation of the associated normal distribution, we know that
coeffOfVar^2 = variance / mean^2
             = (exp(sigma^2) - 1) * exp(2*mu + sigma^2) / exp(mu + sigma^2/2)^2
             = exp(sigma^2) - 1
We can rearrange this to
sigma = sqrt(log(coeffOfVar^2 + 1))
We also know that
mean = exp(mu + sigma^2/2)
This rearranges to
mu = log(mean) - sigma^2/2
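As a sanity check on the algebra alone (no sampling involved), here is a minimal sketch that plugs the derived parameters back into the closed-form lognormal mean and variance. The helper name `lnormParams` is hypothetical, made up for this check and not taken from any package.

```r
# Hypothetical helper: convert the lognormal mean and coefficient of variation
# into mu and sigma of the associated normal distribution.
lnormParams <- function(m, coeffOfVar)
{
  sigma <- sqrt(log(coeffOfVar^2 + 1))
  mu <- log(m) - sigma^2 / 2
  list(mu = mu, sigma = sigma)
}

# Plug the parameters back into the closed-form moments of the lognormal.
p <- lnormParams(2, 50)
theoMean <- exp(p$mu + p$sigma^2 / 2)
theoVar <- (exp(p$sigma^2) - 1) * exp(2 * p$mu + p$sigma^2)
theoCV <- sqrt(theoVar) / theoMean

# Both checks should pass up to floating-point tolerance.
stopifnot(isTRUE(all.equal(theoMean, 2)))
stopifnot(isTRUE(all.equal(theoCV, 50)))
```

If both checks pass, the conversion itself round-trips, so any discrepancy would have to come from the sampled values rather than from the formulas.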
Here's my R implementation
rlnorm0 <- function(mean, coeffOfVar, n = 1e6)
{
  sigma <- sqrt(log(coeffOfVar^2 + 1))
  mu <- log(mean) - sigma^2 / 2
  rlnorm(n, mu, sigma)
}
It works okay for small coefficients of variation
r1 <- rlnorm0(2, 0.5)
mean(r1) # 2.000095
sd(r1) / mean(r1) # 0.4998437
But not for larger values
r2 <- rlnorm0(2, 50)
mean(r2) # 2.048509
sd(r2) / mean(r2) # 68.55871
To check that it wasn't an R-specific issue, I reimplemented it in MATLAB. (This uses the Statistics Toolbox.)
function y = lognrnd0(mean, coeffOfVar, sizeOut)
  if nargin < 3 || isempty(sizeOut)
    sizeOut = [1e6 1];
  end
  sigma = sqrt(log(coeffOfVar.^2 + 1));
  mu = log(mean) - sigma.^2 ./ 2;
  y = lognrnd(mu, sigma, sizeOut);
end
r1 = lognrnd0(2, 0.5);
mean(r1) % 2.0013
std(r1) ./ mean(r1) % 0.5008
r2 = lognrnd0(2, 50);
mean(r2) % 1.9611
std(r2) ./ mean(r2) % 22.61
Same problem. The question is, why is this happening? Is it just that the sample standard deviation is not a robust estimate when the variation is that wide? Or have I screwed up somewhere?
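One way I could probe this, as a rough diagnostic sketch rather than an answer: repeat the simulation under several seeds to see how much the sample coefficient of variation moves between runs, and compare the mean and standard deviation of the logged sample against the mu and sigma passed to the generator. The helper name `sampleCV`, the seeds, and the sample size of 1e5 are arbitrary choices for illustration.

```r
# Same parameter conversion as rlnorm0 above, repeated so this runs on its own.
rlnorm0 <- function(m, coeffOfVar, n = 1e6)
{
  sigma <- sqrt(log(coeffOfVar^2 + 1))
  mu <- log(m) - sigma^2 / 2
  rlnorm(n, mu, sigma)
}

# Hypothetical helper: sample coefficient of variation for one seeded run.
sampleCV <- function(seed, m = 2, coeffOfVar = 50, n = 1e5)
{
  set.seed(seed)
  r <- rlnorm0(m, coeffOfVar, n)
  sd(r) / mean(r)
}

# Spread of the estimate across ten arbitrary seeds.
cvs <- sapply(1:10, sampleCV)
range(cvs)

# On the log scale the sample should be normal with the requested mu and sigma.
set.seed(1)
logR <- log(rlnorm0(2, 50, 1e5))
targetSigma <- sqrt(log(50^2 + 1))
targetMu <- log(2) - targetSigma^2 / 2
rbind(sample = c(mean(logR), sd(logR)),
      target = c(targetMu, targetSigma))
```

If the log-scale figures agree with the targets while the values in `cvs` scatter widely, that would point at the estimator rather than at the parameter conversion.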