ansaurus

Question

Fitting a density curve to a histogram in R

Answer 1

+8 A:

If I understand your question correctly, then you probably want a density estimate along with the histogram:

X <- c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))
hist(X, prob=TRUE)            # prob=TRUE for probabilities not counts
lines(density(X))             # add a density estimate with defaults
lines(density(X, adjust=2), lty="dotted")   # add another "smoother" density
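To see why the second curve is smoother, it may help to compare the bandwidths directly. This is a minimal sketch reusing the same X; the names d1 and d2 are only illustrative. In density(), the adjust argument multiplies the selected bandwidth, so adjust=2 doubles it.

```r
X <- c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))

d1 <- density(X)              # default bandwidth
d2 <- density(X, adjust=2)    # same bandwidth rule, multiplied by 2

# Compare the two bandwidths; the second is twice the first
c(default = d1$bw, doubled = d2$bw)
```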

Dirk Eddelbuettel 2009-09-30 12:02:42

Answer 2

+2 A:

Here's the way I do it:

foo <- rnorm(100,mean=1,sd=2)
hist(foo,prob=TRUE)
curve(dnorm(x,mean=mean(foo),sd=sd(foo)),add=TRUE)

A bonus exercise is to do this with the ggplot2 package ...
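One possible way to approach that exercise is sketched below. It assumes ggplot2 3.3.0 or later (for after_stat()); the data frame name foo_df, the seed, and the choice of 15 bins are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

set.seed(1)                                   # reproducible illustrative data
foo_df <- data.frame(foo = rnorm(100, mean = 1, sd = 2))

p <- ggplot(foo_df, aes(x = foo)) +
  # histogram on the density scale so the curve is comparable
  geom_histogram(aes(y = after_stat(density)), bins = 15) +
  # overlay a normal density using the sample mean and sd
  stat_function(fun = dnorm,
                args = list(mean = mean(foo_df$foo), sd = sd(foo_df$foo)))
print(p)
```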

John Johnson 2009-09-30 13:32:39

However, if you want something that is skewed, you can either do the density example from above, transform your data (e.g. foo.log <- log(foo), which requires all values to be positive, and try the above), or try fitting a skewed distribution, such as the gamma or lognormal (lognormal is equivalent to taking the log and fitting a normal, btw).

John Johnson 2009-09-30 13:35:25

But that still requires estimating the parameters of your distribution first.

Dirk Eddelbuettel 2009-09-30 13:48:15

This gets a bit far afield from simply discussing R, as we are getting more into theoretical statistics, but you might try this link for the Gamma: http://en.wikipedia.org/wiki/Gamma_distribution#Parameter_estimation

For lognormal, just take the log (assuming all data is positive) and work with log-transformed data. For anything fancier, I think you would have to work with a statistics textbook.
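As a rough illustration of the parametric route, here is a sketch using method-of-moments estimates for the gamma and log-scale mean and standard deviation for the lognormal. The sample bar is simulated and hypothetical, and method of moments is just one of several estimation approaches (maximum likelihood, for example via MASS::fitdistr, is another).

```r
set.seed(42)                               # reproducible illustrative data
bar <- rgamma(200, shape = 2, rate = 0.5)  # hypothetical positive, right-skewed sample

# Method-of-moments estimates for the gamma distribution
shape_hat <- mean(bar)^2 / var(bar)
rate_hat  <- mean(bar) / var(bar)

# Lognormal: estimate the mean and sd on the log scale
meanlog_hat <- mean(log(bar))
sdlog_hat   <- sd(log(bar))

hist(bar, prob = TRUE, breaks = 20)
curve(dgamma(x, shape = shape_hat, rate = rate_hat), add = TRUE)
curve(dlnorm(x, meanlog = meanlog_hat, sdlog = sdlog_hat),
      add = TRUE, lty = "dotted")
```

Both curves still depend on the assumption that the chosen family suits the data; the histogram is the check on that.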

John Johnson 2009-09-30 14:45:37

I think you are missing that the original poster and all the other answers are quite content to use non-parametric estimates, such as an old-school histogram or a somewhat more modern data-driven density estimate. Parametric estimates are great if you have good reason to suspect a particular distribution, but that was not the case here.

Dirk Eddelbuettel 2009-09-30 19:25:58

Answer 3

+2 A:

This is easy with ggplot2:

library(ggplot2)
dataset <- data.frame(X = c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4)))
ggplot(dataset, aes(x = X)) + geom_histogram(aes(y = after_stat(density))) + geom_density()   # after_stat() replaces the deprecated ..density.. (ggplot2 >= 3.3.0)

or to mimic the result from Dirk's solution

ggplot(dataset, aes(x = X)) + geom_histogram(aes(y = after_stat(density)), binwidth = 5) + geom_density()   # after_stat() replaces the deprecated ..density.. (ggplot2 >= 3.3.0)

Thierry 2009-09-30 18:30:09
