Dear R users,

I have a set of data (1000+ animals) from two seasons (winter and summer) and would like to demonstrate the differences in the gestation length (days) pattern in these two seasons. My data is similar to this:

id <- c(1,2,3,4,5,6,7,8,9,10)
season <- c(1,1,2,2,1,2,1,1,2,1)
gest <- c(114,NA,123,116,NA,120,110,NA,116,119)

data <- cbind(id,season,gest)

I would like to have something like this:

http://had.co.nz/ggplot2/graphics/55078149a733dd1a0b42a57faf847036.png

OR any similar form of graph that would give me a good contrast.

Thank you for all your help,

Bazon

+1  A: 

There is a chart type commonly used for demographic data, in particular for directly contrasting two groups when you want to emphasize the comparison of subgroups that appear in both groups and are identical along all variables other than the one that defines the groups. In the demographics context, the most common application is the age structure of males versus females. This seems like a good candidate for visualizing your data.

The plot shown below was created using base graphics in R and the (excellent) R package SVGAnnotation, by Duncan Temple Lang, which creates the interactive elements by re-rendering the image in SVG and post-processing the resulting XML.

(Although the plot was created using R and SVGAnnotation, the image below is from a UK Government site.)

[image: the plot described above]
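A static back-to-back plot of this kind can be approximated with base graphics alone. The sketch below is not the code behind the image; it is a minimal version that uses the example data from the question, with 5-day bins chosen arbitrarily and no interactive elements.

```r
# Example data from the question
season <- c(1,1,2,2,1,2,1,1,2,1)
gest <- c(114,NA,123,116,NA,120,110,NA,116,119)

# Bin gestation length with shared breaks (5-day bins are an arbitrary choice)
ok <- !is.na(gest)
brks <- seq(110, 125, by=5)
bins <- cut(gest[ok], breaks=brks, include.lowest=TRUE)
counts <- table(bins, season[ok])

# Season 1 goes to the left (negative counts), season 2 to the right
lim <- max(counts)
barplot(-counts[, "1"], horiz=TRUE, xlim=c(-lim, lim), col="blue",
        axes=FALSE, las=1, xlab="Number of animals",
        main="Length of gestation (days)")
barplot(counts[, "2"], horiz=TRUE, add=TRUE, col="red",
        axes=FALSE, axisnames=FALSE)

# Label the count axis with absolute values on both sides
ticks <- pretty(c(-lim, lim))
axis(1, at=ticks, labels=abs(ticks))
legend("topright", legend=c("Season 1", "Season 2"), fill=c("blue", "red"))
```

With the full dataset, narrower bins (for example 1 or 2 days) would probably show the pattern better.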

doug
doug, would you be able to post some code to generate that? Also, what's up with those spikes at age 15 and 40? Reminds me of this article I was reading the other day: http://blog.revolutionanalytics.com/2010/10/old-wives.html
nico
Hi doug, would you be able to post the code for that amazing graph above, based on the example data provided?
Bazon
It's cool-looking, but I actually think the density plots above would be better for comparing your (Bazon's) data.
Ben Bolker
A: 

The plot you linked was made with ggplot2. I'm not very good at using it, so I'll show you how to do it with base graphics:

# cbind() returns a matrix, so convert it to a data frame first
data <- as.data.frame(data)
# Estimate one density per season, dropping the missing gestation lengths
d1 <- density(data$gest[which(data$season==1)], na.rm=TRUE)
d2 <- density(data$gest[which(data$season==2)], na.rm=TRUE)
# Set the axis limits so that both curves fit in the plot
plot(d1, ylim=c(0, max(d1$y,d2$y)), xlim=range(c(d1$x, d2$x)),
  main="Length of gestation", xlab="Length (days)", col="blue", lwd=2)
polygon(d1$x, d1$y, col=rgb(0, 0, 1, 0.5), lty=0)
lines(d2, col="red", lwd=2)
polygon(d2$x, d2$y, col=rgb(1, 0, 0, 0.5), lty=0)

Alternatively, check out the densityplot function in the lattice package, although I'm not sure how to fill the area under the curves with it.
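For reference, a minimal lattice sketch might look like the following. It draws one unfilled curve per season and does not attempt the filling; the data frame is rebuilt from the question's example so the snippet runs on its own.

```r
library(lattice)

# Example data from the question; season is turned into a factor for grouping
df <- data.frame(season=factor(c(1,1,2,2,1,2,1,1,2,1)),
                 gest=c(114,NA,123,116,NA,120,110,NA,116,119))

# One density curve per season; plot.points=FALSE hides the rug of raw points
p <- densityplot(~ gest, data=df, groups=season, plot.points=FALSE,
                 auto.key=TRUE, xlab="Length (days)",
                 main="Length of gestation")
print(p)
```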

PS: is your dataset that small? Density plots are probably NOT the way to go if that is the case (a scatterplot would be better)
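For a dataset as small as the example, one way to show every observation is a jittered strip chart, which is one possible reading of "scatterplot" here. This is only a sketch; the colours and labels are arbitrary choices.

```r
# Example data from the question
season <- c(1,1,2,2,1,2,1,1,2,1)
gest <- c(114,NA,123,116,NA,120,110,NA,116,119)

# Number of non-missing gestation lengths per season, used in the labels
n_obs <- tapply(!is.na(gest), season, sum)

# One column of jittered points per season; NAs are simply not drawn
stripchart(gest ~ season, method="jitter", vertical=TRUE, pch=16,
           col=c("blue", "red"),
           group.names=paste0("Season ", names(n_obs), " (n=", n_obs, ")"),
           ylab="Length (days)", main="Length of gestation")
```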

EDIT

If you want to do this with histograms you can do something like:

hist(data$gest[which(data$season==1)], main="Length of gestation", 
    xlab="Length (days)", col=rgb(0, 0, 1, 0.5))
# Note the add=TRUE parameter to superimpose the histograms
hist(data$gest[which(data$season==2)], col=rgb(1, 0, 0, 0.5), add=TRUE)
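One caveat with overlaying two hist() calls is that each call picks its own breaks and the y-axis is set by the first one only. A variant that shares the breaks and sizes the y-axis for both groups could look like this (a sketch on the example data; the number of bins passed to pretty() is an arbitrary choice):

```r
# Example data from the question, as a data frame
data <- data.frame(season=c(1,1,2,2,1,2,1,1,2,1),
                   gest=c(114,NA,123,116,NA,120,110,NA,116,119))
g1 <- data$gest[data$season == 1]
g2 <- data$gest[data$season == 2]

# Common breaks covering the whole range, so the bars line up
brks <- pretty(range(data$gest, na.rm=TRUE), n=10)

# Compute both histograms without drawing them (hist() drops NAs itself)
h1 <- hist(g1, breaks=brks, plot=FALSE)
h2 <- hist(g2, breaks=brks, plot=FALSE)

# Draw them on a y-axis tall enough for the larger of the two
plot(h1, col=rgb(0, 0, 1, 0.5), ylim=c(0, max(h1$counts, h2$counts)),
     main="Length of gestation", xlab="Length (days)")
plot(h2, col=rgb(1, 0, 0, 0.5), add=TRUE)
legend("topright", legend=c("Season 1", "Season 2"),
       fill=c(rgb(0, 0, 1, 0.5), rgb(1, 0, 0, 0.5)))
```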
nico
Hi nico, thanks for the great help. I only have ~1000 animals and, like you said, density plots are probably not the way to go. I would like to try histograms but also need help with the code.
Bazon
@Bazon: I guess 1000 animals is not that bad for a density plot. Anyway, I updated my answer to cover histograms.
nico
Thanks, nico! That really helps.
Bazon
+2  A: 
library(ggplot2)
df <- data.frame(id=id,season=season,gest=gest)
# season is numeric, so convert it to a factor to get one density per season
qplot(gest,data=df,geom="density",fill=factor(season),alpha=I(0.2))

This should give something similar to that example, but you may want to play with the alpha parameter to get the transparency right.
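Note that qplot() has been deprecated in more recent ggplot2 releases (since version 3.4.0). A roughly equivalent sketch with ggplot() is shown below; the legend title and axis label are my own additions rather than part of the original example.

```r
library(ggplot2)

# Example data from the question
df <- data.frame(id=1:10,
                 season=c(1,1,2,2,1,2,1,1,2,1),
                 gest=c(114,NA,123,116,NA,120,110,NA,116,119))

# factor(season) gives one filled density per season;
# na.rm=TRUE drops the missing gestation lengths without a warning
p <- ggplot(df, aes(x=gest, fill=factor(season))) +
  geom_density(alpha=0.2, na.rm=TRUE) +
  labs(x="Length (days)", fill="Season")
print(p)
```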

James