I have a stacked area plot made with ggplot2:

dists.med.areaplot<-qplot(starttime,value,fill=dists,facets=~groupname,
    geom='area',data=MDist.median, stat='identity') + 
    labs(y='median distances', x='time(s)', fill='Distance Types')+
    opts(title=subt) + 
    scale_fill_brewer(type='seq') +
    facet_wrap(~groupname, ncol=2) + grect #grect adds the grey/white vertical bars

It looks like this: [image: stacked area graph]

I want to overlay the profile of the control graph (bottom right) on every graph in the output (groupname=='rowH' is the control).

So far my best efforts have yielded this:

cline<-geom_line(aes(x=starttime,y=value), 
  data=subset(dists.med,groupname=='rowH'),colour='red')

dists.med.areaplot + cline

[image: problem graph]

I need the three red lines to be a single red line that skims the top of the dark blue section, and I need that identical line (the rowH line) to overlay each of the panels.

The dataframe looks like this:

> str(MDist.median)
'data.frame':   2880 obs. of  6 variables:
 $ groupname: Factor w/ 8 levels "rowA","rowB",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ fCycle   : Factor w/ 6 levels "predark","Cycle 1",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ fPhase   : Factor w/ 2 levels "Light","Dark": 2 2 2 2 2 2 2 2 2 2 ...
 $ starttime: num  0.3 60 120 180 240 300 360 420 480 540 ...
 $ dists    : Factor w/ 3 levels "inadist","smldist",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ value    : num  110 117 115 113 114 ...

The red line should be the sum of value at each starttime, for the rows where groupname=='rowH'. I have tried creating cline in the following ways; each results in an error or incorrect output:

#sums the entire y for all points and makes horizontal line
cline<-geom_line(aes(x=starttime,y=sum(value)),data=subset(dists.med,groupname=='rowH'),colour='red') 

#using related dataset with pre-summed y's
> cline<-geom_line(aes(x=starttime,y=tot_dist),data=subset(t.med,groupname=='rowH'))
> dists.med.areaplot + cline
Error in eval(expr, envir, enclos) : object 'dists' not found

Thoughts?

ETA:

It appears that the "object 'dists' not found" error comes from the fact that the initial plot, dists.med.areaplot, was created with qplot(), so the fill=dists mapping applies to every layer added afterwards, including the line. To avoid this, I stopped building on qplot() and made the whole plot with ggplot(). This is the code for the working plot:

cline.data <- subset(
        ddply(MDist.median, .(starttime, groupname), summarize, value = sum(value)),
        groupname == "rowH") 
cline<-geom_line(data=transform(cline.data,groupname=NULL), colour='red') 

dists.med.areaplot<-ggplot(MDist.median, aes(starttime, value)) +
  grect + nogrid +
  geom_area(aes(fill=dists),stat='identity') + 
  scale_fill_brewer(type='seq') +
  facet_wrap(~groupname, ncol=2) + 
  cline

resulting in this set of graphs: [image]
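
For anyone who still wants fill mapped at the plot level, here is a minimal sketch of another possible route, assuming a ggplot2 version whose layers accept the `inherit.aes` argument. The data frame `toy`, its values, and the level name "third" are made up for illustration; only the column names follow MDist.median. The line layer declares its own x and y and does not inherit the plot-level fill, so it should not need a dists column in its data.

```r
# Toy stand-in for MDist.median: invented values, same column names.
# "third" is a placeholder level, not the real third distance type.
toy <- expand.grid(
  starttime = c(0, 60, 120),
  dists     = c("inadist", "smldist", "third"),
  groupname = c("rowA", "rowH")
)
toy$value <- 1

# Control profile: total value per starttime for rowH only.
# The result has no groupname column, so it is not tied to one facet.
cline.data <- aggregate(value ~ starttime,
                        data = subset(toy, groupname == "rowH"),
                        FUN = sum)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # fill is mapped for the whole plot, as qplot(..., fill = dists) would do.
  p <- ggplot(toy, aes(starttime, value, fill = dists)) +
    geom_area() +
    facet_wrap(~groupname, ncol = 2) +
    # inherit.aes = FALSE: this layer uses only the aesthetics given here.
    geom_line(aes(starttime, value), data = cline.data,
              inherit.aes = FALSE, colour = "red")
  # print(p) draws the plot.
}
```

Mapping fill only inside geom_area(), as in the working code above, has the same effect; `inherit.aes = FALSE` is just the option for when the plot-level mapping has to stay.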

+3  A: 

This Learning R blog post should be of some help:

http://learnr.wordpress.com/2009/12/03/ggplot2-overplotting-in-a-faceted-scatterplot/

It might be worth computing the summary outside of ggplot with plyr.

cline.data <- ddply(MDist.median, .(starttime, groupname), summarize, value = sum(value))
cline.data.subset <- subset(cline.data, groupname == "rowH")   

Then add it to the plot with:

last_plot() + geom_line(data = transform(cline.data.subset, groupname = NULL), color = "red")
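
If plyr is not available, the same summary can probably be built with base R's aggregate(). This is only a sketch on invented data: `toy`, its values, and the level name "third" are hypothetical stand-ins, and `overlay` is a name introduced here for the facet-free control data.

```r
# Toy stand-in for MDist.median: invented values, same column names.
# "third" is a placeholder level, not the real third distance type.
toy <- expand.grid(
  starttime = c(0, 60, 120),
  dists     = c("inadist", "smldist", "third"),
  groupname = c("rowA", "rowH")
)
toy$value <- seq_len(nrow(toy))

# Sum value over the distance types for each starttime within each group.
cline.data <- aggregate(value ~ starttime + groupname, data = toy, FUN = sum)

# Keep only the control group.
cline.data.subset <- subset(cline.data, groupname == "rowH")

# Drop the faceting variable so the line is drawn in every panel
# (same intent as transform(cline.data.subset, groupname = NULL)).
overlay <- cline.data.subset[c("starttime", "value")]
```

Because `overlay` has no groupname column, a layer such as geom_line(data = overlay, colour = "red") is not assigned to a single facet and is repeated in each one.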
JoFrhwld
I don't think you want to remove the `groupname` variable.
hadley
If you remove `groupname`, won't that then plot the line over all the facets?
JoFrhwld
Hmm, maybe I misunderstood the question.
hadley
@hadley: I do want to put that rowH line on every single facet. The idea is to let the user easily see how the treatments differ from the control.
dnagirl
@JoFrhwld: I like the way you've made cline.data.subset. It's very clean. Unfortunately, when I plot it the way you suggest, I get the error: `Error in eval(expr, envir, enclos) : object 'dists' not found.` dists is the column in the main dataset that contains the names of the variables and sets the fill. For some reason `geom_line()` is paying attention to it, but I don't know why.
dnagirl
`geom_line()` is just inheriting the mapping of `fill` to `dists`, even though it won't be using `fill`. So you need to set `fill` to either `NA` or `NULL` (I'm not sure which) in `geom_line()`, like this: `geom_line(data = transform(cline.data.subset, groupname = NULL), color = "red", fill = NA)`
JoFrhwld
@JoFrhwld: I ended up making my initial plot (before the control lines) entirely with `ggplot()` rather than building on `qplot()`. I've added the working code to my question. I wouldn't have gotten there without your help. Tx!
dnagirl