The cast() function is great at calculating margins for aggregate values:

cast(df, IDx1+IDx2~IDy1, margins=c('IDx1','IDx2','grand_row'),c(min, mean, max))

The problem is that I need to weight my means using a second vector and a custom function.

Of course, ddply() lets me apply custom aggregation functions to my grouped records:

ddply(d, IDx1+IDx2~IDy1 , function(x) 
c(
min(x$value),
MyFancyWeightedHarmonicMeanFunction(x$value,x$weight),
max(x$value)
)
)

...and this is awesome.

But what would really save the day is the ability to do both things at once, whether by calling the two-vector function in cast() or by somehow faking the margins argument in ddply().

Is this possible?

+1  A: 

It's pretty easy to compute the margins yourself:

ddply(d, "IDx1", ...)
ddply(d, c("IDx1", "IDx2"), ...)
ddply(d, "IDy1", ...)

and then combine the results together with rbind. It wouldn't be too hard to wrap this up into a general function.
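For what it's worth, here is a rough sketch of what such a wrapper might look like. Everything beyond the column names is an assumption: `weighted_hmean` is a hypothetical stand-in for MyFancyWeightedHarmonicMeanFunction, `summarise_level` is a made-up helper name, the toy data is invented, and the "(all)" label for collapsed columns is an arbitrary choice. Padding the collapsed ID columns is what lets rbind line the pieces up.

```r
library(plyr)

# Hypothetical stand-in for the asker's two-vector function:
# weighted harmonic mean, sum(w) / sum(w / x).
weighted_hmean <- function(x, w) sum(w) / sum(w / x)

# Invented toy data using the column names from the question.
d <- data.frame(
  IDx1   = c("a", "a", "b", "b"),
  IDx2   = c("u", "v", "u", "v"),
  IDy1   = c("p", "p", "p", "p"),
  value  = c(1, 2, 4, 4),
  weight = c(1, 1, 1, 3),
  stringsAsFactors = FALSE
)

# Summarise at one grouping level, then fill the ID columns that were
# collapsed with "(all)" so every piece has identical columns.
summarise_level <- function(data, by, all_ids = c("IDx1", "IDx2", "IDy1")) {
  out <- ddply(data, by, function(x) data.frame(
    min     = min(x$value),
    wt.mean = weighted_hmean(x$value, x$weight),
    max     = max(x$value)
  ))
  for (id in setdiff(all_ids, by)) out[[id]] <- "(all)"
  out[, c(all_ids, "min", "wt.mean", "max")]
}

# One ddply() call per margin, stacked with rbind().
margins <- rbind(
  summarise_level(d, "IDx1"),
  summarise_level(d, c("IDx1", "IDx2")),
  summarise_level(d, "IDy1")
)
```

Each row of `margins` then carries "(all)" in whichever ID columns it was aggregated over, so the margin rows can be told apart from the detail rows.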

Also, I'd rewrite your original code as:

ddply(d, IDx1+IDx2~IDy1, summarise, 
  min = min(value),
  wt.mean = MyFancyWeightedHarmonicMeanFunction(value, weight),
  max = max(value)
)
hadley
Hadley, thanks for the tip. The separate ddply operations, rbinded together, are exactly what I've done. Still getting my head around both summarise and transform.
MW Frost