Hello.

I'm trying to write a function to do some often-repeated analysis. One part of this is counting the number of groups and the number of members within each group, so ddply to the rescue! However, my code has a problem.

Here is some example data

> dput(BGBottles)
structure(list(Machine = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 
3L, 3L, 3L, 4L, 4L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"), 
    weight = c(14.23, 14.96, 14.85, 16.46, 16.74, 15.94, 14.98, 
    14.88, 14.87, 15.94, 16.07, 14.91)), .Names = c("Machine", 
"weight"), row.names = c(NA, -12L), class = "data.frame")

and here is my code

foo<-function(exp1, exp2, data) {
 datadesc<-ddply(data, .(with(data, get(exp2))), nrow)
 return(datadesc)
}

If I run this function, I get an error

> foo(exp="Machine",exp1="weight",data=BGBottles)
Error in eval(substitute(expr), data, enclos = parent.frame()) : 
  invalid 'envir' argument

However, if I define my exp1, exp2 and data variables in the global environment first, it works

> exp1<-"weight"
> exp2<-"Machine"
> data<-BGBottles
> foo(exp="Machine",exp1="weight",data=BGBottles)
  with.data..get.exp2.. V1
1                     1  3
2                     2  3
3                     3  3
4                     4  3

So I assume ddply is running outside the environment of the function? Is there a way to stop this, or am I doing something wrong?

Thanks

Paul.

+2  A: 

You don't need get. ddply accepts the grouping column name as a string, so you can pass exp2 directly:

foo<-function(exp1, exp2, data) {
    datadesc<-ddply(data, exp2, nrow)
    return(datadesc)
}
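
For reference, here is a minimal, self-contained sketch of that approach run against the example data from the question. The name count_groups and its argument group_col are made up for illustration, and the base R table() call is only there as a cross-check; it is not part of the original answer.

```r
library(plyr)

# Example data rebuilt from the dput() output in the question
BGBottles <- data.frame(
  Machine = factor(rep(1:4, each = 3)),
  weight  = c(14.23, 14.96, 14.85, 16.46, 16.74, 15.94,
              14.98, 14.88, 14.87, 15.94, 16.07, 14.91)
)

# count_groups is a hypothetical helper name; group_col is a column
# name supplied as a character string (e.g. "Machine")
count_groups <- function(data, group_col) {
  # ddply takes a character vector of column names as the grouping
  # specification, so no get() or with() is needed here
  ddply(data, group_col, nrow)
}

counts <- count_groups(BGBottles, "Machine")
print(counts)

# Base R cross-check: table() on the column looked up by name
base_counts <- table(BGBottles[[ "Machine" ]])
print(base_counts)
```

Because the column name is passed as a plain string, nothing has to be looked up in the calling environment, so the function should behave the same whether or not exp2 or data also exist as global variables.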
Marek
+2  A: 

This is an example of this bug: http://github.com/hadley/plyr/issues#issue/3. But as Marek points out, you don't need get here anyway.

hadley