Suppose I have a data frame with columns c1, ..., cn, and a function f that takes in the columns of this data frame as arguments. How can I apply f to each row of the data frame to get a new data frame?

For example,

x = data.frame(letter=c('a','b','c'), number=c(1,2,3))
# x is
# letter | number
#      a | 1
#      b | 2
#      c | 3

f = function(letter, number) { paste(letter, number, sep='') }

# desired output is
# a1
# b2
# c3

How do I do this? I'm guessing it's something along the lines of {s,l,t}apply(x, f), but I can't figure it out.

+3  A: 
paste(x$letter, x$number, sep = "")
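This works because paste() is already vectorized, so one call covers every row. If f were not vectorized, a minimal base R sketch could use mapply, assuming f takes one argument per column and the argument names match the column names. The output column name label is a placeholder, not something from the question.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# Call f once per row; each column of x is passed as the argument
# with the same name (letter, number).
res <- do.call(mapply, c(list(FUN = f, USE.NAMES = FALSE), x))

# Wrap the result in a data frame; "label" is a hypothetical column name.
out <- data.frame(label = res, stringsAsFactors = FALSE)
print(out)
```

For just two columns, mapply(f, x$letter, x$number) does the same thing; the do.call form is only there so it scales to c1, ..., cn without listing each column.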
Greg
This is the way I would have done it! It's almost like we've been mentored by the same R master ;-)
Vince
A: 

I think you were thinking of something like this, but note that the functions in the apply family do not return data frames. apply() in particular coerces your data.frame to a matrix before applying the function.

apply(x,1,function(x) paste(x,collapse=""))
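To make the coercion visible, here is a small sketch (assuming the x and f from the question). Because the columns have mixed types, the matrix that apply() works on is character, so number reaches the function as a string rather than a numeric value.

```r
x <- data.frame(letter = c("a", "b", "c"), number = c(1, 2, 3),
                stringsAsFactors = FALSE)
f <- function(letter, number) paste(letter, number, sep = "")

# This is the conversion apply() performs internally on a data frame.
m <- as.matrix(x)
print(is.character(m))

# Each row arrives as a character vector named by column, so the
# arguments of f can be picked out by name.
res <- apply(x, 1, function(row) f(row[["letter"]], row[["number"]]))
print(res)
```

That is harmless for paste(), but a function that does arithmetic on number would need an explicit as.numeric() inside the wrapper.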

So you may be more interested in ddply from the plyr package.

> x$row <- 1:NROW(x)
> ddply(x, "row", function(df) paste(df[[1]],df[[2]],sep=""))
  row V1
1   1 a1
2   2 b2
3   3 c3
Joshua Ulrich
+3  A: 

As @greg points out, paste() can do this, but I suspect your example is a simplification of a more general problem. After struggling with this in the past, as illustrated in this previous question, I ended up using the plyr package for this type of thing. plyr does a LOT more, but for these things it's easy:

> require(plyr)
> adply(x, 1, function(x) f(x$letter, x$number))
  X1 V1
1  1 a1
2  2 b2
3  3 c3

You'll want to rename the output columns, I'm sure.
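A renaming sketch in base R, using a hand-built stand-in for the result printed above so that it runs without plyr; the new name label is a placeholder:

```r
# Stand-in for the adply() result shown above (built by hand, not by plyr).
out <- data.frame(X1 = 1:3, V1 = c("a1", "b2", "c3"),
                  stringsAsFactors = FALSE)

# Rename only the generated column; "label" is a hypothetical name.
names(out)[names(out) == "V1"] <- "label"
print(out)
```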

While I was typing this, @joshua showed an alternative method using ddply. The difference in my example is that adply treats the input data frame as an array, so it does not need the "group by" variable row that @joshua created. His approach is exactly what I was doing until Hadley tipped me off to the adply() approach in the aforementioned question.

JD Long
Thanks for pointing me to `adply` (via Hadley). ;-)
Joshua Ulrich
You could simplify this with `transform` or `summarize`: `adply(x, 1, summarize, paste(letter, number, sep = ""))`
JoFrhwld
Awesome, thanks! Yep, my example was just a toy example. I looked at plyr+reshape a while ago, and didn't understand it =(, but I'll definitely have to take a look again.
Brett
@JoFrhwld, you are exactly right about simplifying. The example Hadley gave me does exactly that. I didn't want to simplify too much, however, as I wanted a general answer that could be applied to other things.
JD Long