Is there a better way to merge two data.frames when the key of one of them is stored in its row names, i.e. to treat the row names as if they were a column and merge on that column with another data.frame? I know I could do the following:

 df1$rn <- rownames(df1)
 all <- merge(df1,df2, by.x="rn", by.y="some_column")

But this produces redundant data (the row names duplicated as a column), which is not needed at all. So what's the smarter way to do it?

+2  A: 

In merge, you can pass "row.names" or 0 as by.x or by.y to refer to the row names.

An example using the authors and books from merge help:

rownames(authors) <- authors$surname
merge(authors, books, by.x = "row.names", by.y = "name")
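
As a self-contained sketch of the same idea (the toy data frames and their column names below are made up for illustration, they are not the asker's data), both spellings should give the same result, and the row names come back as a key column called Row.names:

```r
# Hypothetical toy data: df1 is keyed by its row names, df2 by a column.
df1 <- data.frame(x = 1:3, row.names = c("a", "b", "c"))
df2 <- data.frame(some_column = c("c", "a", "b"), y = c(30, 10, 20))

# by.x = "row.names" joins on df1's row names without first copying
# them into a column of df1.
merged_by_name <- merge(df1, df2, by.x = "row.names", by.y = "some_column")

# by.x = 0 is the numeric spelling of the same thing.
merged_by_zero <- merge(df1, df2, by.x = 0, by.y = "some_column")

# The key shows up as the first column, named "Row.names".
print(merged_by_name)
```

Note that the merged result still carries the key as a column; what you save is having to add it to df1 yourself beforehand.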
VitoshKa
+1  A: 

"A smarter way" really depends on your data, which we don't have. But:

df1 <- data.frame(
    X1 = 1:10,
    id = letters[1:10]
)

df2 <- data.frame(
    X2 = 10:1,
    X3 = letters[11:20]
)
rownames(df2) <- df1$id
df2 <- df2[sample.int(10),]

cbind(df1,df2[match(df1$id,rownames(df2)),])
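
To spell out what the line above does, here is a rough sketch that splits it into steps and compares it with merge. The seed and the names idx, via_match and via_merge are my own additions for illustration. It assumes every id occurs exactly once in the row names of df2: match only returns the first hit and gives NA for ids it cannot find, so this is not a general replacement for merge.

```r
set.seed(1)  # assumption: seed added only so the shuffle is reproducible

df1 <- data.frame(X1 = 1:10, id = letters[1:10])
df2 <- data.frame(X2 = 10:1, X3 = letters[11:20], row.names = letters[1:10])
df2 <- df2[sample.int(10), ]  # shuffle the rows so the order no longer lines up

# For each id in df1, find the position of the row in df2 with that row name.
idx <- match(df1$id, rownames(df2))

# Reorder df2 to line up with df1, then glue the columns together.
via_match <- cbind(df1, df2[idx, ])

# The merge equivalent, keyed on df2's row names.
via_merge <- merge(df1, df2, by.x = "id", by.y = "row.names")

# Both approaches pair each id with the same X2 and X3 values.
stopifnot(all(via_match$X2 == via_merge$X2))
stopifnot(all(as.character(via_match$X3) == as.character(via_merge$X3)))
```

The column order differs (merge puts the key column first), and merge sorts by the key, which happens to coincide with the order of df1 here.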

Edit: VitoshKa's answer is the one you're looking for. If I had bothered to look at the help file ?merge, I would have known that as well...

I leave my solution here just in case somebody needs a speedy alternative to merge:

> system.time(replicate(1000,cbind(df1,df2[match(df1$id,rownames(df2)),])))
   user  system elapsed 
   0.57    0.00    0.57 
> system.time(replicate(1000,merge(df1,df2,by.x="id",by.y="row.names")))
   user  system elapsed 
   2.36    0.02    2.37 
Joris Meys
Shame on me: I looked at the help file, but impatiently only at the examples. Sorry for that, and thanks for the help! It's really interesting to see the difference. Maybe assuming that there was ONE smart way to do it was a bigger mistake than not reading the help.
ran2
@ran2: As a general remark: once you've been in Perl-land, you know there's always more than one way to do it. And once you're back in R-land, you'll soon realize there are also ways one could do it but shouldn't.
Joris Meys