Is it possible to row bind two data frames that don't have the same set of columns? I am hoping to retain the columns that do not match after the bind.

I am new to R but figure that there has to be a fairly quick way to do this.

Many thanks,

Brock

+2  A: 

No, not with base R's rbind() alone.

rbind() and cbind() require matching dimensions along the side chosen to combine by; for data frames, rbind() also requires the column names to match.

Dirk Eddelbuettel
+4  A: 

You can use smartbind() from the gtools package.

Example:

library(gtools)
df1 <- data.frame(a = c(1:5), b = c(6:10))
df2 <- data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
smartbind(df1, df2)
# result
     a  b    c
1.1  1  6 <NA>
1.2  2  7 <NA>
1.3  3  8 <NA>
1.4  4  9 <NA>
1.5  5 10 <NA>
2.1 11 16    A
2.2 12 17    B
2.3 13 18    C
2.4 14 19    D
2.5 15 20    E
neilfws
A: 

No, I don't think so. rbind requires the same number of columns in each matrix or data frame. From the help page:

 If there are several matrix arguments, they must all have the same
 number of columns (or rows) and this will be the number of columns
 (or rows) of the result.  If all the arguments are vectors, the
 number of columns (rows) in the result is equal to the length of
 the longest vector.  Values in shorter arguments are recycled to
 achieve this length (with a ‘warning’ if they are recycled only
 _fractionally_).

And proof:

> a <- rnorm(3)
> b <- rnorm(3)
> c <- rnorm(3)
> df.one <- data.frame(a, b, c)
> df.two <- data.frame(a, b)
> rbind(df.one, df.two)
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match

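But you can do a hackish recycling of the smaller data frame's own columns to pad it out, then rename the columns (note that the padded column c below is just a copy of a, not real data):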

> df.three <- cbind(df.two, df.two[, 1:(ncol(df.one) - ncol(df.two))])
> colnames(df.three) <- colnames(df.one)
> rbind(df.one, df.three)
             a          b            c
1 -2.499236596 0.08539973  0.070122711
2 -1.304782366 0.44049636 -0.848588975
3  0.005446522 0.36805686 -0.251105213
4 -2.499236596 0.08539973 -2.499236596
5 -1.304782366 0.44049636 -1.304782366
6  0.005446522 0.36805686  0.005446522

Darn it! Beaten to it by Dirk! Maybe this recycling will help anyway...

richardh
A: 

You could also just pull out the common column names and bind on those (this drops the columns that don't match):

> cols <- intersect(colnames(df1), colnames(df2))
> rbind(df1[,cols], df2[,cols])
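
If the goal is to keep the non-matching columns, the same idea can be turned around with union() instead of intersect(). The following is only a sketch in base R: pad_columns is a hypothetical helper name made up for this example (it is not part of base R), and df1 and df2 are assumed to be the data frames from the smartbind() example above.

```r
df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# Hypothetical helper: add every column named in `cols` that `df` lacks,
# filled with NA, and return the columns in the order given by `cols`.
pad_columns <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- NA
  }
  df[, cols, drop = FALSE]
}

# Union of the column names from both data frames: "a", "b", "c"
all_cols <- union(names(df1), names(df2))

# Both data frames now have identical columns, so base rbind() accepts them
combined <- rbind(pad_columns(df1, all_cols), pad_columns(df2, all_cols))
```

The rows that came from df1 end up with NA in column c, which is the same shape of result that smartbind() gives.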
Jonathan Chang
+6  A: 

rbind.fill from the package plyr might be what you are looking for.
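
A minimal sketch of how that might look, assuming plyr is installed and reusing the example data frames from the smartbind() answer (the data here is illustrative, not from the original question):

```r
library(plyr)

df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# rbind.fill() binds by column name and fills columns that are
# missing from an input with NA
combined <- rbind.fill(df1, df2)
```

Here the five rows from df1 should get NA in column c, since df1 has no such column.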

Jyotirmoy Bhattacharya
A: 

If the columns in df1 are a subset of those in df2 (by column names):

df3 <- rbind(df1, df2[, names(df1)])
Aaron Statham
A: 

Maybe I completely misread your question, but the "I am hoping to retain the columns that do not match after the bind" makes me think you are looking for a left join or right join similar to an SQL query. R has the merge function that lets you specify left, right, or inner joins similar to joining tables in SQL.

There is already a great question and answer on this topic here: http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right
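
For a quick illustration of the join types mentioned above, here is a small sketch using merge() from base R. The customers and orders data frames and their id key column are made up for this example.

```r
# Illustrative data: ids 2 and 3 appear in both tables
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- data.frame(id = c(2, 3, 4), total = c(10, 20, 30))

# Inner join: only ids present in both data frames
inner <- merge(customers, orders, by = "id")

# Left join: every row of customers, NA where orders has no match
left <- merge(customers, orders, by = "id", all.x = TRUE)

# Right join: every row of orders, NA where customers has no match
right <- merge(customers, orders, by = "id", all.y = TRUE)

# Full outer join: every id from either data frame
outer <- merge(customers, orders, by = "id", all = TRUE)
```

Note that merge() matches rows on a key and places the two sets of columns side by side, which is a different operation from stacking rows with rbind().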

Chase