Is it possible to row bind two data frames that don't have the same set of columns? I am hoping to retain the columns that do not match after the bind.
I am new to R but figure that there has to be a fairly quick way to do this.
Many thanks,
Brock
No, not with base rbind() directly.
rbind() and cbind() require matching dimensions along the side chosen to combine by.
You can use smartbind() from the gtools package.
Example:
library(gtools)
df1 <- data.frame(a = c(1:5), b = c(6:10))
df2 <- data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
smartbind(df1, df2)
# result
a b c
1.1 1 6 <NA>
1.2 2 7 <NA>
1.3 3 8 <NA>
1.4 4 9 <NA>
1.5 5 10 <NA>
2.1 11 16 A
2.2 12 17 B
2.3 13 18 C
2.4 14 19 D
2.5 15 20 E
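For readers who would rather not add a dependency, the same idea can be sketched in base R. This is only an illustration of the padding approach, not how smartbind() is implemented; the helper name rbind_pad is hypothetical, and it assumes that columns sharing a name have compatible types.

```r
# Hypothetical helper (not part of gtools): add an NA column for every
# name a data frame lacks, then row bind with base rbind().
rbind_pad <- function(x, y) {
  for (nm in setdiff(names(y), names(x))) x[[nm]] <- NA
  for (nm in setdiff(names(x), names(y))) y[[nm]] <- NA
  rbind(x, y)  # rbind() on data frames matches columns by name
}

df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])
combined <- rbind_pad(df1, df2)
```

Here combined has ten rows, and column c is NA for the rows that came from df1.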
No, I don't think so. rbind requires the same number of columns in each matrix or data frame. From the help page:
If there are several matrix arguments, they must all have the same
number of columns (or rows) and this will be the number of columns
(or rows) of the result. If all the arguments are vectors, the
number of columns (rows) in the result is equal to the length of
the longest vector. Values in shorter arguments are recycled to
achieve this length (with a ‘warning’ if they are recycled only
_fractionally_).
And proof:
> a <- rnorm(3)
> b <- rnorm(3)
> c <- rnorm(3)
> df.one <- data.frame(a, b, c)
> df.two <- data.frame(a, b)
> rbind(df.one, df.two)
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
But you can do a hackish recycling of the smaller data frame and rename the columns. Note that this fills the missing column c with a copy of column a rather than NA:
> df.three <- cbind(df.two, df.two[, 1:(ncol(df.one) - ncol(df.two))])
> colnames(df.three) <- colnames(df.one)
> rbind(df.one, df.three)
a b c
1 -2.499236596 0.08539973 0.070122711
2 -1.304782366 0.44049636 -0.848588975
3 0.005446522 0.36805686 -0.251105213
4 -2.499236596 0.08539973 -2.499236596
5 -1.304782366 0.44049636 -1.304782366
6 0.005446522 0.36805686 0.005446522
Darn it! Beat by Dirk! Maybe this recycling will help anyways...
You could also just pull out the common column names. This keeps only the columns the two data frames share, so the non-matching columns are dropped.
> cols <- intersect(colnames(df1), colnames(df2))
> rbind(df1[,cols], df2[,cols])
rbind.fill from the package plyr might be what you are looking for.
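A minimal sketch of what that might look like, assuming plyr is installed and reusing the df1 and df2 data from the smartbind() answer:

```r
library(plyr)  # assumes plyr is installed: install.packages("plyr")

df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# Columns missing from one of the inputs are filled with NA
combined <- rbind.fill(df1, df2)
```

Rows from df1 should come out with NA in column c, which is the "retain the columns that do not match" behaviour asked for.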
If the columns in df1 are a subset of those in df2 (by column names):
df3 <- rbind(df1, df2[, names(df1)])
Maybe I completely misread your question, but the "I am hoping to retain the columns that do not match after the bind" makes me think you are looking for a left join or right join similar to an SQL query. R has the merge function that lets you specify left, right, or inner joins similar to joining tables in SQL.
There is already a great question and answer on this topic here: http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right
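In case a join really is what you are after, here is a rough sketch of the merge() variants. The data frames left and right and the key column id are made up for illustration; the all, all.x and all.y arguments are the ones documented in ?merge.

```r
# Hypothetical data: both data frames share the key column "id"
left  <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
right <- data.frame(id = c(2, 3, 4), y = c(TRUE, FALSE, TRUE))

inner_join <- merge(left, right, by = "id")                # keys in both
left_join  <- merge(left, right, by = "id", all.x = TRUE)  # all rows of left
right_join <- merge(left, right, by = "id", all.y = TRUE)  # all rows of right
outer_join <- merge(left, right, by = "id", all = TRUE)    # all rows of either
```

Unlike a row bind, merge() matches rows on the key and puts the non-shared columns side by side, filling unmatched cells with NA.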