merge is a very nice function: it merges matrices and data.frames and returns a data.frame.

I have rather large character matrices. Is there another good way to merge them without converting to a data.frame?


Comment 1: A small function to merge a named vector with a matrix or data.frame. Elements of the vector can link to multiple entries in the matrix:

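# For each element of the named vector v, bind its name to every row of m
# whose by.m column equals that element.
expand <- function(v, m, by.m, v.name = 'v', ...) {
  df <- do.call(rbind, lapply(names(v), function(x) {
    pos <- which(m[, by.m] %in% v[x])
    # drop = FALSE keeps a single matching row as a one-row matrix
    cbind(x, m[pos, , drop = FALSE], ...)
  }))
  colnames(df)[1] <- v.name
  df
}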

Example:

v <- rep(letters,each=3)[seq_along(letters)]
names(v) <- letters
m <- data.frame(a=unique(v),b=seq_along(unique(v)),stringsAsFactors=FALSE)
expand(v,m,'a')
+1  A: 

No, not without either (a) overriding the merge function or (b) creating a new merge.matrix() S3 method (the latter would be the right approach to the problem).

You can see in the merge help:

Value

A data frame.

Also, the merge.default function:

> merge.default
function (x, y, ...) 
merge(as.data.frame(x), as.data.frame(y), ...)
Shane
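To make option (b) concrete, here is a minimal sketch of what such a method might look like. It is not part of base R: the name merge.matrix, the integer arguments by.x and by.y, and the inner-join behaviour are all assumptions made for illustration. It joins on a single key column and never converts to a data.frame.

```r
# Hypothetical S3 method (not in base R): inner join of two matrices on one
# key column each. by.x and by.y are integer column positions.
merge.matrix <- function(x, y, by.x = 1, by.y = 1, ...) {
  kx <- x[, by.x]
  ky <- y[, by.y]
  # For each row of x, the indices of all matching rows of y (possibly none).
  hits <- lapply(kx, function(k) which(ky == k))
  ix <- rep(seq_along(kx), lengths(hits))  # repeat x rows once per match
  iy <- unlist(hits)
  # Drop y's key column so the key appears only once in the result.
  cbind(x[ix, , drop = FALSE], y[iy, -by.y, drop = FALSE])
}

# Illustrative data: "a" matches twice, "b" once, "c" not at all.
m1 <- cbind(id = c("a", "b", "c"), x = c("1", "2", "3"))
m2 <- cbind(id = c("b", "a", "a"), y = c("u", "v", "w"))
res <- merge(m1, m2)   # dispatches to merge.matrix for matrix arguments
```

Because merge() is generic, defining the method is enough for merge(m1, m2) to pick it up; data.frame arguments still go to merge.data.frame.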
Yes, I like the idea of adding a method to the generic `merge()` function.
Stephen
+2  A: 

You can use a combination of match and cbind to do the equivalent of merge without conversion to data frame, a simple example:

st1 <- state.x77[ sample(1:50), ]
st2 <- as.matrix( USArrests )[ sample(1:50), ]

tmp1 <- match(rownames(st1), rownames(st2) )

st3 <- cbind( st1, st2[tmp1,] )
head(st3)

Keeping track of which columns you want, and merging with many-to-one relationships or missing rows in one group, requires a bit more thought but is still possible.

Greg Snow
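The missing-row case mentioned in the answer above can be sketched as follows. The two small matrices are made-up illustrative data. match() returns NA where a row has no partner, so the NA positions decide whether you get a left join or an inner join. Many-to-one lookups need nothing extra, since match() returns the first matching row for every repeated key.

```r
# Two small character matrices keyed by row name (illustrative data).
a <- matrix(c("1", "2", "3"), ncol = 1,
            dimnames = list(c("x", "y", "z"), "A"))
b <- matrix(c("p", "q"), ncol = 1,
            dimnames = list(c("z", "x"), "B"))

# NA where a row of `a` has no partner in `b`.
idx <- match(rownames(a), rownames(b))

# Left join: keep every row of `a`; unmatched rows get NA in the column from `b`.
left <- cbind(a, b[idx, , drop = FALSE])

# Inner join: keep only the rows present in both matrices.
ok <- !is.na(idx)
inner <- cbind(a[ok, , drop = FALSE], b[idx[ok], , drop = FALSE])
```

drop = FALSE matters here: without it, a one-column or one-row selection collapses to a plain vector and cbind() no longer sees a matrix.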
Yes, AFAIK this is probably the most straightforward way to implement a matrix merge. If there is more than one column to merge by, you can probably create a function that combines the common columns into a single key string and apply `match()` to that.
Stephen
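A rough sketch of the key-string idea from the comment above. The helper make_key is a hypothetical name, and the separator is assumed not to occur anywhere in the key columns; if it could, two different key combinations might paste to the same string.

```r
# Hypothetical helper: collapse the chosen key columns of each row into one
# string. Assumes `sep` does not occur in the data.
make_key <- function(m, cols, sep = "\r") {
  do.call(paste, c(lapply(cols, function(j) m[, j]), sep = sep))
}

# Illustrative character matrices sharing the key columns g and h.
x <- cbind(g = c("a", "a", "b"), h = c("1", "2", "1"), vx = c("p", "q", "r"))
y <- cbind(g = c("b", "a", "a"), h = c("1", "1", "2"), vy = c("R", "P", "Q"))

# Match on the combined key, then bind the non-key column of y onto x.
idx <- match(make_key(x, c("g", "h")), make_key(y, c("g", "h")))
merged <- cbind(x, y[idx, "vy", drop = FALSE])
```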
match() and %in% are the way to go. Thanks.