I'm trying to work out how to sort one data.frame by multiple columns of another. This question does this with vectors. Can someone suggest a way to do the equivalent with data.frames?

Here's some sample data.

x1 <- data.frame(a=1:5, b=letters[1:5], c=rnorm(5))
x2 <- data.frame(a=c(4,4,2), b=c("d", "d", "b"), d=rnorm(3))

So I want to sort x2 by the first two columns of x1. My actual data is much more complicated, but this replicates the idea...

+2  A: 

Attach a rank column to the relevant columns of x1:

len <- dim(x1)[1]
x1. <- cbind(x1[,1:2], rank=1:len)

Merge into x2 (this is like a SQL join; see the merge documentation for how to specify what happens if there are ambiguities such as multiple matches or no matches):

x2. <- merge(x2, x1.)

Sort:

x2.[order(x2.[,'rank']),]
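
Putting the three steps together, a minimal end-to-end sketch might look like the following. It assumes the key columns are named a and b as in the sample data. The names ref, merged and sorted are chosen here for illustration only, and set.seed is there just to make the random columns reproducible.

```r
set.seed(1)  # only for reproducible random columns
x1 <- data.frame(a = 1:5, b = letters[1:5], c = rnorm(5))
x2 <- data.frame(a = c(4, 4, 2), b = c("d", "d", "b"), d = rnorm(3))

# Step 1: keep the key columns of x1 and record each row's position.
ref <- cbind(x1[, c("a", "b")], rank = seq_len(nrow(x1)))

# Step 2: join on the key columns, named explicitly to avoid surprises.
# By default, rows of x2 without a match in ref are dropped;
# all.x = TRUE would keep them, with rank set to NA.
merged <- merge(x2, ref, by = c("a", "b"))

# Step 3: order by the position the keys had in x1.
sorted <- merged[order(merged$rank), ]
```

Naming the key columns in by is a design choice rather than a requirement: without it, merge joins on every column name the two data.frames share, which may be more than intended in real data.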
Jouni K. Seppänen
+2  A: 

It depends on what your data actually looks like. As it stands, you only need one column to sort on, and that is easily done with:

x2[order(match(x2[,1],x1[,1])),]
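
To see why this works, it can help to look at the intermediate values. The sketch below uses the sample data from the question, without the random columns since they do not affect the ordering; pos is just an illustrative name.

```r
x1 <- data.frame(a = 1:5, b = letters[1:5])
x2 <- data.frame(a = c(4, 4, 2), b = c("d", "d", "b"))

# For each value in x2's first column, the row of x1 where it first appears.
pos <- match(x2[, 1], x1[, 1])
pos         # 4 4 2

# order() turns those positions into a row permutation for x2.
order(pos)  # 3 1 2

# Reordered x2: the row with a = 2 comes first, then the two rows with a = 4.
x2[order(pos), ]
```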

If you need more than one column, this becomes a bit trickier. You will have to specify which column to sort on first and which second, e.g.:

x1 <- data.frame(a=rep(1:3,2), b=rep(letters[2:4],each=2), c=rnorm(6))
x2 <- data.frame(a=c(3,3,2), b=c("c", "d", "b"), d=rnorm(3))


x2[order(match(
  paste(x2[,1],x2[,2]),
  paste(x1[,1],x1[,2]))
),]

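This orders the rows of x2 by the position at which each combination of the first and second columns appears in x1. Keep in mind that every combination in x2 must also occur in x1: match() returns NA for a combination that does not, and order() places those rows last by default.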
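
One way to guard against missing combinations is to check the result of match() before reordering. The following is a sketch only: order_by_reference is a hypothetical helper name, not an existing function, and the separator "\r" rests on the assumption that this character never occurs in the key values.

```r
# Hypothetical helper: reorder `x` so its key combinations follow the
# row order in which they appear in `ref`.
order_by_reference <- function(x, ref, keys) {
  # Build one string key per row; sep is assumed not to occur in the data.
  key_x   <- do.call(paste, c(unname(as.list(x[keys])), sep = "\r"))
  key_ref <- do.call(paste, c(unname(as.list(ref[keys])), sep = "\r"))
  pos <- match(key_x, key_ref)
  if (anyNA(pos)) {
    stop("some key combinations in x are missing from ref")
  }
  x[order(pos), , drop = FALSE]
}

x1 <- data.frame(a = rep(1:3, 2), b = rep(letters[2:4], each = 2))
x2 <- data.frame(a = c(3, 3, 2), b = c("c", "d", "b"))
result <- order_by_reference(x2, x1, keys = c("a", "b"))
```

Stopping on a missing combination is a deliberate choice in this sketch; silently sorting unmatched rows to the end, as the plain order(match(...)) call does, may be preferable in other situations.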

Joris Meys