I want to sort a data frame by multiple columns in R. For example, with the data frame below, I would like to sort by column z (descending) and then by column b (ascending):

dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"), 
      levels = c("Low", "Med", "Hi"), ordered = TRUE),
      x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
      z = c(1, 1, 1, 2))
dd
    b x y z
1  Hi A 8 1
2 Med D 3 1
3  Hi A 9 1
4 Low C 9 2
+6  A: 

This is easy with the (very helpful) data frame sort function by Kevin Wright, posted in the tips section of the R wiki. It is not part of base R, so it has to be defined first.

> sort(dd,by = ~ -z + b)
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
Christopher DuBois
The link to this function is http://rwiki.sciviews.org/doku.php?id=tips:data-frames:sort
Marek
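The wiki function itself is not reproduced in this thread, so here is a minimal sketch of the same idea in base R. It is not Kevin Wright's code: the name sort_by_formula is hypothetical, and it only understands keys joined by "+" with an optional leading minus for descending order.

```r
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an illustration, not the function from the R wiki).
sort_by_formula <- function(df, by) {
  # Split the right-hand side of the formula on binary "+"
  split_terms <- function(e) {
    if (is.call(e) && identical(e[[1]], as.name("+")) && length(e) == 3) {
      c(split_terms(e[[2]]), list(e[[3]]))
    } else {
      list(e)
    }
  }
  rhs <- by[[length(by)]]
  keys <- lapply(split_terms(rhs), function(e) {
    # A leading unary minus means "descending"; xtfrm() turns factors and
    # strings into numbers so they can be negated as well
    if (is.call(e) && identical(e[[1]], as.name("-")) && length(e) == 2) {
      -xtfrm(eval(e[[2]], df, environment(by)))
    } else {
      xtfrm(eval(e, df, environment(by)))
    }
  })
  df[do.call(order, keys), , drop = FALSE]
}

res <- sort_by_formula(dd, ~ -z + b)
```

Here res should hold the rows of dd sorted by z descending and then by b ascending, following the factor's level order rather than alphabetical order.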
+12  A: 

You can use the order() function directly, without resorting to add-on tools. This simpler answer uses a trick taken right from the top of the example(order) code:

R> dd[with(dd, order(-z, b)), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
Dirk Eddelbuettel
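One caveat with the unary minus: it works for z because z is numeric, but it fails on a character column such as x. A minimal sketch of two base R workarounds follows, assuming we want x descending and then y ascending (a sort key chosen here for illustration, not the one in the question). The per-key decreasing vector needs method = "radix", available in R 3.3.0 and later.

```r
dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Option 1: xtfrm() maps any sortable vector to numbers, which can be negated
by_xtfrm <- dd[order(-xtfrm(dd$x), dd$y), ]

# Option 2: one decreasing flag per key (requires the radix method)
by_radix <- dd[order(dd$x, dd$y,
                     decreasing = c(TRUE, FALSE), method = "radix"), ]
```

Both variants should give the same row order for this data. Note that the radix method compares strings in the C locale, so the two can differ on data with mixed case or accented characters.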
+2  A: 

Alternatively, use the Deducer package:

library(Deducer)
dd <- sortData(dd, c("z", "b"), increasing = c(FALSE, TRUE))
Ian Fellows
+3  A: 

Or you can use the doBy package:

library(doBy)
dd <- orderBy(~ -z + b, data = dd)
gd047
Awesome package, I hadn't seen it before.
Ken Williams
+1  A: 

If SQL comes naturally to you, sqldf handles ORDER BY as Codd intended.

mjm
MJM, thanks for pointing out this package. It's incredibly flexible, and because half of my work already involves pulling data from SQL databases, it's easier than learning much of R's less-than-intuitive syntax.
Brandon Bertelsen
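The sqldf answer gives no code, so here is a rough sketch of what the query could look like, assuming the sqldf package is installed with its default SQLite backend. The helper column b_rank is hypothetical: it is added on the assumption that a factor reaches the database as plain text, in which case ORDER BY b would sort alphabetically instead of by level.

```r
library(sqldf)

dd <- data.frame(
  b = factor(c("Hi", "Med", "Hi", "Low"),
             levels = c("Low", "Med", "Hi"), ordered = TRUE),
  x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: integer codes that preserve the level order of b
dd$b_rank <- as.integer(dd$b)

# z descending, then b (via its integer rank) ascending
sorted <- sqldf("SELECT b, x, y, z FROM dd ORDER BY z DESC, b_rank ASC")
```

The result comes back as a new data frame with fresh row names, so the original row numbers are not preserved as they are with order().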
A: 

In my .Rprofile, I have a function that makes sorting like this significantly easier.

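esort <- function(x, sortvar, ...) {
  # Capture the sort expressions (e.g. -z, b) unevaluated and evaluate
  # them inside x, so the columns are found without attach()/detach()
  keys <- eval(substitute(list(sortvar, ...)), x, parent.frame())
  # Reorder the rows by all keys, first key first
  x[do.call(order, keys), ]
}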

For your example above, it would be nice and easy:

esort(dd, -z, b)
Brandon Bertelsen