ansaurus

Question

R - Sorting and Sub-setting Maximum Values within Columns

Answer 1

A:

One way is to use order together with ddply from the plyr package. The example below keeps the three rows with the largest val within each city:

> library(plyr)
> d<-data.frame(occu=rep(letters[1:5],2),city=rep(c('A','B'),each=5),val=1:10)
> ddply(d,.(city),function(x) x[order(x$val,decreasing=TRUE)[1:3],])

order can sort on multiple columns if you want that.
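As a rough sketch of the multi-column case, here is a base R version with no plyr. The toy data is the same shape as above, but the val column is made up here so that it contains ties; occu is then used as a tie-breaker. This is only an illustration of order with several keys, not necessarily what the asker's data looks like.

```r
# Toy data in the same shape as above; the val values are invented to create ties
d <- data.frame(occu = rep(letters[1:5], 2),
                city = rep(c("A", "B"), each = 5),
                val  = c(3, 3, 1, 2, 2, 5, 4, 5, 4, 1))

# Sort by city (ascending), then val (descending), then occu (ascending) to break ties
sorted <- d[order(d$city, -d$val, d$occu), ]

# Keep the first three rows of each city
top3 <- do.call(rbind, lapply(split(sorted, sorted$city), head, n = 3))
print(top3)
```

Negating val only works because it is numeric; for a character column you would need something like decreasing=TRUE on a separate order call or rank first.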

Jyotirmoy Bhattacharya 2010-07-23 06:56:12

Answer 2

A:

This will output the code with the maximum value for each city. Similar results can be obtained using sort or order.

# Generate some fake data
codes <- paste("Code", 1:100, sep="")
values <- matrix(0, ncol=20, nrow=100)
for (i in 1:20)
    values[,i] <- sample(0:100, 100, replace=TRUE)

df <- data.frame(codes, values)

names(df) <- c("Code", paste("City", 1:20, sep=""))

# Now for each city we get the maximum
maxval <- apply(df[2:21], 2, which.max)
# Output the max for each city
print(cbind(paste("City", 1:20), codes[maxval]))
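To illustrate the order variant, here is a minimal sketch that returns the codes of the n largest values per city instead of only the maximum. The helper name top_n_codes and the seed are assumptions made for this example; the fake data is generated the same way as above, just without the loop.

```r
set.seed(1)  # arbitrary seed, only so the fake data is reproducible
codes <- paste("Code", 1:100, sep = "")
values <- matrix(sample(0:100, 100 * 20, replace = TRUE), nrow = 100)
colnames(values) <- paste("City", 1:20, sep = "")

# Hypothetical helper: codes of the n largest values in x, largest first
top_n_codes <- function(x, n = 3) codes[order(x, decreasing = TRUE)[1:n]]

# One column per city, one row per rank (3 x 20 character matrix)
top3 <- apply(values, 2, top_n_codes)
print(top3[, 1:5])
```

The first row of top3 holds a code with the maximum value for each city, so it should agree with the which.max approach whenever the maximum is unique.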

nico 2010-07-23 07:06:57

Answer 3

A:

I'm not exactly sure what your desired output is from your example snippet. Here's how you could get a data frame like that for every city using plyr and reshape:

#using the same df from nico's answer
library(reshape)
df.m <- melt(df, id = 1)
# the id column in nico's df is named "Code" after the names(df) <- ... call
a.cities <- cast(df.m, Code ~ . | variable)

library(plyr)
a.cities.max <- aaply(a.cities, 1, function(x) arrange(x, desc(`(all)`))[1:4,])

Now, a.cities.max is an array of data frames, with the 4 largest values for each city in each data frame. To get one of these data frames, you can index it with

a.cities.max$X13

I don't know exactly what you'll be doing with this data, but you might want it back in data frame format.

df.cities.max <- adply(a.cities.max, 1)

JoFrhwld 2010-07-23 16:47:09

I think that's it!

AzadA 2010-07-23 20:52:21
