Hi there,

I have a dataset of the daily exchange rate of the Australian dollar against the US dollar over a period of about 20 years. The data is in a data frame, with the date in the first column and the exchange rate in the second. Here's a sample from the data:

>data
             V1     V2
1    12/12/1983 0.9175
2    13/12/1983 0.9010
3    14/12/1983 0.9000
4    15/12/1983 0.8978
5    16/12/1983 0.8928
6    19/12/1983 0.8770
7    20/12/1983 0.8795
8    21/12/1983 0.8905
9    22/12/1983 0.9005
10   23/12/1983 0.9005

How would I go about displaying the top n% of these records? For example, how could I show the dates and exchange rates for the days whose exchange rate falls in the top 5% of all exchange rates in the dataset?

+6  A: 

For the top 5%:

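n <- 5  # percentage of records to keep
# Keep the rows whose rate exceeds the (100 - n)th percentile
data[data$V2 > quantile(data$V2, probs = 1 - n/100), ]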
Rob Hyndman
Thanks very much!
Bryce Thomas
Or to save a little typing: `subset(data, V2 > quantile(V2, probs = 1 - n/100))`
hadley
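For anyone who wants to try the quantile filter end to end, here is a minimal sketch on just the ten sample rows from the question. The cutoff of 20% (instead of 5%) and the variable names `cutoff` and `top` are my own choices for illustration, since 5% of ten rows is too few to show anything useful.

```r
# Sample rows copied from the question; V1 is the date, V2 the exchange rate.
data <- data.frame(
  V1 = c("12/12/1983", "13/12/1983", "14/12/1983", "15/12/1983", "16/12/1983",
         "19/12/1983", "20/12/1983", "21/12/1983", "22/12/1983", "23/12/1983"),
  V2 = c(0.9175, 0.9010, 0.9000, 0.8978, 0.8928,
         0.8770, 0.8795, 0.8905, 0.9005, 0.9005),
  stringsAsFactors = FALSE
)

n <- 20  # percentage to keep (20 only because the sample is tiny)

# Threshold: the (100 - n)th percentile of the exchange rate.
cutoff <- quantile(data$V2, probs = 1 - n / 100)

# Keep only the rows strictly above the threshold.
top <- data[data$V2 > cutoff, ]
print(top)
```

On this sample the filter keeps the rows for 12/12/1983 and 13/12/1983. Note that the comparison is strict, so rows exactly equal to the cutoff are dropped.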
A: 

Another way to get the top 5%:

head(data[order(data$V2, decreasing = TRUE), ], ceiling(0.05 * nrow(data)))
gd047
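One thing to keep in mind with the sort-and-head approach: it returns a fixed number of rows, so if several days share the rate at the cutoff, only some of them make it in. Below is a rough sketch on the ten sample rows from the question showing that, plus one possible way to keep all tied rows. The 30% cutoff and the names `k`, `sorted`, `top_k` and `with_ties` are assumptions of mine, chosen so that the tie at 0.9005 lands on the boundary.

```r
# Sample rows copied from the question; V1 is the date, V2 the exchange rate.
data <- data.frame(
  V1 = c("12/12/1983", "13/12/1983", "14/12/1983", "15/12/1983", "16/12/1983",
         "19/12/1983", "20/12/1983", "21/12/1983", "22/12/1983", "23/12/1983"),
  V2 = c(0.9175, 0.9010, 0.9000, 0.8978, 0.8928,
         0.8770, 0.8795, 0.8905, 0.9005, 0.9005),
  stringsAsFactors = FALSE
)

n <- 30                              # percentage to keep (illustrative)
k <- ceiling(n * nrow(data) / 100)   # number of rows that percentage maps to

# Sort by rate, highest first, and take exactly k rows.
sorted <- data[order(data$V2, decreasing = TRUE), ]
top_k  <- head(sorted, k)

# Alternative: keep every row at or above the k-th largest rate,
# so days tied with the cutoff are all included.
with_ties <- data[data$V2 >= sorted$V2[k], ]

print(top_k)
print(with_ties)
```

Here `top_k` has three rows while `with_ties` has four, because both 22/12/1983 and 23/12/1983 have a rate of 0.9005. Which behaviour you want depends on whether "top n%" means a fixed row count or everything at or above a threshold.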