I have a data frame with two columns (data will not always be identical).

1 1 
2 2 
3 3 
0 0 
-1 -1 
-2 -2 
-3 -3

What I would like to do is add another column that marks the top 10% and the bottom 10% of the values in a column, to be used as labels for a scatter plot.

1 1 
2 2 
3 3 1
0 0 
-1 -1  
-2 -2 
-3 -3 2

In addition, I need to be able to select and label the top/bottom 10% based on either column.

Any ideas?

A: 

Your question is a bit ambiguous. What does "of the scale to be used in jpeg outputs." mean? Are both columns always identical? Perhaps you are looking for something like the following:

> dat <- data.frame(a = c(-(1:3), 0:3))
> low <- quantile(dat$a, 0.1)    # 10th percentile
> high <- quantile(dat$a, 0.9)   # 90th percentile
> dat$flag <- NA
> dat$flag[dat$a <= low] <- 1    # bottom 10%
> dat$flag[dat$a >= high] <- 2   # top 10%
> dat
   a flag
1 -1   NA
2 -2   NA
3 -3    1
4  0   NA
5  1   NA
6  2   NA
7  3    2
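
Since the question asks for the top/bottom 10% of either column, the same steps could be wrapped in a small helper and applied to each column in turn. This is only a sketch: the name `flag_tails`, its `p` argument, and the second column `b` are made up for illustration, and the quantile cutoffs rely on R's default `quantile()` method.

```r
# Hypothetical helper: mark the bottom fraction p of a numeric vector
# with 1 and the top fraction p with 2; everything else stays NA.
flag_tails <- function(x, p = 0.1) {
  cuts <- quantile(x, c(p, 1 - p), na.rm = TRUE)
  flag <- rep(NA_integer_, length(x))
  flag[x <= cuts[1]] <- 1L   # at or below the lower cutoff
  flag[x >= cuts[2]] <- 2L   # at or above the upper cutoff
  flag
}

# Illustrative data: column b is an assumption, not from the question
dat <- data.frame(a = c(-(1:3), 0:3), b = c(1:3, 0, -(1:3)))
dat$flag_a <- flag_tails(dat$a)
dat$flag_b <- flag_tails(dat$b)
```

Changing `p` (for example `flag_tails(dat$a, p = 0.15)`) widens or narrows the tails without touching the rest of the code.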
Ian Fellows
A: 

Thank you for the response, Ian. I realize the question itself wasn't very well formed, but I was having difficulty explaining what I wanted. With your assistance, I've been able to put it together:

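# Rows above the 85th percentile and below the 15th percentile of the column
top <- subset(data, data$column > quantile(data$column, 0.85))
bottom <- subset(data, data$column < quantile(data$column, 0.15))
listing <- rbind(top, bottom)
# Sequential labels, one per selected row
label <- seq_len(nrow(listing))
# Show the selected rows ordered by Distance, largest first
listing[sort.list(listing$Distance, decreasing = TRUE), ]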
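
To get from the selected rows to labels on a scatter plot, one option is base graphics' `text()`. The following is a minimal sketch under assumptions: the data are randomly generated, and the column names `x` and `Distance` are placeholders rather than the real data set.

```r
set.seed(1)
# Made-up example data; x and Distance are assumed column names
data <- data.frame(x = 1:20, Distance = rnorm(20))

# Same selection as above: upper and lower 15% of Distance
top     <- subset(data, Distance > quantile(data$Distance, 0.85))
bottom  <- subset(data, Distance < quantile(data$Distance, 0.15))
listing <- rbind(top, bottom)
listing$label <- seq_len(nrow(listing))

# Plot every point, then write a label above the selected ones only
plot(data$x, data$Distance, xlab = "x", ylab = "Distance")
text(listing$x, listing$Distance, labels = listing$label, pos = 3)
```

Storing the labels as a column of `listing` keeps each label attached to its row if the data frame is later sorted.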
Brandon Bertelsen