answers:

3
+4  A: 
subset(data,!duplicated(data$ID))

That should do the trick: it keeps the first row found for each ID and drops the rest.
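For anyone wondering how the one-liner works, here is a minimal sketch on a made-up data frame. Only the `ID` column name comes from the answer; the `value` column and the sample rows are assumptions for illustration. `duplicated()` returns TRUE for the second and later occurrences of a value, so negating it selects the first occurrence of each ID.

```r
# Hypothetical example data; only the ID column name comes from the answer above.
data <- data.frame(ID = c(1, 2, 2, 3), value = c("a", "b", "c", "d"))

# duplicated() is TRUE for the second and later occurrences of each ID
dup <- duplicated(data$ID)

# Keep only the first row seen for each ID
first_rows <- subset(data, !dup)
print(first_rows)
```

Which row counts as "first" depends on the current row order, so sort the data beforehand if a particular row should win.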

James
thanks ever so much - I was about to despair...
CatholicEvangelist
This will work if you don't have any heuristic in mind for how to select the other data. Seems like a very strange use case to me...
Shane
Exactly what I just needed James, thank you.
Tal Galili
+2  A: 

If you want to keep one row for each ID, but there is different data on each row, then you need to decide on some logic to discard the additional rows. For instance:

df <- data.frame(ID=c(1, 2, 2, 3), time=1:4, OS="Linux")
df
  ID time    OS
1  1    1 Linux
2  2    2 Linux
3  2    3 Linux
4  3    4 Linux

Now I will keep the maximum time value and the last OS value:

library(plyr)
unique(ddply(df, .(ID), function(x) data.frame(ID=x[,"ID"], time=max(x$time), OS=tail(x$OS,1))))
  ID time    OS
1  1    1 Linux
2  2    3 Linux
4  3    4 Linux
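
To spell out what the ddply call is doing: it splits df by ID, runs the function on each piece, and stacks the results. Inside the function, x[,"ID"] has one entry per row of the group while max() and tail(, 1) each return a single value, so a group with two rows produces two identical rows. That is why unique() is wrapped around the call, and why the row names come out as 1, 2, 4. The same split/apply/combine steps can be sketched in base R, as an illustration rather than a drop-in replacement (the names groups, collapsed and result are mine):

```r
df <- data.frame(ID = c(1, 2, 2, 3), time = 1:4, OS = "Linux")

# Split the rows into one data frame per ID
groups <- split(df, df$ID)

# Collapse each group to a single row: max time, last OS value
collapsed <- lapply(groups, function(x) {
  data.frame(ID = x$ID[1], time = max(x$time), OS = tail(x$OS, 1))
})

# Stack the one-row pieces back together and reset the row names
result <- do.call(rbind, collapsed)
rownames(result) <- NULL
print(result)
```

Because each group is reduced to exactly one row here, no unique() step is needed. To use a different rule for discarding rows, change the expressions inside data.frame().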
Shane
thanks a lot for the detailed answer!!!
CatholicEvangelist
A: 

Hi Shane,

Could you possibly describe what the function is doing? I have a similar problem to CatholicEvangelist, but it is a bit more complex and I think understanding yours would be helpful.

Thanks, Lauren

Lauren