ansaurus

Question

R: Select subset of dataframe by non-unique ids

Answer 1

+6 A:

Use

df[df$id %in% v,]

Shane 2010-04-02 19:59:51
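
As a minimal sketch of the one-liner above: the question does not show its data, so the data frame `df`, its `value` column, and the vector `v` below are made up for illustration. The only assumption carried over from the thread is that `id` contains repeated values.

```r
# Hypothetical data: ids repeat, as in the question title
df <- data.frame(id = c(1, 2, 2, 3, 3, 3, 4),
                 value = c("a", "b", "c", "d", "e", "f", "g"))
v <- c(2, 3)

# df$id %in% v yields one TRUE/FALSE per row of df,
# so every row whose id appears in v is kept, duplicates included
subset_df <- df[df$id %in% v, ]

# which() converts that logical vector into row positions;
# indexing by position selects the same rows
ndx <- which(df$id %in% v)
subset_by_position <- df[ndx, ]

print(subset_df)
```

Because `%in%` never returns `NA`, the logical form and the `which()` form pick out the same rows here; the choice between them is mostly a matter of taste.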

Answer 2

+4 A:

This should do what you want:

ndx = which(df$id %in% v)
df[ndx,]

doug 2010-04-02 20:00:55

Beat you by 30 seconds. :)

Shane 2010-04-02 20:02:56

Clearly what's needed on SO is a handicap clock for the experts, say 45 seconds or so for your answers to sit on the server before posting, though most of the time even that won't help me. :)

doug 2010-04-02 20:17:28

awesome. +1 for both

amarillion 2010-04-02 20:23:42

I'm just waiting for marek to come by and tell us that we're forgetting about NA values...

Shane 2010-04-02 20:44:59

In practice, to deal with NA values in v and in the id column, I used this: `df[df$id %in% v[!is.na(v)],]`.

amarillion 2010-04-02 20:54:24

Here I come... Actually `%in%` copes well with `NA`. It matches `NA` like any other value (no matter which of the two vectors contains `NA`s). In other words, `NA %in% NA` returns `TRUE`.

Marek 2010-04-06 07:56:42
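
The `NA` behaviour described in the comments can be checked with a small sketch. The data frame and `v` below are again hypothetical; only the `%in%` semantics and the `v[!is.na(v)]` filter come from the thread.

```r
# %in% treats NA as an ordinary value to match
na_in_na <- NA %in% NA          # TRUE
na_in_numbers <- NA %in% c(1, 2)  # FALSE

# Hypothetical data with a missing id
df <- data.frame(id = c(1, NA, 2, 2),
                 value = c("a", "b", "c", "d"))
v <- c(2, NA)

# Plain %in%: the row with a missing id is kept, because NA matches NA
with_na <- df[df$id %in% v, ]

# Dropping NA from v first (the variant from the comments) excludes that row
without_na <- df[df$id %in% v[!is.na(v)], ]

print(with_na)
print(without_na)
```

Whether rows with a missing `id` should be kept depends on the data; filtering `v` is only needed when `NA` in `v` should not act as a key.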

@Marek: ha! I was starting to get worried.

Shane 2010-04-18 02:54:10
