ansaurus

Question

Subsetting a data frame based on contents of another data frame

Answer 1

+3 A:

Both %in% and match() can be used for this. Here is the former:

> which( df1$x %in% df2$y )
 [1]   1   2   3   4  27  28  29  30  53  54  55  56  79  80  81  82 105
[18] 106 107 108 131 132 133 134 157 158 159 160 183 184 185 186 209 210
[35] 211 212 235 236 237 238 261 262 263 264 287 288 289 290 313 314 315
[52] 316 339 340 341 342 365 366 367 368 391 392 393 394
> table(df1[ which( df1$x %in% df2$y ), "x"])

 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y 
16 16 16 16  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 
 z 
 0 
>
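
The answer names match() but only shows %in%. A minimal sketch of the match() route follows. The data frames here are hypothetical stand-ins for the question's df1 and df2 (the originals are not shown), built so that the indices agree with the output above.

```r
# Hypothetical data standing in for the question's df1 and df2 (assumption)
df1 <- data.frame(x = rep(letters, times = 16), stringsAsFactors = TRUE)
df2 <- data.frame(y = letters[1:4], stringsAsFactors = TRUE)

# match() gives, for each element of df1$x, the position of its first
# match in df2$y, or NA when there is no match
pos <- match(df1$x, df2$y)

# Rows of df1 whose x value occurs somewhere in df2$y
idx_match <- which(!is.na(pos))
subset_match <- df1[idx_match, , drop = FALSE]

# The same row indices obtained with %in%
idx_in <- which(df1$x %in% df2$y)
identical(idx_match, idx_in)
```

match() is the better fit when you also need to know which row of df2 each value matched; for a plain membership test, %in% reads more directly.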

Dirk Eddelbuettel 2010-08-10 18:36:26

`df1[ which( df1$x %in% df2$y ), "x"]` <-- thanks!

Brandon Bertelsen 2010-08-10 19:21:09

You can drop the `which()`, since you can index directly with a logical vector, so `df1[ df1$x %in% df2$y , "x"]` is shorter. I like `which()` because I sometimes want just the indices, to check that the interim results are correct.

Dirk Eddelbuettel 2010-08-10 19:49:46
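
To illustrate the comment above, here is a small sketch comparing the two indexing styles. It assumes hypothetical data in place of the question's df1 and df2, and the names `keep`, `by_logical` and `by_which` are made up for the example.

```r
# Hypothetical data standing in for the question's df1 and df2 (assumption)
df1 <- data.frame(x = rep(letters, times = 16), stringsAsFactors = TRUE)
df2 <- data.frame(y = letters[1:4], stringsAsFactors = TRUE)

# %in% returns a logical vector with one element per row of df1
keep <- df1$x %in% df2$y

# Index with the logical vector directly ...
by_logical <- df1[keep, "x"]

# ... or convert it to integer positions first
by_which <- df1[which(keep), "x"]

# Both give the same result
identical(by_logical, by_which)

# which() is handy for inspecting the matching row numbers
head(which(keep))
```

Because %in% never returns NA, the two forms select the same rows here; the only difference is whether you get to look at the indices along the way.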
