I have a data frame that contains (among other things) a numeric column with a concentration and a factor column with a status flag. The status flag contains NAs.

Here's an example:

df<-structure(list(conc = c(101.769, 1.734, 62.944, 92.697, 25.091, 27.377, 24.343, 55.084, 0.335, 23.280), status = structure(c(NA, NA, NA, NA, NA, NA, 2L, NA, 1L, NA), .Label = c("<LLOQ", "NR"), class = "factor")), .Names = c("conc", "status"), row.names = c(NA, -10L), class = "data.frame")

I want to replace the concentration column with a string for some values of the flag column, and otherwise with the concentration value formatted to a certain number of significant digits.

When I try this

ifelse(df$status=="NR","NR",df$conc)

The NAs in the status flag don't trigger either the true or the false branch; they return NA, as the documentation says they will. I could loop over the rows and use if/else on each one, but that seems inefficient.

Am I missing something? I've tried as.character(df$status) as well, which doesn't work. My mojo must be getting low...

+2  A: 

How about testing for missingness:

ifelse(is.na(df$status), df$conc, as.character(df$status))
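
The question also asks for the concentration to be formatted to a number of significant digits. A minimal sketch of how that might be combined with the missingness test, assuming 3 significant digits and formatC() with format = "g" (both are illustrative choices, not something stated in the question); the data frame is rebuilt here so the snippet runs on its own:

```r
# Same values as the example data in the question, rebuilt for a standalone run.
df <- data.frame(
  conc = c(101.769, 1.734, 62.944, 92.697, 25.091,
           27.377, 24.343, 55.084, 0.335, 23.280),
  status = factor(c(rep(NA, 6), "NR", NA, "<LLOQ", NA),
                  levels = c("<LLOQ", "NR"))
)

# Assumption: 3 significant digits; formatC() formats each element separately.
conc_txt <- formatC(df$conc, digits = 3, format = "g")

# Where the flag is missing keep the formatted number, otherwise show the flag.
result <- ifelse(is.na(df$status), conc_txt, as.character(df$status))
result
```

Note that format = "g" drops trailing zeros; adding flag = "#" to the formatC() call should keep them if that matters.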
Aniko
+3  A: 

You must explicitly test for NA, so you can use:

ifelse(df$status=="NR" | is.na(df$status),"NR",df$conc) # gives you NR for NA

or

ifelse(df$status=="NR" & !is.na(df$status),"NR",df$conc) # gives you df$conc for NA
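
To see why the explicit test is needed, here is a small sketch on a three-element toy vector (the names status, conc, plain, na_as_nr and na_as_conc are made up for illustration). Comparing NA with == gives NA, and ifelse() returns NA wherever its test is NA:

```r
# Toy data: one missing flag, one "NR", one "<LLOQ".
status <- factor(c(NA, "NR", "<LLOQ"), levels = c("<LLOQ", "NR"))
conc <- c(101.769, 24.343, 0.335)

# The comparison itself is NA for the missing flag, so ifelse() gives NA there.
status == "NR"
plain <- ifelse(status == "NR", "NR", conc)

# NA | TRUE is TRUE, so the missing flag is treated like "NR".
na_as_nr <- ifelse(status == "NR" | is.na(status), "NR", conc)

# NA & FALSE is FALSE, so the missing flag falls through to conc.
na_as_conc <- ifelse(status == "NR" & !is.na(status), "NR", conc)
```

In both variants the numeric values come back as character strings, because the result mixes them with "NR".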
Marek
+3  A: 

Use %in% instead of ==:

ifelse(df$status %in% "NR","NR", df$conc)

Side-by-side comparison of the two methods:

data.frame(df, ph = ifelse(df$status=="NR","NR",df$conc), mp = ifelse(df$status %in% "NR","NR",df$conc))

Check out ?match for more information - I'm not sure I could explain it well.
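
For what it's worth, a rough sketch of what seems to be going on, using a toy vector (status, pos and hit are illustrative names): match() returns the position of each element in the lookup table, or NA when there is no match, and %in% turns that into TRUE or FALSE, so an NA flag simply counts as "not found":

```r
# Toy data: one missing flag, one "NR", one "<LLOQ".
status <- factor(c(NA, "NR", "<LLOQ"), levels = c("<LLOQ", "NR"))

# Position of each element in the table "NR"; NA means no match was found.
pos <- match(status, "NR")

# %in% reports only whether a match was found, so it never returns NA.
hit <- status %in% "NR"

pos
hit
```

Since the test is never NA, ifelse() always takes one of the two branches.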

Matt Parker
Perfect - that works really well. Thanks, Matt.
PaulHurleyuk