Hello, I am trying to convert a conditional string output by SAS into the right format for a conditional statement in R. Here is an example of the conditional output by SAS:

. < var1_a<=80 and var2_a>50.8

I've written a function that handles some of the transformation necessary:

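conditonalsub <- function(x) {
  subnew <- gsub("<=", " <= ", x)
  subnew <- gsub(">=", " >= ", subnew)
  # Pad a bare ">" only; the lookahead leaves ">=" intact
  subnew <- gsub(">(?!=)", " > ", subnew, perl = TRUE)
  # Match "and" as a whole word so variable names containing it are untouched
  subnew <- gsub("\\band\\b", "&", subnew, perl = TRUE)
  subnew <- gsub("\\.\\s", "NA ", subnew)
  return(subnew)
}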

which produces the following string:

NA < var1_a <= 80 & var2_a > 50.8

I am using these conditional statements to subset the observations of a data frame. In this example, I want R to select all observations whose var1_a value is either missing or less than or equal to 80 and whose var2_a value is greater than 50.8. How can I modify the function above so that it produces a conditional statement that handles missing values, as in the var1_a portion of this example? My guess is that the new conditional statement would look something like this:

(var1_a == NA | var1_a <= 80) & (var2_a > 50.8) 
A: 

This is not a true answer, but I think this problem is more complicated than what you present.

  1. Missing values behave somewhat strangely in SAS. In comparisons, they are treated as negative infinity: a missing value is smaller than any non-missing number, but not smaller than another missing value. So the . < var1_a<=80 statement is written this way to exclude missing values, not to include them. This also means that the real problem is with innocuous-looking statements such as a<10, which evaluates to TRUE in SAS if a is missing but to NA in R.

  2. On the other hand, the 2 < a < 4 syntax for getting the values between 2 and 4 is allowed in SAS, but not in R, so you will have to find a way to detect this and all its variations.

  3. Depending on how general you want to be, you will also have to recode the alternative ways SAS can write comparisons (EQ, NE, GE, etc.).
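
To make point 1 concrete, here is a minimal sketch of how the two SAS behaviours could be reproduced in R. The vector `a` and the names `sas_lt` and `sas_between` are made up for illustration. Note that `a == NA` never works as a test for missingness in R, since it always returns NA; `is.na()` is the function to use.

```r
# Toy data; the variable name `a` is hypothetical.
a <- c(NA, 5, 50, 100)

# Plain R comparison: the missing value yields NA, not TRUE.
plain <- a < 10

# SAS-like `a < 10`: a missing value counts as smaller than any number,
# so missing entries must be let through explicitly.
sas_lt <- is.na(a) | a < 10

# SAS `. < a <= 80`: the guard excludes missing values.
sas_between <- !is.na(a) & a <= 80

print(plain)
print(sas_lt)
print(sas_between)
```

Whether you want the SAS behaviour or the R behaviour for an unguarded comparison is a decision to make per condition; the sketch only shows how each can be written.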

So unless your set of SAS logical statements has very restricted syntax, you will have lots of trouble.
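
For a restricted syntax, something along these lines might be enough. It is only a sketch: `translate_sas` is a hypothetical helper, it assumes operands are plain names or numbers, it handles only the `<` and `<=` forms of chained comparisons, and it does not deal with the `.` missing literal, a single `=` for equality, or variables whose names happen to be `le`, `or`, and so on.

```r
# Hypothetical helper: translate a restricted SAS condition into R syntax.
translate_sas <- function(x) {
  # SAS mnemonic operators and their R equivalents
  mnemonics <- c(EQ = "==", NE = "!=", GT = ">", LT = "<", GE = ">=", LE = "<=")
  for (m in names(mnemonics)) {
    x <- gsub(paste0("\\b", m, "\\b"), mnemonics[[m]], x,
              ignore.case = TRUE, perl = TRUE)
  }
  # Logical connectives, matched as whole words only
  x <- gsub("\\band\\b", "&", x, ignore.case = TRUE, perl = TRUE)
  x <- gsub("\\bor\\b", "|", x, ignore.case = TRUE, perl = TRUE)
  # Chained comparison `lo < v < hi` becomes `(lo < v & v < hi)`
  chained <- "([\\w.]+)\\s*(<=|<)\\s*([\\w.]+)\\s*(<=|<)\\s*([\\w.]+)"
  gsub(chained, "(\\1 \\2 \\3 & \\3 \\4 \\5)", x, perl = TRUE)
}

# Usage on a made-up data frame
df <- data.frame(a = c(1, 3, 5))
cond <- translate_sas("2 < a < 4")
rows <- eval(parse(text = cond), df)
print(cond)
print(df[rows, , drop = FALSE])
```

Every new variation in the SAS output needs another rule, which is why this approach only pays off when the set of conditions is small and regular.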

Aniko
Thank you Aniko. Since the conditional statement for var1_a in the example above avoids selecting missing values, I will just remove that portion (. < ) from the conditional statement. All of the other conditionals from my SAS program have the syntax x < a or x > a.
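
A minimal sketch of that plan, assuming a data frame with the two columns from the example (the data frame `dat` and its values are made up). The condition evaluates to NA for rows where var1_a is missing, so `which()` is used to drop those rows instead of returning all-NA rows.

```r
# Hypothetical data frame with the two columns from the example.
dat <- data.frame(var1_a = c(NA, 70, 90, 60),
                  var2_a = c(60, 55, 70, 40))

sas_cond <- ". < var1_a<=80 and var2_a>50.8"

# Drop the SAS missing-value guard ". <" and translate "and".
r_cond <- gsub("(^|\\s)\\.\\s*<\\s*", "\\1", sas_cond, perl = TRUE)
r_cond <- gsub("\\band\\b", "&", r_cond, perl = TRUE)

# Evaluate the condition with the data frame's columns in scope.
keep <- eval(parse(text = r_cond), dat)

# which() keeps only rows where the condition is TRUE, skipping NA.
selected <- dat[which(keep), ]
print(r_cond)
print(selected)
```

This matches SAS for the guarded variable. For an unguarded condition of the form x < a, SAS would still select rows where x is missing, while this code drops them.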
sheed03