Question: I am testing functions in a package I am developing and would like some general guidelines for how to do this. The functions cover a wide range of statistical modeling, transformations, subsetting, and plotting. Is there a 'standard', or some test that is sufficient?

An example: the test that prompted me to ask this question.

The function dtheta:

dtheta <- function(x) {
  ## find the quantile of the mean
  q.mean <- mean(mean(x) >= x)
  ## find the quantiles of ucl and lcl (q.mean +/- 0.15)
  q.ucl  <- q.mean + 0.15
  q.lcl  <- q.mean - 0.15
  qs <- c(q.lcl, q.mean, q.ucl)
  ## return the lcl, mean, and ucl of the vector,
  ## plus the variance and coefficient of variation
  c(quantile(x, qs), var(x), sqrt(var(x)) / mean(x))
}

Step 1: make test data:

set.seed(100) # per Dirk's recommendation
test <- rnorm(100000, 10, 1)

Step 2: compare the expected output with the actual output from the function:

expected <- quantile(test, c(0.35, 0.50, 0.65))
actual   <- dtheta(test)[1:3]
signif(expected, 2) %in% signif(actual, 2)
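
Matching rounded values with %in% ignores order, so this check would also pass if the quantiles came back shuffled. An order-aware alternative is all.equal() (a sketch; the 1% relative tolerance is an assumed acceptance threshold, not something the function guarantees):

## sketch: compare element by element with a stated tolerance
isTRUE(all.equal(unname(expected), unname(actual), tolerance = 0.01))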

Step 3: maybe run another test:

test2 <- runif(100000, 0, 100)
expected <- c(35, 50, 65)
actual   <- dtheta(test2)
expected %in% signif(actual,2)

Step 4: if all comparisons are TRUE, consider the function 'functional'.
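
In a script, stopifnot() turns the eyeballed TRUEs from Steps 2 and 3 into hard failures (a sketch, reusing test and test2 from above):

## sketch: stop with an error instead of printing TRUE/FALSE
stopifnot(
  all(signif(quantile(test, c(0.35, 0.50, 0.65)), 2) %in% signif(dtheta(test)[1:3], 2)),
  all(c(35, 50, 65) %in% signif(dtheta(test2), 2))
)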

+5  A: 

Nice question.

Besides generalities such as setting a seed, I would recommend that you look at some of the tests in the R sources. The tests/ directory in the source tree has a wealth of these; some of the packages in base R (such as tools) also have a tests/ subdirectory.
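
For your own package the mechanism is the same: R CMD check runs every .R file in the package's tests/ directory and fails the check if any of them throws an error. A minimal sketch (the file name and the package name mypkg are made up for illustration):

## tests/test-dtheta.R
library(mypkg)              # hypothetical package exporting dtheta()
set.seed(100)
x <- rnorm(100000, 10, 1)
out <- dtheta(x)
stopifnot(
  length(out) == 5,         # three quantiles, variance, CV
  all(is.finite(out)),
  abs(out[2] - 10) < 0.05   # middle quantile should sit near the mean of 10
)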

Dirk Eddelbuettel
Excellent resource. Thanks for the pointer. I added set.seed() to my code and will be sure to start using it. Do you use it for analysis or just testing and examples?
David
Without a fixed seed you have no reproducibility, which makes comparisons, debugging, ... a tad harder ;-)
Dirk Eddelbuettel
You probably already know about [RUnit](http://cran.r-project.org/web/packages/RUnit/index.html)
VitoshKa
I'm guessing that he does, given that he uses it in his packages...
Shane
+5  A: 

It depends on what exactly you want to test. Besides Dirk's recommendations and the RUnit package VitoshKa mentioned, I'd like to add a few things:

  • Indeed, set the seed, but make sure you try the function with different seeds as well. Some functions fail only once in every ten runs. Especially when optimization is involved, this becomes crucial. replicate() is a nice function to use in this context (see the sketch after this list).
  • Think carefully about the inputs you want to test. You should test a number of "odd" cases that don't really resemble the "perfect" dataset. I always test at least 10 (simulated) datasets of different sizes.
  • Fool-proof the function: I also throw in some data types that are not the ones the function is meant for. Wrong-type input is bound to happen at some point, and the last thing you want is a function returning a bogus result without a warning. If you use that function later on in some other code, debugging that code can (and will!) be hell. Been there, done that, bought the t-shirt...
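
On the first point, replicate() makes the many-datasets idea concrete (a sketch; the 100 repetitions and the 1% tolerance are arbitrary choices):

## sketch: rerun the same check on 100 fresh datasets
ok <- replicate(100, {
  x <- rnorm(100000, 10, 1)
  isTRUE(all.equal(unname(dtheta(x)[2]), unname(quantile(x, 0.50)),
                   tolerance = 0.01))
})
all(ok)  # any FALSE flags a dataset-dependent failure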

An example of extended testing with different datasets: what would you like to see as output in these cases? Is this the result you'd expect? Not according to the tests you did.

> test3 <- rep(12, 100000) # data with only 1 value
> expected <- c(12, 12, 12)
> actual   <- dtheta(test3)
Error in quantile.default(x, qs) : 'probs' outside [0,1]

> test4 <- rbinom(100000, 30, 0.5) # large dataset with a limited number of distinct values
> expected <- quantile(test4, c(0.35, 0.50, 0.65))
> actual   <- dtheta(test4)
> expected %in% signif(actual, 2)
[1] FALSE  TRUE  TRUE

> test5 <- runif(100, 0, 100) # small dataset
> expected <- c(35, 50, 65)
> actual   <- dtheta(test5)
> expected %in% signif(actual, 2)
[1] FALSE FALSE FALSE
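
The test3 failure suggests one fix for the fool-proofing point: validate the input before doing any work. A wrapper sketch (dtheta_safe is a made-up name; only the guard clauses are new):

## sketch: fail early with a clear message
dtheta_safe <- function(x) {
  stopifnot(is.numeric(x), length(x) > 1, all(is.finite(x)))
  if (sd(x) == 0)
    stop("'x' is constant, so the quantile bounds would fall outside [0,1]")
  dtheta(x)
}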

Edit: corrected the code so the tests make a bit more sense.

Joris Meys
+1 Joris for testing input. I would add: test the *output* as well. The output of your functions must be *predictable* and *precisely defined*. Unfortunately, on this point R's basic functionality sometimes just sucks: you never know exactly what a function returns unless you go to the documentation again, and again, and again...
VitoshKa
+3  A: 

You need to write

  1. tests that show you get the right answer when you input sensible values,

  2. tests that show your function fails correctly when you input nonsense, and

  3. tests for all boundary cases.

There is a huge amount of literature on different strategies for testing software; Wikipedia's software testing page is as good a place as any to start.

Looking at your example:

  • What happens when you input a string, data frame, or list?
  • What about negative x, or imaginary x?
  • How about vector or array x?
  • If only positive x is allowed, what happens at x = 0?
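
Tests of type 2 are easy to automate with tryCatch() (a sketch; fails() is a throwaway helper, not a standard function):

## sketch: assert that bad input errors while good input does not
fails <- function(expr) inherits(tryCatch(expr, error = function(e) e), "error")
fails(dtheta(rep(12, 100)))  # TRUE: constant data pushes probs outside [0,1]
fails(dtheta(rnorm(100)))    # FALSE: sensible input runs fine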

Note that subfunctions (those called only by your own functions and never by the user) need less input checking, because you have more control over what goes into them.

Richie Cotton
+3  A: 

This has already appeared as a comment, but I'll add it as a bona fide answer. R has a few automated testing packages to help with this kind of thing, the main two being RUnit and testthat. I've used RUnit briefly, and recently started using testthat in more depth (I can't really give any good advantages/disadvantages of one over the other, though!).

Automated testing lets you set up these test cases, as well as others suggested above, such as the following (a testthat sketch appears after the list):

  • Boundary Tests
  • Stress Tests (less need to test for accuracy, just throw data at it and see if it falls over)
  • Dealing with different input
  • Dealing with different underlying platforms / locales
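
As a sketch of what this looks like in testthat (test_that(), expect_equal(), and expect_error() are real testthat functions; the 1% tolerance is an assumed acceptance threshold):

library(testthat)

test_that("dtheta returns the requested quantiles", {
  set.seed(100)
  x <- rnorm(100000, 10, 1)
  expect_equal(unname(dtheta(x)[1:3]),
               unname(quantile(x, c(0.35, 0.50, 0.65))),
               tolerance = 0.01)
})

test_that("dtheta fails on constant input", {
  expect_error(dtheta(rep(12, 100)))
})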
PaulHurleyuk