Could you provide some insight into the techniques you use to ensure the quality of your solutions? For example, I sometimes test my results with stopifnot() to make sure they are not ridiculous. Are there other techniques or functions you use in data processing to ensure that you are getting the solution you intended?

Note: I realize that this is a broad question and perhaps a candidate for community wiki or even closure, but rather than voting to close, perhaps assist me by adding comments to direct the conversation.

+3  A: 

Just a few things that come to mind (in no particular order):

  • This page has a very interesting link on debugging in R (OK, this is during production, but I think it is still related to your issue)

  • You can use exceptions, as explained in this discussion (and the links therein)

  • You can write tests with known results (both for success and failure) and see that they actually do what they are supposed to do. Be sure to pass some weird data to the functions and see how they behave in a "not-so-normal" situation.

  • Don't just rely on automated tests: give your functions to a fairly computer illiterate person at work (not enough that he/she can't use R though!) and let him/her do some beta tests. You'll be amazed at the quantity of errors he/she will come up with!!! :)
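As a rough illustration of the points about exceptions and known-result tests, here is a minimal base R sketch. The function safe_mean() and the helper fails() are hypothetical names made up for this example; the idea is only to check a success case with a known answer and a few "weird" inputs that should fail.

```r
# Hypothetical function under test: mean of the finite values in x.
safe_mean <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  finite <- x[is.finite(x)]
  if (length(finite) == 0) {
    stop("x has no finite values")
  }
  mean(finite)
}

# Known-result checks for the success path.
stopifnot(
  isTRUE(all.equal(safe_mean(c(1, 2, 3)), 2)),
  isTRUE(all.equal(safe_mean(c(1, NA, 3, Inf)), 2))
)

# Hypothetical helper: TRUE if evaluating expr signals an error.
fails <- function(expr) {
  tryCatch({
    expr
    FALSE
  }, error = function(e) TRUE)
}

# Known-result checks for the failure path, using "not-so-normal" inputs.
stopifnot(
  fails(safe_mean("a")),
  fails(safe_mean(numeric(0))),
  fails(safe_mean(c(NA_real_, NaN)))
)
```

If any check is violated, stopifnot() stops with an error naming the failing expression, so the script can be run as a quick sanity check after every change.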

nico
+2  A: 

Quality in software engineering is quite a massive area, and most of it applies to code written in R as much as code written in Cobol or C#, so my first answer would be 'it depends'.

I come from the pharmaceutical industry, where what we do is regulated by government agencies like the FDA and the MHRA. For us, quality is something we think about throughout the process, so I would list the following as visible artifacts of quality:

  • We have a software development process that's written down and repeatable (traditionally in this kind of industry this is waterfall style, but more and more agile / prototyping style methodologies are being used)
  • We have a system that ensures every person involved knows what they should be doing (job descriptions) and is suitably qualified to do that job (training)
  • We start by defining what is required in some way, hopefully in some way that can be tested
  • We have some way of documenting our development process, where we've been and how (a combination of good documentation and Source Control)
  • We do testing wherever possible, and as early as possible (so, automated if possible)
  • We have people who are responsible for overseeing quality, who are separate from the people doing the work, to prevent conflicts of interest
  • We control the software environment that is used for development, testing and production (read: change control)
  • We control and manage software once it is in use, tracking issues and managing them (Issue Tracking)
  • We keep records, so that even if every person involved went under a bus / won the lottery the new people could still defend and prove everything above to a government inspector.

However, that's a big list, and I imagine there are lots of industries that don't do all of them (finance, education) and probably some that do more (building nuclear reactors, saving lives, NASA).

More specifically to what I assume you're getting at: before you code, you should be able to define some specific starting inputs and the answers you should get out, and I recommend you use something like RUnit or testthat to build these in.
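A minimal sketch of what that could look like with testthat, assuming the package is installed. The function bmi() and its inputs are hypothetical, chosen only because the expected answers are easy to work out by hand before writing any code.

```r
library(testthat)

# Hypothetical function under test: body mass index from kg and metres.
bmi <- function(weight_kg, height_m) {
  stopifnot(is.numeric(weight_kg), is.numeric(height_m), all(height_m > 0))
  weight_kg / height_m^2
}

# Inputs and answers defined up front, then encoded as expectations.
test_that("bmi returns the answers worked out by hand", {
  expect_equal(bmi(80, 2), 20)
  expect_equal(bmi(c(50, 100), c(1, 2)), c(50, 25))
})

# Inputs that should be rejected rather than silently computed.
test_that("bmi rejects inputs that make no sense", {
  expect_error(bmi("80", 2))
  expect_error(bmi(80, 0))
})
```

RUnit supports the same pattern with its own check functions; which package to use is mostly a matter of taste.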

PaulHurleyuk