What are some good practices for programming in R?

Since R is a special-purpose language that I don't use all the time, I typically just hack together some quick scripts that do what I need.

But what are some tips for writing clean and efficient R code?

+9  A: 

I recommend Josh Reich's Load, Clean, Func, Do workflow from this previous question.

In addition I recommend following coding guidelines such as Google's R Style Guide. Using a coding style guide makes reading the code later so much easier.

JD Long
I wish the 'dot' naming convention were not endorsed in that Style Guide (e.g., some.variable.name). It has history on its side and most R code is written that way, but I'm still not a fan.
doug
+8  A: 

You already provide some hints by stating that your approach is to 'hack together quick scripts'. If you want best practices and structure, simply follow the established best practices from CRAN:

  • create a package, this opens the door to running R CMD check which is very useful
  • as many people have stated, having a package helps you in the code writing stage too as you are somewhat forced to document the code; that is a Good Thing (TM)
  • once you have a package, add code in the \examples{} section of the documentation, as this will be run during R CMD check and provides an easy entry to regression testing
  • once you get used to regression testing, start to use a package such as RUnit; that really is best practice
  • JD's pointer to the Google Style Guide is a good one too. That isn't the only style guide as e.g. Henrik's R Coding Convention precedes it by a few years; and there is also Hadley's riff on Google's style guide
  • Otherwise, the oldie-but-goldie 'do what your colleagues and coauthors do' also applies
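
As a rough sketch of the regression-testing idea in the list above: `rescale01` is a hypothetical helper invented for illustration, not part of any package. The checks use base R's `stopifnot`, the sort of code that could sit in an \examples{} section so that R CMD check runs it. The RUnit call is guarded, since the package may not be installed.

```r
# Hypothetical helper used only for illustration:
# rescale a numeric vector so its values span [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Checks like these could live in \examples{}; an error here fails R CMD check.
stopifnot(isTRUE(all.equal(rescale01(c(0, 5, 10)), c(0, 0.5, 1))))
stopifnot(identical(length(rescale01(1:4)), 4L))

# The same check written with RUnit, if it happens to be installed.
if (requireNamespace("RUnit", quietly = TRUE)) {
  RUnit::checkEquals(c(0, 0.5, 1), rescale01(c(0, 5, 10)))
}
```

Starting with plain `stopifnot` keeps the examples dependency-free; moving to RUnit later mostly means wrapping the same comparisons in test functions.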
Dirk Eddelbuettel
+5  A: 

I completely agree with the existing answers, especially regarding the use of packages. Packages require a lot of discipline, documentation, and structure, which really help to enforce best practices (along with R CMD check). You can also use the codetools package to help with this. Use the roxygen package for documentation.
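
A minimal sketch of what this can look like, under some assumptions: `c_to_f` and `leaky` are hypothetical functions made up for this example. The first carries roxygen-style comments; the second is passed to `codetools::findGlobals` to list the global variables it quietly depends on. The codetools call is guarded in case the package is unavailable.

```r
#' Convert temperatures from Celsius to Fahrenheit
#'
#' @param celsius Numeric vector of temperatures in degrees Celsius.
#' @return Numeric vector of the same length, in degrees Fahrenheit.
#' @examples
#' c_to_f(c(0, 100))
#' @export
c_to_f <- function(celsius) {
  celsius * 9 / 5 + 32
}

# Hypothetical function that silently depends on a global named `offset`.
leaky <- function(x) {
  x + offset
}

# codetools normally ships with R, but guard the call anyway.
if (requireNamespace("codetools", quietly = TRUE)) {
  globals <- codetools::findGlobals(leaky, merge = FALSE)
  print(globals$variables)  # names of global variables the function relies on
}
```

The roxygen comments are ordinary R comments, so the file still runs as a plain script; the roxygen tooling turns them into Rd documentation when the package is built.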

Beyond that, I recommend that you not only vectorize your code, but more particularly, make every effort to vectorize your functions, meaning that you should be able to provide vector arguments and get vectors returned (even from things like database calls). That will really improve your code efficiency and clarity in the long run.
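
To illustrate the difference, here is a small sketch with two hypothetical functions (the names are invented for this example). The first only works for a single value because `if` expects one condition; the second accepts a vector and returns a vector of the same length.

```r
# Scalar-only version: `if` needs a single TRUE/FALSE condition,
# so this fails (or warns, in older R versions) when x has length > 1.
sign_label_scalar <- function(x) {
  if (x < 0) "negative" else "non-negative"
}

# Vectorized version: ifelse() evaluates the condition element-wise.
sign_label <- function(x) {
  ifelse(x < 0, "negative", "non-negative")
}

labels <- sign_label(c(-2, 0, 3))
print(labels)
```

Callers of the vectorized version never need to wrap it in a loop or `sapply`, which is where much of the clarity gain comes from.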

Lastly, I really like to use something like Sweave to organize my code into clear literate reproducible research whenever writing a report. Along with this I recommend using the cache package.

Shane
Thank you for the answer, Shane. Do you have any examples of using the "codetools" package?
Tal Galili
A: 

For efficiency, prefer vector operations over for loops.
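
A small sketch of the same computation written both ways; the data in `x` is made up for illustration.

```r
x <- c(1.5, 2.0, 3.5, 4.0)

# Loop version: fills a preallocated result one element at a time.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Vectorized version: arithmetic operators already work element-wise.
squares_vec <- x^2

# Both approaches give the same result.
stopifnot(isTRUE(all.equal(squares_loop, squares_vec)))
```

The vectorized form is shorter and pushes the iteration into R's compiled internals, which is usually faster on long vectors.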

A: 

This is good programming practice in general, not just for R: use a version control system such as SVN to manage your code.

stevejb