I started using R a little while ago and am not sure how often to update the installed packages (at this time, I'm using mostly ggplot2 and rattle). On one hand, it's the typical geek impulse to have the latest version :-) On the other, updates can break functionality and, as an R beginner, I don't want to waste time looking into package incompatibilities and reinstalling libraries. It's almost certain I wouldn't notice any difference with an improved package anyway.

With other applications, I have a sense, developed from experience, of how often to upgrade, how long to wait between the release of an upgrade and installing it, and so on. But I'm in the dark with regard to R.

And to be clear: I'm not talking about R itself, but its libraries.

Thanks.

+5  A: 

Yes it is.

Why exactly would you want to hang on to old bugs and lacking features?

Dirk Eddelbuettel
Well, one reason is that sometimes a new release introduces new bugs and breaks features. I'm less concerned about esoteric features (which I'm not going to use) and more about package dependencies. If library X depends on Y v3.1 and I upgrade Y to v3.2, is it possible for X to stop working altogether? Much like the "DLL hell" in Windows?
wishihadabettername
CRAN has rigorous testing. You can trust it. This is one of the reasons it has been successful.
Dirk Eddelbuettel
Great then, I wasn't really aware of their testing. Thanks.
wishihadabettername
@Dirk: While I agree with your answer, I would also add that CRAN's testing mostly addresses the formatting/structure of the package (which should catch OS issues, etc.); it doesn't (it couldn't) enforce anything about the content. Actually testing that falls to the package author, and there is an immense amount of variability there.
Shane
I continue to disagree rather strongly. In 10+ years of CRAN use I don't think I have been bitten once by something that was truly harming other components. The overall bug rate is very low. In fact, I think so highly of it that I helped (twice) to build a robot to turn all packages into .deb packages (now at [cran2deb](http://debian.cran.r-project.org)). I also immediately update the `r-cran-*` Debian (source) packages I look after. But maybe that is because upgrades are more robust on my OS of choice than on the other two leading consumer brands. Shrug.
Dirk Eddelbuettel
Actually I have coworkers who very intentionally don't update regularly, if they're in the middle of a project. Changing the version of R or packages could cause conflicting results. Mind you this isn't the case when there's an overt bug (e.g. gcrma a couple of years ago), but there's a strong need to keep the environment stable during a project.
geoffjentry
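
One way to keep an eye on a project's environment, as described in the comment above, is to record the installed package versions at the start and compare them later. This is only a minimal sketch using base R; the helper names `snapshot_versions` and `changed_packages` are made up for illustration and are not part of any package.

```r
# Record the name and version of every installed package.
# `lib.loc = NULL` means "all libraries on the current library path".
snapshot_versions <- function(lib.loc = NULL) {
  ip <- installed.packages(lib.loc = lib.loc)
  data.frame(Package = unname(ip[, "Package"]),
             Version = unname(ip[, "Version"]),
             stringsAsFactors = FALSE)
}

# Compare two snapshots and return the packages whose version differs.
# Packages present in only one snapshot are ignored in this sketch.
changed_packages <- function(before, after) {
  merged <- merge(before, after, by = "Package",
                  suffixes = c(".before", ".after"))
  merged[merged$Version.before != merged$Version.after, , drop = FALSE]
}
```

A possible workflow: save `snapshot_versions()` with `saveRDS()` when the project starts, then call `changed_packages(old_snapshot, snapshot_versions())` whenever results look different.
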
The question was NOT about R so let's stay on topic here. But in that context I note the desperation of R Core who offer alpha, beta and rc releases --- which seemingly nobody cares to touch. Yet then everybody swarms in and complains about bugs once the .0 version is released. I have the feeling people want to eat their cake and have it too: free open source software that is bug free yet insist on not participating in finding bugs. See a problem here?
Dirk Eddelbuettel
What I'm referring to in my comment are instances where the format of an object changes: a column name has changed, say, or a field has been removed. I have experienced this, and it has broken dependencies for me; it was caught by my own unit tests and I could resolve it fairly quickly, but it's not something that I think you should do in a production environment without testing your own dependencies first. This doesn't relate to OS issues, but to package-specific content. So, I agree that upgrading often is the right thing to do, but would just add "with your own testing".
Shane
I am somewhat with you on "production" environments. However, that was NOT a conditioning variable in the original question which was on the general context of R usage. And there upgrades are good for you in general. Particular cases may require particular treatment.
Dirk Eddelbuettel
Dirk: I said "R or packages". See my other response. I work with people who need to keep their R environment (and *that includes packages*) stable for the duration of a project.
geoffjentry
I'll also note it took me a long time to get used to that practice, coming from RG's group where if you weren't working with the latest checkout of everything something was wrong with you :)
geoffjentry
Thanks for the clarification, and yes, I am in the same camp as the Hutch folks :)
Dirk Eddelbuettel
+5  A: 

Here is my philosophy: the naïve user never updates. The sophisticated user always updates. The power user updates often, but carefully.

Mindless updating is not always beneficial. Bugs work their way into updated versions of R libraries (or R itself!), and you could break your existing code by updating without reading the change log or commit history. For example, R 2.11 broke lme4 on OS X... it pays to update carefully and run demos of packages between releases. It really sucks to update to a new library or R release and realize something broke when you have a deadline.
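
A rough sketch of what "checking before updating" could look like with base R's `utils` functions. The helper name `outdated_summary` and the mirror URL are assumptions made for illustration; `old.packages()` and `news()` are the standard functions, but not every package ships a NEWS file, so the change log may come back empty.

```r
# Summarise which installed packages have a newer version available.
# `outdated` is the matrix returned by old.packages(), or NULL when
# everything is current. Nothing is installed by this function.
outdated_summary <- function(outdated) {
  if (is.null(outdated) || nrow(outdated) == 0) {
    return(data.frame(Package = character(0), Installed = character(0),
                      ReposVer = character(0), stringsAsFactors = FALSE))
  }
  out <- as.data.frame(
    outdated[, c("Package", "Installed", "ReposVer"), drop = FALSE],
    stringsAsFactors = FALSE
  )
  rownames(out) <- NULL
  out
}

if (interactive()) {
  # Needs network access; the mirror URL is an assumption.
  outdated <- old.packages(repos = "https://cloud.r-project.org")
  print(outdated_summary(outdated))

  # Read a package's change log before deciding to update it
  # (assumes ggplot2 is installed).
  print(news(package = "ggplot2"))
}
```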

Vince
That's why you have rollbacks provided by the `Archive/` directories on CRAN.
Dirk Eddelbuettel
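
For reference, a rollback from the `Archive/` directories might look roughly like this. It is a sketch under assumptions: the helper `cran_archive_url` is hypothetical, the package and version are placeholders, and installing from a source tarball needs a working build toolchain for packages with compiled code.

```r
# Build the URL of an archived source tarball on a CRAN mirror.
# Archived versions live under src/contrib/Archive/<pkg>/.
cran_archive_url <- function(pkg, version,
                             cran = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", cran, pkg, pkg, version)
}

if (interactive()) {
  # Placeholder package and version; pick the one you want to go back to.
  url <- cran_archive_url("ggplot2", "0.8.8")
  # repos = NULL tells install.packages() to treat `url` as a file to
  # install rather than a package name to look up.
  install.packages(url, repos = NULL, type = "source")
}
```
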
Yes, but checking beforehand is a lot less effort than rolling back.
Vince
Let me know where you park the time machine that tells ex-ante what would have broken had you installed the package(s).
Dirk Eddelbuettel
It's called Google. Unless everyone literally updates to the newest release of R (or a package) at the exact same time, chances are someone has had an issue and sent it to the R mailing list. Being aware of issues helps. I really don't see how Googling releases/issues or looking at change logs before updating is a bad practice.
Vince
Also, upgrade only one or few packages at a time. It makes it more difficult to find where the bug is coming from if you just updated 50 packages and something is not working (not impossible, just more difficult).
nico
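
Updating a few packages at a time could be sketched as follows. The helper `batch_packages` is hypothetical and the package names are placeholders; `update.packages()` with its `oldPkgs` argument is the base R call that restricts an update to the named packages.

```r
# Split a vector of package names into small batches.
batch_packages <- function(pkgs, size = 1) {
  stopifnot(size >= 1)
  if (length(pkgs) == 0) return(list())
  unname(split(pkgs, ceiling(seq_along(pkgs) / size)))
}

if (interactive()) {
  # Placeholder package names for illustration.
  for (batch in batch_packages(c("ggplot2", "rattle"), size = 1)) {
    # Only the packages in `batch` are considered for updating.
    update.packages(oldPkgs = batch, ask = FALSE)
    # Re-run your own scripts or tests here before moving on, so a
    # breakage can be traced to the batch that was just updated.
  }
}
```
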
You guys astonish me. Whatever happened to "release early, release often"? As a CRAN author I am stunned by this. Oh well, chacun à son goût.
Dirk Eddelbuettel
@Dirk Eddelbuettel: Dirk, sometimes you just can't afford that. A while ago I updated some packages, and everything seemed to work well. Then my boss asked me to analyse some new data, so I opened an R script I use for this task and it wasn't working anymore. The problem was with one of the packages I had updated (`tcltk`). I managed to analyse the data without that script, but still, it was annoying. I'm more of an "update often, if you're not in the middle of analysing data for a paper" kind of person.
nico
+1  A: 

Yes, unless you have a good reason not to (see my comment to Dirk)

geoffjentry
What you missed was what the original question was about.
Dirk Eddelbuettel
@Dirk, in @geoffjentry's defense he did state "Changing the version of R **or packages** could cause conflicting results..." That may have been an ex post edit, but it does not seem totally off base.
JD Long
Dirk - if you're not updating your R, you are unlikely to be updating versions of packages, particularly with things like BioC where BioC releases are directly tied to R releases. Also, not updating their packages is the primary reason they're not updating.
geoffjentry