+2  Q: 

Undo command in R

I can't find anything like an undo command in R (neither in An Introduction to R nor in R in a Nutshell). I am particularly interested in undoing/deleting when dealing with interactive graphs.

What approaches do you suggest?

Thanks,

Roberto

+12  A: 

You should consider a different approach which leads to reproducible work:

  • Pick an editor that you like and that has R support
  • Write your code in 'snippets', i.e., short files for functions, and then use the editor's R integration to send the code to the R interpreter
  • If you make a mistake, re-edit your snippet and run it again
  • You will always have a log of what you did

All this works tremendously well in ESS, which is why many experienced R users like this environment. But editors are a subjective and personal choice; other people prefer Eclipse with StatET. There are other solutions for Mac OS X and Windows too, and all this has been discussed countless times before here on SO and in other places such as the R mailing lists.
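A minimal sketch of that workflow for a plot, assuming a throwaway snippet file created with tempfile() and a made-up function name draw_plot (both are illustrative, not part of any editor's tooling). In practice the file would live in your editor and the editor would send it to R for you; source() stands in for that step here.

```r
# Hypothetical snippet file; normally this is a file you edit by hand.
snippet <- tempfile(fileext = ".R")

# First version of the snippet: the points are drawn in the wrong colour.
writeLines(c(
  "draw_plot <- function() {",
  "  plot(cars$speed, cars$dist, col = 'red')",
  "}"
), snippet)
source(snippet)   # send the snippet to the interpreter
draw_plot()

# "Undo" = edit the snippet and run it again; the plot is redrawn from scratch.
writeLines(c(
  "draw_plot <- function() {",
  "  plot(cars$speed, cars$dist, col = 'blue')",
  "}"
), snippet)
source(snippet)
draw_plot()
```

The file is the log: whatever is on screen can always be rebuilt from it, so there is nothing to undo in the session itself.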

Dirk Eddelbuettel
I'd just like to offer my two cents for this. My preferred editor is Eclipse + StatET, but as Dirk says, choice of editor is subjective...
PaulHurleyuk
I like to use Gedit + RGedit (I never managed to install StatET under Fedora, unfortunately). So now we have 4 cents ;)
nico
If you're a Windows user, you can always try Tinn-R. I don't use it myself but have heard great things about it. We now have 6c. :)
Roman Luštrik
8 cents: gvim + vim r plugin 2 http://www.vim.org/scripts/script.php?script_id=2628
Andreas
10 cents: If you're a Mac user, Textmate + R bundle is a pleasure to use! Textmate is proprietary but it is worth the price!
Paolo
+4  A: 

In general I do adopt Dirk's strategy. You should aim for your code to be a completely reproducible record of how you have transformed your raw data into output.

However, if you have complex code it can take a long time to re-run it all. I've had code that takes over 30 minutes to process the data (i.e., import, transform, merge, etc.). In these cases, a single data-destroying line of code would require me to wait 30 minutes to restore my workspace. By data-destroying code I mean things like:

  • x <- merge(x, y)
  • df$x <- df$x^2

e.g., merges, replacing an existing variable with a transformation, removing rows or columns, and so on. In these cases, it's easy to make a mistake, especially when first learning R.

To avoid having to wait this 30 minutes, I adopt several strategies:

  • If I'm about to do something where there's a risk of destroying my active objects, I'll first put the result into a temporary object. I'll then check that it worked with the temporary object and then rerun the step, assigning to the proper object. E.g., first run temp <- merge(x, y); check that it worked with str(temp); head(temp); tail(temp); and, if everything looks good, run x <- merge(x, y)
  • As is common in psychological research, I often have large data frames with hundreds of variables and different subsets of cases. For a given analysis (e.g., a table, a figure, some results text), I'll often extract just the subset of cases and variables that I need into a separate object for the analysis and work with that object when preparing and finalising my analysis code. That way, I'm less likely to accidentally damage my main data frame. This assumes that the results of the analysis do not need to be fed back into the main data frame.
  • If I have finished performing a large number of complex data transformations, I may save a copy of the core workspace objects. E.g., save(x, y, z, file = 'backup.Rdata'). That way, if I make a mistake, I only have to reload these objects with load('backup.Rdata').
  • df$x <- NULL is a handy way of removing a variable in a data frame that you did not want to create
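The strategies above can be sketched roughly as follows. The data frames, column names, and backup path are made up for illustration, and all.x = TRUE is my own choice so that the row-count check makes sense; adapt the checks to whatever "it worked" means for your data.

```r
# Hypothetical example data standing in for real data frames.
x <- data.frame(id = 1:3, score = c(10, 20, 30))
y <- data.frame(id = c(1L, 2L, 4L), group = c("a", "b", "c"))

# 1. Try the risky step on a temporary object and inspect it first.
temp <- merge(x, y, all.x = TRUE)
str(temp)
head(temp)
stopifnot(nrow(temp) == nrow(x))   # no rows of x were lost or duplicated

# Only now overwrite the real object.
x <- merge(x, y, all.x = TRUE)

# 2. Work on a small extract for a given analysis, not on the main data frame.
analysis_df <- x[x$score > 10, c("id", "score")]

# 3. Save a checkpoint of the core objects after the expensive steps.
backup <- file.path(tempdir(), "backup.RData")   # assumed location
save(x, y, file = backup)

# 4. Remove a column you did not mean to create.
x$score_sq <- x$score^2
x$score_sq <- NULL

# A destructive mistake: score is overwritten in place...
x$score <- x$score^2

# ...and the closest thing to "undo": reload the checkpoint.
load(backup)   # restores x and y under their original names
```

Note that load() silently overwrites any objects of the same name in the workspace, which is exactly what you want here but worth keeping in mind elsewhere.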

However, in the end I still run all the code from scratch to check that the result is reproducible.

Jeromy Anglim