ansaurus

Question

Forcing garbage collection to run in R with the gc() command

Answer 1

+2 A:

"Probably." I do it too, and often even in a loop as in

cleanMem <- function(n=10) { for (i in 1:n) gc() }

Yet that does not, in my experience, restore memory to a pristine state.

So what I usually do is keep the tasks at hand in script files and run them with the 'r' frontend (on Unix, from the 'littler' package), so that each task starts in a fresh R process. Rscript is an alternative on that other OS.

That workflow happens to agree with

which we covered here before.

Dirk Eddelbuettel 2009-09-23 17:00:01

That's very useful. Thank you.

JD Long 2009-09-23 17:42:12

Answer 2

+1 A:

"Maybe." I don't really have a definitive answer. But the help file suggests that there are really only two reasons to call gc():

1. You want a report of memory usage.
2. After removing a large object, "it may prompt R to return memory to the operating system."

Since it can slow down a large simulation with repeated calls, I have tended to only do it after removing something large. In other words, I don't think that it makes sense to systematically call it all the time unless you have good reason to.
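As a rough illustration of the "only after removing something large" pattern, here is a minimal sketch. The object name `big` and its size are made up for the example; the only real API used is `rm()` and `gc()`, whose return value is a matrix with rows "Ncells" and "Vcells".

```r
# Hypothetical large object: a 2000 x 2000 matrix of doubles (about 30 Mb).
big <- matrix(0, nrow = 2000, ncol = 2000)

before <- gc()   # collect once and keep the usage report
rm(big)          # drop the only reference to the large object
after <- gc()    # collect again now that the memory is unreachable

# The second column of the report is the memory currently in use, in Mb.
vcells_mb_before <- before["Vcells", 2]
vcells_mb_after  <- after["Vcells", 2]
cat("Vcells in use before:", vcells_mb_before, "Mb; after:", vcells_mb_after, "Mb\n")
```

Whether the freed memory actually goes back to the operating system depends on the platform and allocator, so the report only shows what R itself considers in use.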

Shane 2009-09-23 17:13:42

That's pretty much how I have been using it. I call it when I remove a great big data frame or something like that. Thanks for the input.

JD Long 2009-09-23 17:41:21

Answer 3

+2 A:

No. If there is not enough memory available for an operation, R will run gc() automatically.
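One way to watch this happen, as a sketch rather than a benchmark: `gcinfo(TRUE)` asks R to print a line each time a collection runs, so allocating a fair amount of memory without ever calling `gc()` should normally produce some output. The sizes and the name `chunks` below are arbitrary choices for illustration, and how many collections you see will vary by machine and session.

```r
# Turn on reporting of garbage collections; the previous setting is returned.
old <- gcinfo(TRUE)

# Allocate about 80 Mb in 20 pieces without calling gc() ourselves.
# Any "Garbage collection ..." lines printed here were triggered by R on its own.
chunks <- lapply(1:20, function(i) numeric(5e5))

# Restore the previous reporting setting.
gcinfo(old)
```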

hadley 2009-09-23 22:02:19

Answer 4

+2 A:

From the help page on gc:

A call of 'gc' causes a garbage collection to take place. This will also take place automatically without user intervention, and the primary purpose of calling 'gc' is for the report on memory usage.

However, it can be useful to call 'gc' after a large object has been removed, as this may prompt R to return memory to the operating system.

So it can be useful to do, but mostly you shouldn't have to. My personal opinion is that it is code of last resort - you shouldn't be littering your code with gc() statements as a matter of course, but if your machine keeps falling over, and you've tried everything else, then it might be helpful.

By everything else, I mean things like

- Writing functions rather than raw scripts, so variables go out of scope.
- Emptying your workspace if you go from one problem to another unrelated one.
- Discarding data/variables that you aren't interested in. (I frequently receive spreadsheets with dozens of uninteresting columns.)
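A small sketch of the first and third points, with invented names (`mean_of_interest`, `wide_table`, and the column names are all hypothetical): the intermediate data frame lives only inside the function, and the unwanted column is dropped as soon as possible.

```r
# Hypothetical helper: builds a wide table, keeps only what is needed,
# and returns a small result.
mean_of_interest <- function(n_rows = 1e5) {
  wide_table <- data.frame(
    id = seq_len(n_rows),
    interesting = rnorm(n_rows),
    uninteresting = runif(n_rows)
  )
  wide_table$uninteresting <- NULL   # discard a column we do not care about
  mean(wide_table$interesting)
}

result <- mean_of_interest()

# The data frame was local to the function, so it is not in the calling
# environment and becomes eligible for collection without an explicit rm().
still_there <- exists("wide_table", inherits = FALSE)
```

Only `result`, a single number, survives the call; R can reclaim the rest whenever it next collects.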

Richie Cotton 2009-09-24 13:55:33
