I am taking over some applications from a previous developer. When I run the applications through Eclipse, I see the memory usage and the heap size increase a lot. Upon further investigation, I see that they were creating an object over and over in a loop, among other things.

I started to go through and do some cleanup. But the more I went through, the more questions I had, like "will this actually do anything?"

For example, instead of declaring a variable outside the loop mentioned above and just setting its value in the loop... they created the object in the loop. What I mean is:

for(int i=0; i < arrayOfStuff.size(); i++) {
    String something = (String) arrayOfStuff.get(i);
    ...
}

versus

String something = null;
for(int i=0; i < arrayOfStuff.size(); i++) {
    something = (String) arrayOfStuff.get(i);
}

Am I incorrect to say that the bottom loop is better? Perhaps I am wrong.

Also, what if I set "something" back to null after the second loop above? Would that clear out some memory?

In either case, what are some good memory management best practices I could follow that will help keep my memory usage low in my applications?

Update:

I appreciate everyone's feedback so far. However, I was not really asking about the above loops (although on your advice I did go back to the first loop). I am trying to get some best practices that I can keep an eye out for, something along the lines of "when you are done using a Collection, clear it out". I just really need to make sure these applications do not take up as much memory.

+16  A: 

Don't try to outsmart the VM. The first loop is the suggested best practice, both for performance and maintainability. Setting the reference back to null after the loop will not guarantee immediate memory release. The GC will do its job best when you use the minimum scope possible.

Books which cover these things in detail (from the user's perspective) are Effective Java 2 and Implementation Patterns.

If you care to find out more about performance and the internals of the VM, watch the talks or read the books by Brian Goetz.

cherouvim
+5  A: 

Those two loops are equivalent except for the scope of something; see this question for details.

General best practices? Umm, let's see: don't store large amounts of data in static variables unless you have a good reason. Remove large objects from collections when you're done with them. And oh yes, "Measure, don't guess." Use a profiler to see where the memory is being allocated.

Michael Myers
+1 solely for "Measure, don't guess."
matt b
+2  A: 

The two loops will use basically the same amount of memory; any difference would be negligible. "String something" only creates a reference to an object, not a new object in itself, and thus any additional memory used is small. Plus, the compiler, combined with the JVM, will likely optimize the generated code anyway.

For memory management practices, you should really try to profile your memory better to understand where the bottlenecks actually are. Look especially for static references that point to a big chunk of memory, since anything reachable from a static reference will not get collected while that reference is held.

You can also look at Weak References and other specialized memory management classes.
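
A minimal sketch of what a weak reference does, using java.lang.ref.WeakReference. The class and method names here are made up for illustration. It only demonstrates the deterministic half of the behavior: while a strong reference still exists, get() returns the object. When the collector actually clears a weakly reachable object is up to the JVM, so the sketch does not try to show that.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    // Returns true if the weak reference still yields the object
    // while a strong reference to it is held.
    static boolean reachableWhileStronglyHeld() {
        byte[] big = new byte[1024];                  // strong reference
        WeakReference<byte[]> ref = new WeakReference<>(big);
        // `big` is still strongly reachable here, so get() must return it.
        return ref.get() == big;
        // Once no strong reference remains, the GC is allowed (but not
        // required at any particular time) to clear `ref`, after which
        // get() returns null. Callers must always handle that null.
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld());
    }
}
```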

Lastly, keep in mind that if an application takes up memory, there might be a reason for it.

Update: The key to memory management is data structures, as well as how much performance you need and when. The tradeoff is often between memory and CPU cycles.

For example, a lot of memory can be occupied by caching, which is specifically there to improve performance since you are trying to avoid an expensive operation.

So think through your data structures and make sure you don't keep things in memory for longer than you have to. If it's a web app, avoid storing a lot of data into the session variable, avoid having static references to huge pools of memory, etc.
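
One way to keep a cache from holding things longer than needed is to cap its size. The following is a sketch only, assuming a least-recently-used policy is acceptable for your workload; BoundedCache and maxEntries are hypothetical names. It relies on LinkedHashMap's access-order mode and its removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-capped cache: once more than maxEntries are stored, the least
// recently accessed entry is evicted on the next insertion.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // true = iterate in access order (LRU)
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true drops the eldest entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");           // "a" becomes most recently used
        cache.put("c", 3);        // evicts "b", the least recently used
        System.out.println(cache.keySet());
    }
}
```

The cap trades memory for CPU: evicted values have to be recomputed or reloaded when they are needed again.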

Jean Barmash
+4  A: 

The first loop is better because:

  • the variable something goes out of scope sooner (in theory)
  • the program is easier to read.

But from a memory standpoint this is irrelevant.

If you have memory problems then you should profile where it is consumed.

Horcrux7
+7  A: 

No objects are created in either of your code samples. You merely set an object reference to a string that is already in the arrayOfStuff. So memory-wise there is no difference.
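
A small sketch to check this, assuming arrayOfStuff is a list that already holds String objects. ReferenceDemo and sameObjects are illustrative names, not from the question. Both loop variants copy a reference out of the list, so an identity comparison against the stored element succeeds in each case.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceDemo {
    // Returns true if, in both loop styles, the local variable refers to
    // the very same object that is stored in the list (no copy is made).
    static boolean sameObjects(List<Object> arrayOfStuff) {
        // Variant 1: variable declared inside the loop.
        for (int i = 0; i < arrayOfStuff.size(); i++) {
            String something = (String) arrayOfStuff.get(i);
            if (something != arrayOfStuff.get(i)) return false;
        }
        // Variant 2: variable declared once, outside the loop.
        String something = null;
        for (int i = 0; i < arrayOfStuff.size(); i++) {
            something = (String) arrayOfStuff.get(i);
            if (something != arrayOfStuff.get(i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Object> arrayOfStuff = new ArrayList<>();
        arrayOfStuff.add("first");
        arrayOfStuff.add(new String("second"));
        System.out.println(sameObjects(arrayOfStuff)); // prints true
    }
}
```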

Kees de Kooter
+1  A: 

Well, the first loop is actually better, because the scope of something is smaller. Regarding memory management, it does not make a big difference.

Most Java memory problems come when you store objects in a collection but forget to remove them. Otherwise the GC does its job quite well.
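
A sketch of that pattern, with hypothetical names (JobRegistry, start, finish). As long as an entry stays in the map, everything it references stays reachable; removing it when the work is done is what lets the GC reclaim it.

```java
import java.util.HashMap;
import java.util.Map;

class JobRegistry {
    // Every value in this map is strongly reachable for as long as
    // the registry itself is reachable.
    private final Map<String, byte[]> activeJobs = new HashMap<>();

    void start(String id) {
        activeJobs.put(id, new byte[1024]); // stand-in for per-job state
    }

    void finish(String id) {
        // Forgetting this call is the typical leak: the entry, and the
        // buffer it holds, would never become eligible for collection.
        activeJobs.remove(id);
    }

    int activeCount() {
        return activeJobs.size();
    }

    public static void main(String[] args) {
        JobRegistry registry = new JobRegistry();
        registry.start("a");
        registry.start("b");
        registry.finish("a");
        System.out.println(registry.activeCount()); // prints 1
    }
}
```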

siddhadev
+1  A: 

The first example is fine. There isn't any memory allocation going on there, other than a stack variable allocation and deallocation each time through the loop (very cheap and quick).

The reason is that all that is being 'allocated' is a reference, which is a 4 byte stack variable (on most 32 bit systems anyway). A stack variable is 'allocated' by adding to a memory address representing the top of the stack, and so is very quick and cheap.

What you need to be careful of is for loops like:

for (int i = 0; i < some_large_num; i++)
{
   String something = new String();
   //do stuff with something
}

as that is actually doing memory allocation.
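
To make the difference visible, here is a rough sketch (AllocationDemo and its methods are made-up names) that counts distinct object identities. Using new String() in the loop produces a new object on every pass, whereas assigning a string literal reuses one object.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class AllocationDemo {
    // Counts how many distinct objects a loop using `new String()` creates.
    static int distinctObjects(int iterations) {
        Set<Object> seen =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        for (int i = 0; i < iterations; i++) {
            String something = new String(); // fresh heap object each pass
            seen.add(something);             // set compares by identity
        }
        return seen.size();
    }

    // Same loop, but assigning a literal: no allocation per iteration.
    static int distinctLiterals(int iterations) {
        Set<Object> seen =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        for (int i = 0; i < iterations; i++) {
            String something = "";           // same interned object each pass
            seen.add(something);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(1000));  // prints 1000
        System.out.println(distinctLiterals(1000)); // prints 1
    }
}
```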

workmad3
There is not even the allocation on the stack for each iteration. These are allocated in the method local variable table (one time) yet the variable automatically goes out-of-scope when the loop ends.
Kevin Brock
yeah, but a new String object is created each iteration which is more expensive than a stack allocation.
workmad3
+4  A: 

The JVM is best at freeing short-lived objects. Try not to allocate objects you don't need. But you can't optimize the memory usage until you understand your workload, the object lifetime, and the object sizes. A profiler can tell you this.

Finally, the #1 thing you must avoid doing: never use Finalizers. Finalizers interfere with garbage collection, since the object can't be just freed but must be queued for finalization, which may or may not occur. It's best to never use finalizers.

As for the memory usage you're seeing in Eclipse, it's not necessarily relevant. The GC will do its job based on how much free memory there is. If you have lots of free memory you might not see a single GC before the app is shut down. If you find your app running out of memory then only a real profiler can tell you where the leaks or inefficiencies are.
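
For a quick look at the numbers from inside the application, the standard Runtime class exposes heap figures. This is only a coarse sketch (HeapSnapshot and usedBytes are illustrative names) and not a substitute for a profiler: "used" includes garbage that has not been collected yet, which is exactly why a growing figure is not necessarily a leak.

```java
class HeapSnapshot {
    // Heap currently occupied, including objects that are already
    // garbage but have not been collected yet.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("used  = " + usedBytes());
        System.out.println("total = " + rt.totalMemory()); // heap currently reserved
        System.out.println("max   = " + rt.maxMemory());   // limit the heap may grow to
    }
}
```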

Mr. Shiny and New
+1  A: 

If you haven't already, I suggest installing the Eclipse Test & Performance Tools Platform (TPTP). If you want to dump and inspect the heap, check out the JDK's jmap and jhat tools. Also see Monitoring and Managing Java SE 6 Platform Applications.

McDowell
+3  A: 

In my opinion, you should avoid micro-optimizations like these. They cost a lot of brain cycles, but most of the time have little impact.

Your application probably has a few central data structures. Those are the ones you should be worried about. For example, if you fill them, preallocate them with a good estimate of the size, to avoid repeated resizing of the underlying structure. This especially applies to StringBuffer, ArrayList, HashMap and the like. Design your access to those structures well, so you don't have to copy a lot.
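
A sketch of preallocation, assuming you know the expected element count up front. PreallocationDemo and its methods are hypothetical, and StringBuilder stands in for StringBuffer since it takes the same capacity argument without the synchronization.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PreallocationDemo {
    static List<Integer> squares(int expected) {
        // Backing array is sized once instead of growing repeatedly.
        List<Integer> out = new ArrayList<>(expected);
        for (int i = 0; i < expected; i++) {
            out.add(i * i);
        }
        return out;
    }

    static String join(List<Integer> values) {
        // Rough guess of 4 characters per value; only an estimate.
        StringBuilder sb = new StringBuilder(values.size() * 4);
        for (Integer v : values) {
            if (sb.length() > 0) sb.append(',');
            sb.append(v);
        }
        return sb.toString();
    }

    static Map<Integer, Integer> indexOf(List<Integer> values) {
        // HashMap resizes when size exceeds capacity * load factor
        // (0.75 by default), so request a bit more than the entry count.
        Map<Integer, Integer> index =
            new HashMap<>((int) (values.size() / 0.75f) + 1);
        for (int i = 0; i < values.size(); i++) {
            index.put(values.get(i), i);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Integer> s = squares(4);
        System.out.println(join(s));           // prints 0,1,4,9
        System.out.println(indexOf(s).get(9)); // prints 3
    }
}
```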

Use the proper algorithms to access the data structures. At the lowest level, like the loop you mentioned, use Iterators, or at least avoid calling .size() all the time. (Yes, you're asking the list every time around for its size, which most of the time doesn't change.) BTW, I've often seen a similar mistake with Maps. People iterate over the keySet() and get each value, instead of just iterating over the entrySet() in the first place. The memory manager will thank you for the extra CPU cycles.
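
The two Map iteration styles side by side, as a sketch (MapIterationDemo is an illustrative name). Both compute the same result; the keySet() version performs an extra lookup per key that the entrySet() version avoids.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MapIterationDemo {
    // Iterates keys, then looks each value up again: one extra get() per entry.
    static int sumViaKeySet(Map<String, Integer> counts) {
        int total = 0;
        for (String key : counts.keySet()) {
            total += counts.get(key);
        }
        return total;
    }

    // Iterates entries directly: key and value arrive together.
    static int sumViaEntrySet(Map<String, Integer> counts) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 1);
        counts.put("b", 2);
        counts.put("c", 3);
        System.out.println(sumViaKeySet(counts));   // prints 6
        System.out.println(sumViaEntrySet(counts)); // prints 6
    }
}
```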

Ronald Blaschke
+2  A: 

As one poster above suggested, use a profiler to measure the memory (and/or CPU) usage of certain parts of your program rather than trying to guess it. You may be surprised at what you find!

There is an added benefit to that as well: you'll understand your programming language and your application better.

I use VisualVM for profiling and recommend it highly. It comes with the JDK distribution.

riz