  1. Can the JVM recover from an OutOfMemoryError without a restart if it gets a chance to run the GC before more object allocation requests come in?

  2. Do the various JVM implementations differ in this aspect?

EDIT: My question was about the JVM recovering and not the user program trying to recover by catching the error. In other words if an OOME is thrown in an application server (jboss/websphere/..) do I have to restart it? Or can I let it run if further requests seem to work without a problem.

Sorry if that wasn't clear.

+1  A: 

The JVM will run the GC when it's on the edge of an OutOfMemoryError. If the GC doesn't free enough memory, the JVM will throw the OOME.

You can, however, catch it and, if necessary, take an alternative path. Any objects allocated inside the try block become eligible for garbage collection once nothing outside the block references them.
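As a rough sketch of the "alternative path" idea: the class and method names here are made up for illustration, and it assumes a HotSpot-style JVM that refuses an array of `Integer.MAX_VALUE` longs outright.

```java
class BufferFallback {
    // Hypothetical helper: try the preferred buffer size first and fall back
    // to a smaller one if the JVM cannot satisfy the request.
    static long[] allocateBuffer(int preferred, int fallback) {
        try {
            return new long[preferred];
        } catch (OutOfMemoryError e) {
            // The failed array was never created, and nothing allocated in the
            // try block is reachable from here, so the GC is free to reclaim it.
            return new long[fallback];
        }
    }

    public static void main(String[] args) {
        // Assumption: Integer.MAX_VALUE longs (roughly 16 GB) is more than the JVM will grant.
        long[] buffer = allocateBuffer(Integer.MAX_VALUE, 1024);
        System.out.println("buffer length: " + buffer.length);
    }
}
```

This only shows that the catch itself works. Whether the rest of the application is still in a sane state afterwards is a separate question.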

Since the OOME is "just" an Error which you could just catch, I would expect the different JVM implementations to behave the same. I can at least confirm from experience that the above is true for the Sun JVM.

See also:

BalusC
A: 

You can increase your odds of recovering from this scenario, although it's not recommended that you try. What you do is pre-allocate some fixed amount of memory on startup that's dedicated to doing your recovery work. When you catch the OOM, null out that pre-allocated reference, and you're more likely to have some memory to use in your recovery sequence.
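Something along these lines, as a minimal sketch. The 1 MB reserve size and the `MemoryReserve` / `runWithReserve` names are assumptions chosen for illustration, not anything standard.

```java
class MemoryReserve {
    // Assumed size: 1 MB held back purely so the recovery path has room to work.
    private static volatile byte[] reserve = new byte[1024 * 1024];

    static boolean reserveHeld() {
        return reserve != null;
    }

    // Runs the task; on OOME, drops the reserve and then runs the recovery step.
    // Returns true if the task completed, false if recovery was needed.
    static boolean runWithReserve(Runnable task, Runnable recovery) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError e) {
            reserve = null;   // make the pre-allocated block collectable
            recovery.run();   // e.g. log, flush state, shut down cleanly
            return false;
        }
    }

    public static void main(String[] args) {
        boolean completed = runWithReserve(
            () -> {
                // Assumption: this request is larger than the JVM will grant.
                long[] huge = new long[Integer.MAX_VALUE];
                System.out.println(huge.length);
            },
            () -> System.err.println("Out of memory, reserve released"));
        System.out.println("task completed: " + completed);
    }
}
```

Note that the reserve is gone after the first use, so this buys one recovery attempt at best.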

I don't know about different JVM implementations.

Amir Afghani
+1  A: 
Yishai
+1 for "any sane JVM".
Gnarly
*"Generally frameworks that run other code, like application servers, attempting to continue in the face of an OME makes sense"*. I disagree. Unless the framework is extremely robust, attempting to recover from an OOME can result in (for example) a catatonic server. Been there, seen that!
Stephen C
@Stephen C, I shudder to think what calling System.exit(1) on any OME in JBoss would look like. Every time a user tried to read too much data, everyone goes down. I agree that it can lead to problems, but the most likely cause of an OME for an app server is user code doing too much, and as long as they catch it at a point where the user code allocations are no longer reachable, full recovery is the most likely outcome and worth coding for, IMO.
Yishai
@Yishai - a bad request (e.g. user tried to read too much data) should not be allowed to cause an OOME in the first place. The correct fix is to make the request processing more defensive ... not to try to recover from OOMEs.
Stephen C
@Stephen C, the author of an application server doesn't have that option.
Yishai
@Yishai - yes he/she does. Just provide a way for ding-bat application developers / deployers to enable dodgy OOME recovery.
Stephen C
@Stephen C, in other words recover from it ;).
Yishai
@Yishai - well *try to* recover from it. As I said in my example, it is difficult to know if an OOME recovery has really worked.
Stephen C
+5  A: 

It may work, but it is generally a bad idea. There is no guarantee that your application will succeed in recovering, or that it will know if it has not succeeded. For example:

  • There really may not be enough memory to do the requested tasks, even after taking recovery steps like releasing a block of reserved memory. In this situation, your application may get stuck in a loop where it repeatedly appears to recover and then runs out of memory again.

  • The OOME may be thrown on any thread. If an application thread or library is not designed to cope with it, this might leave some long-lived data structure in an incomplete or inconsistent state.

  • If threads die as a result of the OOME, the application may need to restart them as part of the OOME recovery. At the very least, this makes the application more complicated.

  • Suppose that a thread synchronizes with other threads using notify/wait or some higher level mechanism. If that thread dies from an OOME, other threads may be left waiting for notifications (etc.) that never come. Designing for this could make the application significantly more complicated.

In summary, designing, implementing and testing an application to recover from OOMEs can be difficult, especially if the application (or the framework in which it runs, or any of the libraries it uses) is multi-threaded. It is a better idea to treat OOME as a fatal error.
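One way to act on "treat OOME as fatal" is an uncaught-exception handler that bails out when a thread dies from an OOME. This is only a sketch: `FatalOomeHandler` is a made-up name, and the exit action is passed in so that it can be swapped out. It also only sees OOMEs that actually propagate out of a thread; anything swallowed by a `catch (Throwable t)` somewhere never reaches it.

```java
class FatalOomeHandler implements Thread.UncaughtExceptionHandler {
    private final Runnable onFatal;

    FatalOomeHandler(Runnable onFatal) {
        this.onFatal = onFatal;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (e instanceof OutOfMemoryError) {
            System.err.println("OOME killed thread " + t.getName() + ", treating as fatal");
            onFatal.run();
        } else {
            System.err.println("Uncaught in " + t.getName() + ": " + e);
        }
    }

    public static void main(String[] args) {
        // Assumption: an external supervisor restarts the process after it halts.
        Thread.setDefaultUncaughtExceptionHandler(
            new FatalOomeHandler(() -> Runtime.getRuntime().halt(1)));
    }
}
```

On recent HotSpot JVMs the `-XX:+ExitOnOutOfMemoryError` option gets a similar effect without any code, if your JVM supports it.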

See also my answer to a related question:

EDIT - in response to the following question:

In other words if an OOME is thrown in an application server (jboss/websphere/..) do I have to restart it?

No, you don't have to restart. But it is probably wise to, especially if you don't have a good / automated way of checking that the service is running correctly.

The JVM will recover just fine. But the application server and the application itself may or may not recover, depending on how well they are designed to cope with this situation. (And my experience is that some are not designed to cope with this.)

Stephen C
A: 

Can it recover? Possibly. Any well-written JVM is only going to throw an OOME after it's tried everything it can to reclaim enough memory to do what you tell it to do. There's a very good chance that this means you can't recover. But...

It depends on a lot of things. For example if the garbage collector isn't a copying collector, the "out of memory" condition may actually be "no chunk big enough left to allocate". The very act of unwinding the stack may have objects cleaned up in a later GC round that leave open chunks big enough for your purposes. In that situation you may be able to restart. It's probably worth at least retrying once as a result. But...
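The "retry once" idea might look roughly like this. `RetryOnce` and `allocateWithRetry` are names invented for the sketch, and the `System.gc()` call is only a hint that the JVM is free to ignore.

```java
import java.util.function.Supplier;

class RetryOnce {
    // Runs the allocation; if it fails with an OOME, retries exactly once.
    static <T> T allocateWithRetry(Supplier<T> allocation) {
        try {
            return allocation.get();
        } catch (OutOfMemoryError first) {
            // The stack has unwound by now, so temporaries from the failed
            // attempt are unreachable and a later GC round can reclaim them.
            System.gc();
            return allocation.get(); // a second OOME propagates to the caller
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated allocation that fails on the first attempt only.
        String result = allocateWithRetry(() -> {
            if (attempts[0]++ == 0) {
                throw new OutOfMemoryError("simulated");
            }
            return "allocated";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Retrying exactly once, rather than in a loop, avoids getting stuck repeatedly "recovering" when there genuinely isn't enough memory.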

You probably don't want to rely on this. If you're getting an OOME with any regularity, you'd better look over your server and find out what's going on and why. Maybe you have to clean up your code (you could be leaking or making too many temporary objects). Maybe you have to raise your memory ceiling when invoking the JVM. Treat the OOME, even if it's recoverable, as a sign that something bad has hit the fan somewhere in your code and act accordingly. Maybe your server doesn't have to come down NOWNOWNOWNOWNOW, but you will have to fix something before you get into deeper trouble.
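Before raising the ceiling, it can help to see what the JVM thinks its limits are. A small sketch using the standard `Runtime` methods (the `HeapCeiling` class name is just for illustration):

```java
class HeapCeiling {
    // Maximum heap the JVM will attempt to use (typically governed by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently occupied: what the JVM has claimed minus what is still free.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):  " + maxHeapBytes() / mb);
        System.out.println("used heap (MB): " + usedHeapBytes() / mb);
    }
}
```

If used heap keeps creeping toward the maximum between requests, that points to a leak rather than a ceiling that is merely too low.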

JUST MY correct OPINION
A: 

I'd say it depends partly on what caused the OutOfMemoryError. If the JVM truly is running low on memory, it might be a good idea to restart it, and with more memory if possible (or a more efficient app). However, I've seen a fair number of OOMEs that were caused by allocating 2GB arrays and such. In that case, if it's something like a J2EE web app, the effects of the error should be constrained to that particular app, and a JVM-wide restart wouldn't do any good.
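To illustrate the oversized-array case, here is a toy sketch with one absurd request sitting between two ordinary ones. `RequestGuard` and its methods are hypothetical, and it assumes a HotSpot-style JVM that rejects an `Integer.MAX_VALUE`-element array immediately rather than exhausting the heap first.

```java
import java.util.ArrayList;
import java.util.List;

class RequestGuard {
    // Hypothetical handler: allocates a buffer of whatever size the caller asked for.
    static String handle(int requestedLongs) {
        try {
            long[] buffer = new long[requestedLongs];
            return "ok:" + buffer.length;
        } catch (OutOfMemoryError e) {
            // Only this request fails; nothing it allocated is still reachable.
            return "rejected";
        }
    }

    // Handles each request in turn, so one failure does not stop the rest.
    static List<String> handleAll(int... sizes) {
        List<String> results = new ArrayList<>();
        for (int size : sizes) {
            results.add(handle(size));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(handleAll(16, Integer.MAX_VALUE, 32));
    }
}
```

This is the benign case. It says nothing about an OOME that hits a shared data structure halfway through an update.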

Adam Crume