EDIT: additional info at the end

I've got a (large) Java Swing application that's demonstrating some very strange behavior. If the computer running it is left idle (no mouse or keyboard input) for a sufficiently long time (varies between an hour or so and a couple of days), the Swing display sometimes stops updating totally (we've got, among other things, a clock shown on-screen that stops updating) until the user moves the mouse. Once the mouse has been moved, our application seems to function normally. (Creating and removing a window from another application also causes the display to start updating again; keyboard input doesn't appear to be sufficient.)

We're running Sun's JDK 1.6.0_07 on Linux kernel 2.6.25.14 (I think it's a modified RHEL 4 distro, but I'm not sure offhand) running xorg-x11-server 1.1.1-48.13.e15.

When in this state, the AWT Event Queue is always "Runnable", and in one of a few java2d methods - the most recent example I have is:

at sun.java2d.loops.Blit.Blit (native method)
at sun.java2d.pipe.DrawImage.blitSurfaceData
at sun.java2d.pipe.DrawImage.renderImageCopy
at sun.java2d.pipe.DrawImage.copyImage
at sun.java2d.pipe.DrawImage.copyImage
at sun.java2d.pipe.ValidatePipe.copyImage
at sun.java2d.SunGraphics2D.drawImage
at sun.java2d.SunGraphics2D.drawImage
at <our code>

And the stack trace from GDB for that thread looks like:

in poll()
in XAddConnectionWatch()
in _XRead()
in _XReply()
in XSync()
in X11SD_GetRasInfo()
in Java_sun_java2d_loops_Blit_Blit
in ??

In addition, our application usually has a couple of threads that render to VolatileImages in the background. When in this state, these threads are always RUNNABLE but stuck in calls like:

at sun.java2d.loops.FillRect.FillRect (Native Method)
at sun.java2d.pipe.LoopPipe.fillRect
at sun.java2d.SunGraphics2D.fillRect
at sun.java2d.SunGraphics2D.clearRect
at <our code: rendering to a VolatileImage>

and the GDB stack trace for these threads is:

in pthread_cond_wait@@GLIBC_2.3.2
in Monitor::wait
in GC_locker::jni_lock_slow
in jni_GetPrimitiveArrayCritical
in BufImg_GetRasInfo
in Java_sun_java2d_loops_FillRect_FillRect
in ??

Has anyone seen anything like this before? We're completely stumped, and I'm not even sure what to do next to try and pin down the problem.

EDIT: the problem continues. The AWT Event Queue basically always looks the same, on both the JStack and gdb stack dumps; we've seen this happen with no other threads getting simultaneously stuck as I described initially.
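One way to pin down when the hang starts might be a small watchdog that posts a no-op to the event queue and checks whether it gets processed in time. The sketch below is only an illustration: the class name EdtProbe and its methods are hypothetical, and matching the thread name prefix "AWT-EventQueue" relies on how Sun's JDK names the dispatch thread, which is an implementation detail rather than a documented contract.

```java
import java.awt.EventQueue;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the application described above.
public class EdtProbe {

    /**
     * Posts a no-op event and waits for the event dispatch thread to run it.
     * Must be called from a thread other than the EDT itself.
     */
    public static boolean edtResponds(long timeoutMillis) {
        final CountDownLatch processed = new CountDownLatch(1);
        EventQueue.invokeLater(new Runnable() {
            @Override
            public void run() {
                processed.countDown();
            }
        });
        try {
            return processed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Java-level stack of any thread whose name starts with "AWT-EventQueue" (assumed naming). */
    public static String dumpEdtStack() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getName().startsWith("AWT-EventQueue")) {
                sb.append(t.getName()).append(" (").append(t.getState()).append(")\n");
                for (StackTraceElement frame : e.getValue()) {
                    sb.append("    at ").append(frame).append('\n');
                }
            }
        }
        return sb.toString();
    }
}
```

A background thread could call edtResponds(...) every few seconds and log dumpEdtStack() with a timestamp on the first miss, which would at least show how long the queue had been stuck before anyone noticed.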

Thanks!

+2  A: 

Sorry, I don't work with any of the Java2D stuff so I can't be sure--but the first thing that comes to mind is that one of your background threads is rendering to a live context.

If this is the case, then in very very rare cases, it can collide with the AWT thread (the ONLY thread allowed to actually render), and if this happens, you could easily get results like you are seeing--in fact, that's exactly what I'd expect.

I could be wrong, but even if you think I am, why not experiment with having your background threads actually do their tasks on a WorkerThread--it can't hurt to try it even if it screws up your performance a little during the test.

Bill K
+1 good advice. See also, http://java.sun.com/javase/6/docs/api/javax/swing/SwingWorker.html
trashgod
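A minimal sketch of the SwingWorker approach, assuming the background work is drawing into an off-screen BufferedImage: doInBackground() runs on a worker thread, and done() is called on the event dispatch thread, where the finished image could be handed to a component. The class name OffscreenRenderWorker, the blue fill, and the awaitDelivery helper are made up for illustration and are not from the application in the question.

```java
import java.awt.Color;
import java.awt.EventQueue;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import javax.swing.SwingWorker;

// Hypothetical worker: renders off-screen in the background, delivers on the EDT.
public class OffscreenRenderWorker extends SwingWorker<BufferedImage, Void> {
    private final int width;
    private final int height;
    private final CountDownLatch delivered = new CountDownLatch(1);
    private volatile boolean deliveredOnEdt;

    public OffscreenRenderWorker(int width, int height) {
        this.width = width;
        this.height = height;
    }

    @Override
    protected BufferedImage doInBackground() {
        // Worker thread: only touches an image that no component is displaying yet.
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.BLUE);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose();
        }
        return img;
    }

    @Override
    protected void done() {
        // Event dispatch thread: a real application would store the image
        // from get() on a component here and call repaint().
        deliveredOnEdt = EventQueue.isDispatchThread();
        delivered.countDown();
    }

    /** Demo helper: waits for done() and reports whether it ran on the EDT. */
    public boolean awaitDelivery(long timeoutMillis) {
        try {
            return delivered.await(timeoutMillis, TimeUnit.MILLISECONDS) && deliveredOnEdt;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Typical use would be new OffscreenRenderWorker(w, h).execute() from any thread; the point of the design is that nothing visible is touched until done() runs.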
I don't think this is happening, unless I'm misunderstanding what constitutes a "live context". Is a Graphics from BufferedImage.getGraphics() or VolatileImage.getGraphics() ever a live context?
Sbodd
We have seen problems in our Swing app where you can easily lock up the GUI if you accidentally do some GUI operations off the AWT Event Queue in a background thread. You can get away with it at first, until it randomly locks up later. That is what this sounds like. At least, you should check this avenue for a while. http://java.sun.com/products/jfc/tsc/articles/threads/threads1.html
David I.
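To check this avenue, one option might be a guard that fails loudly whenever GUI code runs off the event dispatch thread, so accidental background-thread calls show up immediately instead of as a rare lockup. This is a rough sketch; EdtCheck, requireEdt, and runOnEdtAndWait are hypothetical names, and only EventQueue.isDispatchThread() and EventQueue.invokeAndWait() are real AWT calls.

```java
import java.awt.EventQueue;
import java.lang.reflect.InvocationTargetException;

// Hypothetical guard for finding GUI calls made off the event dispatch thread.
public final class EdtCheck {
    private EdtCheck() {
    }

    /** Throws if the caller is not on the event dispatch thread. */
    public static void requireEdt(String what) {
        if (!EventQueue.isDispatchThread()) {
            throw new IllegalStateException(what + " called off the EDT, on thread "
                    + Thread.currentThread().getName());
        }
    }

    /** Runs the task on the EDT and blocks until it has finished. */
    public static void runOnEdtAndWait(Runnable task) {
        if (EventQueue.isDispatchThread()) {
            task.run(); // already on the EDT; invokeAndWait would throw here
            return;
        }
        try {
            EventQueue.invokeAndWait(task);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (InvocationTargetException e) {
            // The task itself threw; surface the original cause.
            throw new RuntimeException(e.getCause());
        }
    }
}
```

Putting requireEdt("updateClock") at the top of methods that touch Swing components would make a stray background-thread caller throw with the offending thread's name.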