EDIT: additional info at the end
I've got a (large) Java Swing application that's demonstrating some very strange behavior. If the computer running it is left idle (no mouse or keyboard input) for long enough (anywhere from an hour or so to a couple of days), the Swing display sometimes stops updating entirely until the user moves the mouse; among other things, we show a clock on-screen, and it freezes. Once the mouse has been moved, our application seems to function normally. (Creating and removing a window from another application also causes the display to start updating again; keyboard input doesn't appear to be sufficient.)
We're running Sun's JDK 1.6.0_07 on Linux kernel 2.6.25.14 (I think it's a modified RHEL 4 distro, but I'm not sure offhand) running xorg-x11-server 1.1.1-48.13.el5.
When in this state, the AWT Event Queue thread is always RUNNABLE and sitting in one of a few java2d methods; the most recent example I have is:
at sun.java2d.loops.Blit.Blit (Native Method)
at sun.java2d.pipe.DrawImage.blitSurfaceData
at sun.java2d.pipe.DrawImage.renderImageCopy
at sun.java2d.pipe.DrawImage.copyImage
at sun.java2d.pipe.DrawImage.copyImage
at sun.java2d.pipe.ValidatePipe.copyImage
at sun.java2d.SunGraphics2D.drawImage
at sun.java2d.SunGraphics2D.drawImage
at <our code>
And the stack trace from GDB for that thread looks like:
in poll()
in XAddConnectionWatch()
in _XRead()
in _XReply()
in XSync()
in X11SD_GetRasInfo()
in Java_sun_java2d_loops_Blit_Blit
in ??
In addition, our application usually has a couple of threads that render to VolatileImages in the background. When in this state, these threads are always RUNNABLE but stuck in calls like:
at sun.java2d.loops.FillRect.FillRect (Native Method)
at sun.java2d.pipe.LoopPipe.fillRect
at sun.java2d.SunGraphics2D.fillRect
at sun.java2d.SunGraphics2D.clearRect
at <our code: rendering to a VolatileImage>
and the GDB stack trace for these threads is:
in pthread_cond_wait@@GLIBC_2.3.2
in Monitor::wait
in GC_locker::jni_lock_slow
in jni_GetPrimitiveArrayCritical
in BufImg_GetRasInfo
in Java_sun_java2d_loops_FillRect_FillRect
in ??
Has anyone seen anything like this before? We're completely stumped, and I'm not even sure what to do next to try and pin down the problem.
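One thing that might help narrow it down is a small watchdog that records when the event thread stops making progress and logs its Java stack at that moment, rather than relying on someone noticing the frozen clock. The following is only a minimal sketch under that assumption: `EdtWatchdog` and all of its method names are hypothetical (nothing like it exists in our application yet), and it is written to compile on Java 6.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for spotting a stalled event thread; illustrative only. */
final class EdtWatchdog {
    private final long thresholdMillis;
    private final AtomicLong lastBeat;

    EdtWatchdog(long thresholdMillis, long nowMillis) {
        this.thresholdMillis = thresholdMillis;
        this.lastBeat = new AtomicLong(nowMillis);
    }

    /** Meant to be called periodically from the event thread itself. */
    void beat(long nowMillis) {
        lastBeat.set(nowMillis);
    }

    /** True once more than thresholdMillis have passed since the last beat. */
    boolean isStalled(long nowMillis) {
        return nowMillis - lastBeat.get() > thresholdMillis;
    }

    /** Formats the state and stack of every live thread whose name starts with namePrefix. */
    static String dumpThreads(String namePrefix) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().startsWith(namePrefix)) {
                continue;
            }
            sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

The intended use would be for a `javax.swing.Timer` to call `beat(System.currentTimeMillis())` on the event thread every second or so, while a separate daemon thread polls `isStalled(...)` and writes `dumpThreads("AWT-EventQueue")` plus a timestamp to a log. This only captures the Java frames, so the gdb trace would still need to be taken by hand.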
EDIT: the problem continues. The AWT Event Queue thread basically always looks the same in both the jstack and gdb stack dumps. We've also seen this happen without any other threads being stuck at the same time in the way I described initially.
Thanks!