I use a work manager to synchronize databases from several universities to a core banking system: the sync starts every 5 minutes until it has completed.

But I get this error:

ThreadMonitor W WSVR0605W: Thread "WorkManager.DefaultWorkManager : 1250" (00001891) has been active for 1009570 milliseconds and may be hung. There is/are 2 thread(s) in total in the server that may be hung.

This error causes the database sync to roll back automatically.

I found some documentation here: http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.express.doc/info/exp/ae/ttrb_confighangdet.html

ThreadMonitor monitors the active threads, and once a thread has been active for longer than the N milliseconds set as the alarm threshold, ThreadMonitor logs the error message above. However, all my sync operations take longer than N to complete.

My question is: does ThreadMonitor only report a warning when an active thread runs for more than N milliseconds (i.e., it may be a hung thread), or does it also kill hung threads?

+4  A: 

ThreadMonitor simply monitors threads that have been active beyond a threshold time.

This serves as a warning to the WAS administrators that a thread is taking a long time to process (which might be legitimate or not).

The ThreadMonitor will not kill the thread.

In many cases the work might genuinely take a long time (depending on what it does), so the ThreadMonitor restricts itself to identifying potentially hung threads. The actual job of finding out what the thread is doing (by taking thread dumps and locating the specific thread ID) is left to the administrator.
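To make the "report, don't kill" behaviour concrete, here is a minimal sketch of a watchdog that works the same way in spirit. It is not WebSphere's implementation; the class `HangReporter` and its methods are hypothetical names invented for illustration. It only records when work started and produces warning strings; it never interrupts or stops the thread it watches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; NOT the WebSphere ThreadMonitor source.
final class HangReporter {
    private final long thresholdMillis;
    // Start time (in milliseconds) of each thread that is currently doing work.
    private final Map<Thread, Long> startTimes = new ConcurrentHashMap<>();

    HangReporter(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    void workStarted(Thread t, long nowMillis) {
        startTimes.put(t, nowMillis);
    }

    void workFinished(Thread t) {
        startTimes.remove(t);
    }

    // Returns one warning per thread that has been active beyond the threshold.
    // Note what is absent: no interrupt(), no stop(). The thread keeps running.
    List<String> check(long nowMillis) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<Thread, Long> e : startTimes.entrySet()) {
            long active = nowMillis - e.getValue();
            if (active > thresholdMillis) {
                warnings.add("Thread \"" + e.getKey().getName()
                        + "\" has been active for " + active
                        + " milliseconds and may be hung.");
            }
        }
        return warnings;
    }
}
```

With a 600000 ms threshold, a thread that has been active for 1009570 ms (as in the question) would be reported on every check until it finishes, and it would still be free to complete its work.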

The threshold time can be configured for your servers if you want a value other than the default.
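As a rough sketch of what that configuration looks like: the hang detection policy is set through custom properties on the application server, with values in seconds. The property names below are the ones I recall from the hang detection documentation linked in the question, and the values are purely illustrative, so check both against that page for your WAS version.

```
# Illustrative values only; set as custom properties on the application server.
# How often (seconds) the monitor checks the managed threads.
com.ibm.websphere.threadmonitor.interval=180
# How long (seconds) a thread may be active before it is reported as possibly hung.
com.ibm.websphere.threadmonitor.threshold=1800
# Number of false alarms after which the threshold is adjusted automatically.
com.ibm.websphere.threadmonitor.false.alarm.threshold=100
```

Raising the threshold only changes when the WSVR0605W warning is logged; it does not change how long the thread is allowed to run.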

HTH Manglu
