views:

314

answers:

6

Our Tomcat web application feels slow when it is used by a couple hundred users. The servers are in a hosting company and their reports doesn't show any problem with bandwith or cpu workload, so I suspect that the reason of the slowdowns can be because of contention on some legacy code that we encapsulated under synchronized calls because it was the easier path.

I've done some artificial tests in the developement environment changing the synchronized calls with a ThreadLocal solution and it becomes faster, but I know that my boss will require me some evidence that it will also be faster in production.

How can I know for sure if thread contention is a problem in my app?

A: 

I could add logging in our synchronized calls like this

//...
long t0 = System.currentTimeMillis();
synchronized(lockObj){
    logger.info("T sync :" + (t0 - System.currentTimeMillis()));
    //...
}

but this feels cheap and dirty.

Serhii
You should use nanoTime(), not currentTimeMillis(), if you want to get significant figures.
Nicolas Simonet
Be aware that this kind of logging >>could<< alter your system's behaviour; e.g. if a call to System.currentTimeMillis() or logger.info(...) causes a context switch.
Stephen C
+3  A: 

If you have a modified version you think is faster, test it by using some load tester (e.g. JMeter) to test both versions. If there is a significant difference, you'll have the results to prove it.

Rich Seller
Thanks, but we don't have a proper integration environment with the same setup as production to run this kind of tests. Runing JMeter tests in developement will be the same as the tests I've already done.
Serhii
+1  A: 

Thare are a bunch of open source java profilers at your disposal, as well as others that might cost money, like YourKit. You should run tests with both the existing code and your enhanced code. The work with ThreadLocals should reduce contention in general, but consider that it is also good to do the benchmarking before you begin optimization.

Another pretty simple test that can be done without setting up any profiler is to take a few thread dumps (ctrl-break or kill -QUIT) while the app appears slow. Spotting a few threads waiting on similar or same monitors over a short peroid of time might point out the slow spots pretty clearly. You can use tools like TDA, a Java thread dump analyzer to help you comb through the thread dump.

Again, doing this work before you begin optimizing is a good idea. This is because, although there may be some obvious places where optimizations could make a difference, actual user behavior might trigger paths that the developer doesnt consider, and these might turn out to be the real problem areas.

akf
I've profiled parts of the application before with Eclipse TPTP and Netbeans, but profiling the entire app becomes very slow so I've never tried to profile while doing a load test because it will take hours, if it doesn't crash.
Serhii
Yeah, It can be laborious work trying to put a large app under a profiler. The tool Michael Borgwardt suggests looks promising.
akf
A: 

Your analysis sound reasonable. Can you attach e.g. visualvm (in the JDK) to the processes so you can see where the time is spent?

Thorbjørn Ravn Andersen
+7  A: 
Michael Borgwardt
+1  A: 
jstack PID

will print out a listing of the status of the JVM with process id PID, along with information on thread statuses.

sample output (excerpt):

"AWT-XAWT" daemon prio=10 tid=0x0000000000e5f800 nid=0x476d runnable [0x00007f1a75616000..0x00007f1a75616bf0]
   java.lang.Thread.State: RUNNABLE
    at sun.awt.X11.XToolkit.waitForEvents(Native Method)
    at sun.awt.X11.XToolkit.run(XToolkit.java:543)
    at sun.awt.X11.XToolkit.run(XToolkit.java:518)
    at java.lang.Thread.run(Thread.java:636)

"Java2D Disposer" daemon prio=10 tid=0x0000000000d8b800 nid=0x476c in Object.wait() [0x00007f1a759df000..0x00007f1a759dfc70]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00007f1a82e2c3f8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x00007f1a82e2c3f8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at sun.java2d.Disposer.run(Disposer.java:143)
    at java.lang.Thread.run(Thread.java:636)

I'd also try to isolate the resource under contention. e.g. if the legacy library is locking to syncrhonize writes to a database maybe you'd miniimize the writes.

Steve B.