I have some old Java code for a REST service that uses a separate thread for every incoming request. I.e., the main loop would loop on socket.accept() and hand off the socket to a Runnable, which would then start up its own background thread and invoke run on itself. This worked admirably well for a while, until recently I noticed that the lag from accept to processing the request would get unacceptable under high load. When I say admirably well, I mean that it was handling 100-200 requests a second without significant CPU usage. The performance only degraded when other daemons were adding load as well, and then only once load exceeded 5. When the machine was under high load (5-8) from a combination of other processes, the time from accept to processing would get ridiculously high (500ms to 3000ms) while the actual processing stayed sub-10ms. This is all on dual-core CentOS 5 systems.
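For reference, the thread-per-request loop looked roughly like this (a minimal sketch, not the actual code; the class names, the port, and the empty handler body are all stand-ins):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerRequestServer {

    // Stand-in for the real Runnable that processes one request.
    static class RequestHandler implements Runnable {
        private final Socket client;

        RequestHandler(Socket client) {
            this.client = client;
        }

        public void run() {
            try {
                // ... read the request, do the sub-10ms processing, write the response ...
            } finally {
                try {
                    client.close();
                } catch (IOException ignored) {
                    // a dropped connection shouldn't take the server down
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(8080); // port is an assumption
        while (true) {
            Socket client = server.accept();                // block for the next request
            new Thread(new RequestHandler(client)).start(); // one fresh thread per request
        }
    }
}
```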
Having been used to thread pools on .NET, I assumed that thread creation was the culprit, and I thought I'd apply the same pattern in Java. Now my Runnable is executed with a ThreadPoolExecutor (and the pool uses an ArrayBlockingQueue). Again, it works great under most scenarios, but when the machine load gets high, the time from creating the Runnable until run() is invoked exhibits about the same ridiculous timing. But worse, the system load nearly doubled (10-16) with the thread pool logic in place. So now I get the same latency problems with double the load.
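The pooled version hands the same kind of Runnable to a ThreadPoolExecutor instead. A sketch of the setup, with the pool sizes and queue capacity as placeholder assumptions (I'm not listing the real values), plus a task that records the creation-to-run() lag I'm measuring:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PooledHandoff {

    // Records the creation-to-run() latency quoted above.
    static class TimedTask implements Runnable {
        final long createdNanos = System.nanoTime();
        volatile long lagMillis = -1;
        final CountDownLatch done = new CountDownLatch(1);

        public void run() {
            lagMillis = (System.nanoTime() - createdNanos) / 1000000L;
            // ... actual request processing would go here ...
            done.countDown();
        }
    }

    // Sizes here are assumptions, not the real configuration.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                10, 10,                                  // fixed-size pool (assumed)
                60L, TimeUnit.SECONDS,                   // keep-alive for idle threads
                new ArrayBlockingQueue<Runnable>(1000)); // bounded hand-off queue (assumed capacity)
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool();
        TimedTask task = new TimedTask();
        pool.execute(task);                              // hand-off goes through the queue
        task.done.await();
        System.out.println("queue-to-run lag: " + task.lagMillis + " ms");
        pool.shutdown();
    }
}
```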
My suspicion is that the lock contention on the queue is worse than the start-up cost of a new thread, which involved no locks. Can anyone share their experience of new-thread-per-request vs. a thread pool? And if my suspicion is correct, does anyone have an alternative approach to a thread pool without lock contention?
I'd be tempted to just make the whole system single-threaded, since I don't know how much my threading helps and IO doesn't seem to be an issue, but I do get some long-lived requests that would then block everything.
thanks, arne
UPDATE: I switched over to Executors.newFixedThreadPool(100);
and while it maintained the same processing capacity, load pretty much immediately doubled, and running it for 12 hours showed load staying consistently at 2x. I guess in my case a new thread per request is cheaper.
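For completeness, here's the update's configuration as a sketch (the dummy tasks stand in for the real handler). One thing worth noting: newFixedThreadPool hands work off through an unbounded LinkedBlockingQueue rather than the ArrayBlockingQueue I was using before, so the queuing behavior differs between the two pooled variants:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FixedPoolServer {
    public static void main(String[] args) throws InterruptedException {
        // Up to 100 worker threads, created lazily as tasks arrive;
        // backed by an unbounded LinkedBlockingQueue.
        ExecutorService pool = Executors.newFixedThreadPool(100);

        // Stand-in for the accept loop: submit a batch of dummy requests.
        final CountDownLatch done = new CountDownLatch(100);
        for (int i = 0; i < 100; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    done.countDown(); // real code would run the request handler here
                }
            });
        }
        done.await(5, TimeUnit.SECONDS);
        pool.shutdown();
    }
}
```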