views:

307

answers:

3

Conventional wisdom tells us that high-volume enterprise Java applications should use thread pooling in preference to spawning new worker threads. The use of java.util.concurrent makes this straightforward.

There do exist situations, however, where thread pooling is not a good fit. The specific example which I am currently wrestling with is the use of InheritableThreadLocal, which allows ThreadLocal variables to be "passed down" to any spawned threads. This mechanism breaks when using thread pools, since the worker threads are generally not spawned from the request thread, but are pre-existing.
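To make the failure mode concrete, here is a minimal sketch of the behaviour described above. The class name `InheritableDemo` and the `USER` variable are made up for illustration; only `InheritableThreadLocal` and the `java.util.concurrent` calls are standard API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InheritableDemo {
    // Hypothetical per-request value; not from the original question.
    static final InheritableThreadLocal<String> USER = new InheritableThreadLocal<>();

    // Reads USER from a thread spawned by the calling thread.
    static String readInNewThread() throws InterruptedException {
        String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = USER.get());
        child.start();
        child.join();
        return seen[0];
    }

    // Reads USER from a worker thread owned by the given pool.
    static String readInPool(ExecutorService pool) throws Exception {
        Callable<String> task = USER::get;
        return pool.submit(task).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Force the single worker to be created before any value is set,
            // which mimics a pool that pre-dates the request.
            pool.submit(() -> { }).get();

            USER.set("alice");
            System.out.println("new thread sees:    " + readInNewThread()); // alice
            System.out.println("pooled thread sees: " + readInPool(pool));  // null
        } finally {
            pool.shutdown();
            USER.remove();
        }
    }
}
```

The value is copied from parent to child when the child `Thread` is constructed, so a worker that already exists never picks up values set later by the request thread.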

Now there are ways around this (the thread locals can be explicitly passed in), but this isn't always appropriate or practical. The simplest solution is to spawn new worker threads on demand, and let InheritableThreadLocal do its job.

This brings us back to the question - if I have a high volume site, where user request threads are spawning off half a dozen worker threads each (i.e. not using a thread pool), is this going to give the JVM a problem? We're potentially talking about a couple of hundred new threads being created every second, each one lasting less than a second. Do modern JVMs optimize this well? I remember the days when object pooling was desirable in Java, because object creation was expensive. This has since become unnecessary. I'm wondering if the same applies to thread pooling.

I'd benchmark it, if I knew what to measure, but my fear is that the problems may be more subtle than can be measured with a profiler.
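As a starting point for "what to measure", a rough sketch like the following compares the wall-clock time of running a batch of trivial tasks on fresh threads against running them on a small pool. The class name `SpawnVsPool`, the task count of 200 and the pool size of 6 are assumptions picked to echo the numbers in the question. This is not a rigorous benchmark: it ignores JIT warm-up effects, memory use and behaviour under sustained load, all of which matter here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SpawnVsPool {
    // Runs `tasks` copies of `work`, each on a brand-new thread.
    // Returns the elapsed time in nanoseconds.
    static long timeSpawn(int tasks, Runnable work) throws InterruptedException {
        long start = System.nanoTime();
        List<Thread> threads = new ArrayList<>(tasks);
        for (int i = 0; i < tasks; i++) {
            Thread t = new Thread(work);
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            t.join();
        }
        return System.nanoTime() - start;
    }

    // Runs `tasks` copies of `work` on a fixed pool of `poolSize` threads.
    // The pool creates its workers lazily, so that cost is included in the timing.
    static long timePool(int tasks, int poolSize, Runnable work) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>(tasks);
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(work));
            }
            for (Future<?> f : futures) {
                f.get();
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger counter = new AtomicInteger();
        Runnable work = counter::incrementAndGet; // stand-in for real work
        // Repeat a few rounds so the later ones are less affected by warm-up.
        for (int round = 0; round < 5; round++) {
            long spawn = timeSpawn(200, work);
            long pooled = timePool(200, 6, work);
            System.out.printf("round %d: spawn=%d us, pool=%d us%n",
                    round, spawn / 1000, pooled / 1000);
        }
    }
}
```

For numbers worth trusting, the same two methods would be better driven by a proper harness and a realistic workload, with memory and GC activity watched alongside elapsed time.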

Note: the wisdom of using thread locals is not the issue here, so please don't suggest that I not use them.

+4  A: 

First of all, this will of course depend very much on which JVM you use. The OS will also play an important role. Assuming the Sun JVM (Hm, do we still call it that?):

One major factor is the stack memory allocated to each thread, which you can tune using the -Xss JVM parameter (for example, -Xss256k) - you'll want to use the lowest value you can get away with.
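Besides the JVM-wide flag, `java.lang.Thread` has a constructor that accepts a requested stack size for a single thread. The sketch below uses it; the class name `SmallStackThread`, the thread name and the 256 KB figure are assumptions for illustration. The Javadoc describes the value as a suggestion that the platform may round or ignore, so treat it as a hint rather than a guarantee.

```java
class SmallStackThread {
    // Starts `work` on a new thread, asking for roughly `stackSizeBytes` of stack.
    static Thread spawn(Runnable work, long stackSizeBytes) {
        // A null ThreadGroup means "use the default group for the current thread".
        Thread t = new Thread(null, work, "small-stack-worker", stackSizeBytes);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = spawn(
                () -> System.out.println("running on " + Thread.currentThread().getName()),
                256 * 1024); // assumed value; tune for your own call depth
        t.join();
    }
}
```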

And this is just a guess, but I think "a couple of hundred new threads every second" is definitely beyond what the JVM is designed to handle comfortably. I suspect that a simple benchmark will quickly reveal quite unsubtle problems.

Michael Borgwardt
I find the notion of what `new Thread()` means to be an interesting one. In a modern JVM, `new Object()` doesn't always allocate new memory, it reuses previously garbage-collected objects. I wonder if there's any reason why the JVM couldn't have a hidden, internal pool of reusable threads, so that `new Thread()` doesn't necessarily create a new kernel thread. You'd get effective thread-pooling, without needing an API for it.
skaffman
If this is so, it should be found in some JSR. Might be 133 http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf
Bozho
+1  A: 
  • for your benchmark you can use JMeter + a profiler, which should give you a direct overview of the behaviour in such a heavily loaded environment. Just let it run for an hour and monitor memory, CPU, etc. If nothing breaks and the CPU(s) don't overheat, it's ok :)

  • perhaps you can get a thread-pool, or customize (extend) the one you are using, adding some code to set the appropriate InheritableThreadLocals each time a Thread is acquired from the thread-pool. Each Thread has these package-private fields:

    /* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
    ThreadLocal.ThreadLocalMap threadLocals = null;
    
    
    /*
     * InheritableThreadLocal values pertaining to this thread. This map is
     * maintained by the InheritableThreadLocal class.  
     */ 
    ThreadLocal.ThreadLocalMap inheritableThreadLocals = null;
    

    You can use these (well, with reflection) in combination with Thread.currentThread() to get the desired behaviour. However, this is a bit ad hoc, and furthermore, I can't tell whether it (with the reflection) will introduce even bigger overhead than just creating the threads.
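A variant of the "extend the pool" idea that avoids reflection is to capture the value in the submitting thread and re-install it in the worker around each task. The following is only a sketch under that assumption: `ContextPropagatingPool` and `REQUEST_ID` are hypothetical names, and it handles a single known thread local rather than copying the whole internal map shown above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ContextPropagatingPool extends ThreadPoolExecutor {
    // Hypothetical request-scoped value to carry across to the workers.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    ContextPropagatingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    public void execute(Runnable task) {
        // This line runs in the submitting (request) thread.
        final String captured = REQUEST_ID.get();
        super.execute(() -> {
            // Everything in here runs in the pooled worker thread.
            String previous = REQUEST_ID.get();
            REQUEST_ID.set(captured);
            try {
                task.run();
            } finally {
                // Restore the worker's earlier state so values do not leak between tasks.
                if (previous == null) {
                    REQUEST_ID.remove();
                } else {
                    REQUEST_ID.set(previous);
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ContextPropagatingPool pool = new ContextPropagatingPool(2);
        try {
            REQUEST_ID.set("req-42");
            Callable<String> task = REQUEST_ID::get;
            // submit() delegates to execute(), so the wrapper applies here too.
            System.out.println(pool.submit(task).get()); // prints "req-42"
        } finally {
            pool.shutdown();
            REQUEST_ID.remove();
        }
    }
}
```

The cost per task is one extra `get`/`set` pair, which is likely to be small next to creating a thread, though that is worth confirming by measurement.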

Bozho
The transcription of threadlocals is something I did consider. In my particular case, however, I'm using `@Async` in Spring 3, which decouples the mechanics of the `Callable` from the business logic. It's very cool, but means you don't get access to the executor itself, or the tasks that get created.
skaffman
Did you check whether Spring has some pluggable mechanism for replacing the executor implementation? If not, then to go hacking further, you could try creating a class with the same qualified name as the one where you will eventually put your custom code, and let it be loaded instead of the original one. But that's a last resort.
Bozho
Hmmm, yes, Spring does allow you to specify the executor used for @Async, so yes, there's a way of passing across the threadlocals there, although as you said, it's still going to get pretty ugly.
skaffman
A: 

I am wondering whether it is necessary to spawn new threads on each user request if their typical life-cycle is as short as a second. Could you use some kind of Notify/Wait queue where you spawn a given number of (daemon) threads, and they all wait until there's a task to solve? If the task queue gets long, you spawn additional threads, but not on a 1-1 ratio. It will most likely perform better than spawning hundreds of new threads whose life-cycles are so short.
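For reference, the standard `ThreadPoolExecutor` already behaves roughly this way when given a bounded queue: it keeps a core set of waiting threads and starts extra ones only once the queue is full. The sketch below assumes a core size of 2, a maximum of 4 and a queue capacity of 2; the class and method names are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class GrowingPool {
    // Builds a pool of daemon threads that grows from `core` to `max`
    // only when the bounded task queue is full.
    static ThreadPoolExecutor create(int core, int max, int queueCapacity) {
        ThreadFactory daemonFactory = r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(
                core, max,
                30L, TimeUnit.SECONDS, // idle extra threads are retired after 30 s
                new ArrayBlockingQueue<>(queueCapacity),
                daemonFactory);
    }

    // Submits `tasks` jobs that block until released and returns how many
    // threads the pool had started at that point. Submitting more than
    // max + queueCapacity jobs would be rejected with the default policy.
    static int poolSizeUnderLoad(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int size = pool.getPoolSize();
        release.countDown();
        return size;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(2, 4, 2);
        // 2 jobs occupy the core threads, 2 wait in the queue, 2 force extra threads.
        System.out.println("threads started: " + poolSizeUnderLoad(pool, 6)); // 4
        pool.shutdown();
    }
}
```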

Terje
What you're describing is a thread pool, which I already described in the question.
skaffman
If each Request thread acts as a ThreadPool, I guess I just don't see why you couldn't have a `private ThreadLocal<T> local;` which you instantiate each time the Request thread wakes up, and when processing each worker thread, you use `local.set()` / `local.get()`, but it's likely I misunderstand your problem.
Terje