(sun's) GC scans live objects. the assumption is that there are way more dead objects than live objects in a typical java program runtime. it marks live objects, and dispose the rest.
if you cache a lot of objects, they are all live. and if you have several GBs of such objects, GC is going to waste a lot of time scanning them in vain. long GC pauses can paralyze your application.
cache something just to make it non-garbage is not helping GC.
that's not to say caching is wrong. if you have 15G memory, and your database is 10G, why not cache everything in memory, so responses are lighting fast. note this is to cache something that would otherwise be slow to fetch.
to prevent GC from fruitlessly scanning the 10G cache, the cache must be outside GC's control. For example, use 'memcached" which lives in another process, and has its own cache-optimized GC.
the latest news is Terracotta's BigMemory which is a pure java solution that does similar thing.
an example of thread local pooling is sun's direct ByteBuffer pooling. when we call
channel.read(byteBuffer)
if byteBuffer is not "direct", a "direct" one must be allocated under the hood, used to communicate data with OS. in a network application, such allocations could be very frequent, it seems to be a waste, to discard a just allocated one, and immediately allocate another one in the next statement. sun's engineers, apparently don't trust GC that much, created a thread local pool of "direct" ByteBuffers.