views:

165

answers:

3

My code does the following (just as an example; I spell out the package path java.lang.ref.SoftReference to make clear that it's not my own implementation :-):

...
List<String> someData = new ArrayList<String>();
someData.add("Value1");
someData.add("Value2");
...
java.lang.ref.SoftReference<List<String>> softRef = new java.lang.ref.SoftReference<List<String>>(someData);
...
HttpSession session = request.getSession(true);
session.setAttribute("mySoftRefData", softRef);
...

and later:

...
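@SuppressWarnings("unchecked") // getAttribute() returns Object, so a cast is required
java.lang.ref.SoftReference<List<String>> softRef =
        (java.lang.ref.SoftReference<List<String>>) session.getAttribute("mySoftRefData");
// Call get() only once and keep the result in a strong reference:
// the GC may clear the referent between two separate get() calls.
List<String> someData = (softRef != null) ? softRef.get() : null;
if (someData != null) {
   // do something with it.
}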
...

Are there any disadvantages that I do not see? Thank you!

+1  A: 

The obvious disadvantage is that the list might disappear unpredictably. As the session is garbage collected after it expires anyway, I don't really see a use case for SoftReference. If the list grows large (large enough to justify using a SoftReference), I'd rather suggest different storage (DB, temporary files).

sfussenegger
I'm working with an application that (in 4 managed servers) takes about 5G in session data in peak hours, slowing it down considerably. Most of those data are cached, and can be safely discarded -- if only developers cared...
Vladimir Dyuzhev
@Vladimir: Then it likely rather belongs in request or maybe application scope. If actually in application scope, then use Java cache frameworks like Ehcache or Terracotta.
BalusC
Not at all. Those are all bank users' data. It's just that the backend (HP NonStop) is so busy that it's better to keep all query results in memory, for the (quite frequent) case when the same data are needed again.
Vladimir Dyuzhev
EHCache has no benefit with sticky sessions (all requests of a user go to the same managed server) and requires a more complicated setup (multicast IP).
Vladimir Dyuzhev
(In fact, EHCache would increase total memory consumption in case of non-sticky sessions -- due to duplicated data on every node).
Vladimir Dyuzhev
@Vladimir: I would maybe consider a "local" DB mirror (btw: you can edit comments within 5 mins by the `edit` link (or does it require more reps?)).
BalusC
A local DB is one of the designs considered for the next major release. It does, however, raise the concern of keeping the main DB and the local copy in sync. I'd say a disposable cache is a simpler solution. PTE will show...
Vladimir Dyuzhev
@Vladimir if all data really is specific to a single user session (not only a single user), it seems okay to cache it in the session (okay, but not recommended). Otherwise, as @BalusC mentioned before, I would always use a single cache for all queries (i.e. all sessions). To name some advantages: it avoids duplication of data, simplifies cache invalidation, and is simply mature and comes out of the box with most persistence frameworks.
sfussenegger
(thanks for edit hint :) but I feel some people don't like when comments are changed after they have answered them)
Vladimir Dyuzhev
@sfussenegger I am fully aware of the benefits of distributed/application-level caches, but there are cases where they are not beneficial.
Vladimir Dyuzhev
@Vladimir ever thought about using a distributed cache like EHCache with Terracotta?
sfussenegger
OK, folks. I agree that I simply saw the OP's question within my own context only. Therefore I'll give your answers an upvote, but leave mine up just to add another angle on the problem.
Vladimir Dyuzhev
@sfussenegger (see above ;) )
Vladimir Dyuzhev
Well, if the list disappears, that's totally fine, because I can load it from storage later. That is actually the idea: to *allow* it to disappear if the GC sweeps things away. If the user is active and is using the part of the application that needs this data, great! Chances are that the GC will not pick it up :-) Do you think that keeping the ArrayList in the HttpSession without the SoftReference "wrapper" would be better? I doubt it, simply because it may run into OutOfMemory problems, since the GC will not sweep those lists away. Am I missing something?
alexeypro
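The "allow it to disappear, reload it from storage if it did" idea from the comment above could be wrapped up roughly as follows. This is only a minimal sketch, assuming Java 8 or later; `SoftCached`, `invalidate()` and the loader lambda are hypothetical names that do not come from the question, and the loader merely stands in for the real storage lookup.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: holds a value softly and reloads it once the reference has been cleared. */
final class SoftCached<T> {
    private final Supplier<T> loader;   // assumed to read from the backing storage
    private SoftReference<T> ref = new SoftReference<>(null);

    SoftCached(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if the referent is gone. */
    synchronized T get() {
        T value = ref.get();            // single get(): from here on we hold a strong reference
        if (value == null) {
            value = loader.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }

    /** Clears the reference explicitly; callers see the same effect as after a GC clear. */
    synchronized void invalidate() {
        ref.clear();
    }
}

class SoftCachedDemo {
    public static void main(String[] args) {
        SoftCached<List<String>> cached = new SoftCached<>(() -> {
            // Stand-in for the real storage lookup.
            List<String> data = new ArrayList<>();
            data.add("Value1");
            data.add("Value2");
            return data;
        });
        System.out.println(cached.get());
        cached.invalidate();                // simulate the collector clearing the reference
        System.out.println(cached.get());   // value comes from the loader again
    }
}
```

An instance of such a holder could be stored as the session attribute in place of the bare SoftReference, so that callers never have to deal with a null referent themselves.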
@alexeypro well, to me it just doesn't feel right (yes, that's a valid reason ;) ) to simply rely on the GC to manage your caches. With a centralized cache you have much better control: how many items are cached, which items are discarded, ... If you simply want to cache as many items as possible, I'd really recommend looking at Terracotta with EHCache: its distributed network heap has an unlimited size (it overflows to disk), which could make it a nice choice for this kind of problem. After all, Terracotta uses its DB-relieving capabilities as one of its main selling points.
sfussenegger
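To illustrate what "control over how many items are cached and which are discarded" can mean, here is a small sketch of a size-bounded LRU cache built only on the JDK's `LinkedHashMap`. It is not Ehcache or Terracotta and has none of their features; `BoundedLruCache` and the sample keys are hypothetical names chosen for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache: evicts the least recently used entry once maxEntries is exceeded. */
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true);   // accessOrder = true: iteration order follows access recency
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; true means "drop the eldest entry".
        return size() > maxEntries;
    }
}

class BoundedLruCacheDemo {
    public static void main(String[] args) {
        // Wrapped for thread safety, since one cache would be shared by all sessions.
        Map<String, String> cache =
                Collections.synchronizedMap(new BoundedLruCache<String, String>(2));
        cache.put("user1", "data1");
        cache.put("user2", "data2");
        cache.get("user1");            // touch user1, so user2 becomes the eldest entry
        cache.put("user3", "data3");   // exceeds the limit and evicts user2
        System.out.println(cache.keySet());   // prints [user1, user3]
    }
}
```

Unlike a SoftReference, the eviction rule here is explicit and deterministic, which is the kind of control the comment above is asking for; the trade-off is that the size limit has to be chosen by hand.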
A: 

It's a very good idea to put data that are not 100% needed by the application into a disposable cache. During peak hours they will be discarded, freeing resources for more pressing needs.

Way to go, in short.

Vladimir Dyuzhev
I wouldn't call session scoped caches the way to go.
sfussenegger
Possibly I'm wrong. Care to elaborate?
Vladimir Dyuzhev
That is correct, Vladimir, it's just one part of the application. A user may stay logged in, keeping the session active, but not use *this* particular part of the application. So, with many users, we let the GC keep us out of OutOfMemory problems. Besides, as I mentioned before, there is another storage where the actual list is kept. If I get null, it's no big deal: I just load it from that storage.
alexeypro
@sfussenegger -- I would really like to hear why. It's not that I am insisting on my solution; I am always open to hearing about its disadvantages and pitfalls. Please point them out. That's why I asked this question.
alexeypro
@alexeypro see the ton of comments to my answer
sfussenegger
+1  A: 

If you don't reference the list anywhere else in the code yourself and the JVM has run the garbage collector, then you risk that the list is no longer available from the session (the SoftReference object itself stays in the session, but its get() returns null). The chance is small, smaller than with a weak reference, but it is still there.

I wouldn't do that in a web application. If it is pure session scoped data (e.g. logged-in user, shopping cart, etc.), then just put it in the session scope the normal way. If the session expires or is invalidated, then anything which is not referenced anywhere else will be garbage collected anyway. The session scope is not intended to act as a "soft" cache. Or if it is actually request scoped data, then rather store it in the request scope. Otherwise use another kind of data store.

BalusC
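The difference between soft and weak references mentioned in this answer can be probed with a small experiment. This is only a sketch: `ReferenceStrengthDemo` and `state()` are hypothetical names, and `System.gc()` is merely a hint to the JVM, so the printed result depends on the JVM and its settings and is deliberately not shown here. On a typical JVM with free heap, the weak referent tends to be cleared by a collection while the soft one is kept until memory gets tight.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceStrengthDemo {
    /** Reports whether the referent is still reachable through the given reference. */
    static String state(Reference<?> ref) {
        return ref.get() == null ? "cleared" : "alive";
    }

    public static void main(String[] args) {
        // Neither array has a strong reference after these two lines.
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);

        System.gc();   // a request only; the JVM is free to ignore it

        System.out.println("soft: " + state(soft));
        System.out.println("weak: " + state(weak));
    }
}
```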
There is another data storage layer in use. So if I get null, I just load the list from that storage. My idea is to keep it "wrapped" in a SoftReference so that the GC can clean things up if too many users are logged in and some of them are not using this part of the application but still have an active session.
alexeypro
Have a look at Ehcache or Terracotta.
BalusC