Currently I use a HashMap<Class, Set<Entry>>, which may contain several million short-lived and long-lived objects. (Entry is a wrapper class around an Object and an integer, which is a duplicate count.)

I figured: these Objects are all stored in the JVM's heap anyway. That raised a question: instead of allocating huge amounts of memory for the HashMap, can it be done better, with less memory consumption?

Is there a way to access Objects in the Java Heap indirectly, based on the Class of the Objects?

With "indirectly" I mean: without having a pointer to the Object. With "access" I mean: to retrieve a pointer to the Object in the heap.

A: 

HashMap overhead shouldn't be that large. And I don't think it's possible to rummage around in the heap with the public Java API. The Objects probably won't be there anyway, as they will be collected once there is no reference to them.

What you could do, if the HashMap overhead is too much, is allocate an array, like Object[] or Entry[]. You'll lose the quick access, add and delete possibilities, of course (given that an array is fixed size, it's hard to add items once the array is too small).

When using the array solution you'll have to know beforehand how many entries you will have, or copy the array into a larger array when needed, take null values into account if you allow deletes, and so forth. Which is basically what an ArrayList does.
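
A rough sketch of that array-based idea, assuming you grow by copying into a larger array when full (essentially what ArrayList does internally); the class and method names are made up for illustration:

```java
import java.util.Arrays;

// Minimal append-only object store backed by a plain array; it grows by
// copying into a larger array when full, roughly what ArrayList does.
class ObjectArrayStore {
    private Object[] elements = new Object[16];
    private int size = 0;

    void add(Object o) {
        if (size == elements.length) {
            // Double the capacity and copy the existing references over.
            elements = Arrays.copyOf(elements, elements.length * 2);
        }
        elements[size++] = o;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return elements[index];
    }

    // Deleting leaves a null slot, which callers must account for,
    // as noted above.
    void delete(int index) {
        elements[index] = null;
    }
}
```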

extraneon
+2  A: 

I don't really understand the purpose of your code, but I fear it involves some frequent OutOfMemoryErrors, no?

Anyway.

You can get references to objects that won't prevent those objects from being garbage-collected: SoftReference, WeakReference, and PhantomReference. (An ordinary assignment like myObj = thisObj; creates a strong reference, which does keep the object alive.)

So you could (and should, in fact) use WeakReference to let the GC do its work. However, for dynamic memory exploration there are existing applications, like VisualVM, that use a protocol allowing an external process to query the VM: JVMTI (the successor to the older JVMPI).

I think you really should take a look at this.
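
A minimal sketch of the WeakReference approach mentioned above; the WeakTracker class is a made-up illustration, assuming the goal is to keep track of objects without keeping them alive:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Tracks objects through weak references: the list does not keep its
// objects alive, so entries whose referent has been collected turn up
// as null and can be pruned.
class WeakTracker<T> {
    private final List<WeakReference<T>> refs = new ArrayList<>();

    void track(T obj) {
        refs.add(new WeakReference<>(obj));
    }

    // Returns the objects that are still alive, pruning cleared references.
    List<T> alive() {
        List<T> result = new ArrayList<>();
        Iterator<WeakReference<T>> it = refs.iterator();
        while (it.hasNext()) {
            T obj = it.next().get();   // null once the referent was collected
            if (obj == null) {
                it.remove();
            } else {
                result.add(obj);
            }
        }
        return result;
    }
}
```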

Riduidel
A: 

The map contains only pointers to the objects on the heap. I do not think you can do better than that.

David Soroko
There is a tiny bit of bookkeeping which involves the hash buckets, which should be trivial for anything other than a Map of pure Integers, of course.
extraneon
Right, this should be negligible compared to the memory taken by millions of objects.
David Soroko
I've seen the source code for `HashMap`: it allocates an internal array whose length is a power of 2. If I have 131,073 `Object`s in the `HashMap`, it internally holds an array of 262,144 references. I found this alarming.
Pindatjuh
@pindatjuh if you have 131,073 elements in the Map, roughly 2x as many references are required! There's simply nothing you can do about that. If you allow multiple values for a key, a MultiMap will help reduce the size. But for that amount of data used as a lookup, store it in a database like SQLite and rely on caching to keep frequently accessed values/blocks in memory.
nicerobot
@nicerobot thank you for your suggestion about a database like SQLite; I will research this possibility.
Pindatjuh
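
As an aside, a rough sketch of the capacity math discussed in the comments above, assuming HashMap's documented power-of-two table and the default load factor of 0.75 (the exact resize policy can vary between JDK versions):

```java
// Rough estimate of the internal table length a HashMap ends up with for a
// given entry count: the table is always a power of two and is doubled once
// size exceeds capacity * loadFactor (0.75 by default).
public class HashMapTableSize {

    static int estimatedTableLength(int entries) {
        int capacity = 16;                    // HashMap's default initial capacity
        while (entries > capacity * 0.75) {
            capacity <<= 1;                   // doubling keeps it a power of two
        }
        return capacity;
    }

    public static void main(String[] args) {
        // Matches the figure in the comment above: roughly 2 table slots per entry.
        System.out.println(estimatedTableLength(131_073)); // prints 262144
    }
}
```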
+1  A: 

No. Basically, each object knows its class, but a class does not know all its objects - it's not necessary for the way the JRE works and would only be useless overhead.

Why do you need to know all instances of those classes anyway? Maybe there's a better way to solve your actual problem.

Michael Borgwardt
Good question: I'm working on an implementation of a Lookup (like the NetBeans Platform's Lookup, but more advanced and better-performing).
Pindatjuh
Netbeans' Lookup API is intended for system components. That's a completely different granularity than classes and individual objects, and no realistic system is going to have millions of components.
Michael Borgwardt
Well, that's why I am implementing one myself. I'm using the paradigm of components for developing games. There are a lot of small short-lived objects during rendering (around 10,000, with peaks of a million), because I am designing an agent-based rendering technique for rendering large areas, which will be shrunk and processed in parallel. Thank you for your answer.
Pindatjuh