views:

122

answers:

2

I have a graph structure in java, ("graph" as in "edges and nodes") and I'm attempting to serialise it. However, I get "StackOverflowException", despite significantly increasing the JVM stack size. I did some googling, and apparently this is a well known limitation of java serialisation: that it doesn't work for deeply nested object graphs such as long linked lists - it uses a stack record for each link in the chain, and it doesn't do anything clever such as a breadth-first traversal, and therefore you very quickly get a stack overflow.

The recommended solution is to customise the serialisation code by overriding readObject() and writeObject(), however this seems a little complex to me.

(It may or may not be relevant, but I'm storing a bunch of fields on each edge in the graph so I have a class JuNode which contains a member ArrayList<JuEdge> links;, i.e. there are 2 classes involved, rather than plain object references from one node to another. It shouldn't matter for the purposes of the question).

My question is threefold:
(a) why don't the implementors of Java rectify this limitation or are they already working on it? (I can't believe I'm the first person to ever want to serialise a graph in java)
(b) is there a better way? Is there some drop-in alternative to the default serialisation classes that does it in a cleverer way?
(c) if my best option is to get my hands dirty with low-level code, does someone have an example of graph serialisation java source-code that can use to learn how to do it?

A: 

There is something weird here. Do you mean you have an obejct graph in memory and the fact of calling the serialization code generates the StackOverflowException ? If so, it means your graph involves lots of lazy loading elements that are not propagated to serialization.

In other words, it is clear to me that your application is already optimized due to its massive size. But, unfortunatly, you didn't took advantage of those optimizations in your serialization code, preferring to directly serialize your root object, which involved loading all children and raisong that exception.

I would really strongly suggest you use your existing optimizations by instead implementing the Externalizable interface, which is in fact the same thing than implementing readObject/writeObject, but in a more OO way.

As an aside, to allow yourself an access to your object written data, consider using XMLEncoder/XMLDecoder, which work like serialization, but produces "readable" XML.

Riduidel
I'm not sure how `Externalizable` is more OO, considering what serialisation is. `XMLEncode` I would stay away from.
Tom Hawtin - tackline
The readObject/writeObject pair of method is defined outside any interface. in fact, there is an "implicit" interface where these methods are defined, but it is defined nowhere. Contrary to the Externalizable interface, which clearly defines the contract of readExternal/writeExternal, ensuring that someone who implements it will have to write the required methods, that do exactly the same thing than readObject/writeObject, but in an explicit fashion.
Riduidel
Can you please explain why you would stay away from XMLEncoder ?
Riduidel
YES - "I mean I have an obejct graph in memory and the fact of calling the serialization code generates the StackOverflowException". But NO - I don't use any lazy loading elements. There is about 50Mb worth of data involved...not particularly large.
Tim Cooper
+1  A: 

Although it could be optimised, the Java serialisation spec is fundamentally recursive. You can, and often will, provide a writeObject (and readObject) method. Whilst that is executing, referenced objects have to be written. Even if breadth-first traversal was possible, it wouldn't help.

The Sun/Oracle JDK is open sourced, and is open to contributions.

java.util.LinkedList will have an example of how to efficiently serialise a linked list.

Tom Hawtin - tackline
Why wouldn't breadth-first traversal help? It would solve the stack overflow problem, which is my only problem. Do you mean instead "breadth-first would not be possible according to the spec"?
Tim Cooper
The problem is fundamentally recursive. If you took the recursion bit out and kept state off the program state, then traversal order is irrelevant.
Tom Hawtin - tackline