What is the best approach for serializing Java object graphs?

My requirements for a serialization library are: 1) speed of deserialization, 2) size: as small as possible (smaller than Java's default serialization), 3) flexibility: annotation-based definitions of what has to be serialized would be nice.

The underlying file format is not important.

I looked at Protocol Buffers and XStream, but the former is not flexible enough due to the need for mapping files, and the latter produces big files.

Any help appreciated.

A: 

I think default Java serialisation is going to be pretty small. Can you not usefully restrict what you want to serialise via the transient keyword? That would address your third issue (flexibility and annotations).
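
To illustrate the transient suggestion, here is a minimal sketch using only the JDK. The `Account` class, its fields and the helper names are made up for the example; the point is just that a `transient` field is left out of the stream and comes back as `null`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class for illustration; not taken from the question.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    String owner;                 // written to the stream
    transient String cachedLabel; // skipped by default serialization

    Account(String owner) {
        this.owner = owner;
        this.cachedLabel = "Account of " + owner;
    }
}

class TransientDemo {
    // Serialize with plain java.io serialization into a byte array.
    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read a single object back from the byte array.
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = (Account) fromBytes(toBytes(new Account("alice")));
        System.out.println(copy.owner);       // the regular field survives
        System.out.println(copy.cachedLabel); // the transient field is not restored
    }
}
```

Note that the constructor is not run on deserialization, so anything derived (like the label here) has to be rebuilt lazily or in a `readObject` hook.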

Brian Agnew
+1  A: 

Hessian is one of the most efficient serialization formats.

Its output is about 2-3 times smaller, and it is about 2-3 times faster, than Java Serialization, even when the latter uses Externalizable classes.

Whichever serialization you use, you can use compression fairly easily to make the data more compact.
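
As a rough sketch of the compression point, assuming plain Java serialization and the JDK's GZIP streams (the class and method names are invented for the example), wrapping the stream is all it takes:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipSerializationDemo {
    // Serialize obj, optionally pushing the bytes through GZIP.
    static byte[] toBytes(Serializable obj, boolean gzip) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            OutputStream sink = gzip ? new GZIPOutputStream(bytes) : bytes;
            try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
                out.writeObject(obj);
            } // closing the object stream also finishes the GZIP stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Inverse of toBytes; the caller must pass the same gzip flag.
    static Object fromBytes(byte[] data, boolean gzip) {
        try {
            InputStream source = new ByteArrayInputStream(data);
            if (gzip) {
                source = new GZIPInputStream(source);
            }
            try (ObjectInputStream in = new ObjectInputStream(source)) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add("item-" + (i % 10)); // repetitive data compresses well
        }
        System.out.println("plain:   " + toBytes(items, false).length + " bytes");
        System.out.println("gzipped: " + toBytes(items, true).length + " bytes");
    }
}
```

The same wrapping works with any serializer that writes to an `OutputStream`. How much it saves depends on the data, and it costs CPU time on both ends, so measure it against the deserialization-speed requirement.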

Beyond that you can write your own serialization. I wrote a serializer which writes to/from ByteBuffer which is about twice as fast and half the size of Hessian (about 5x faster/smaller than Java Serialization). This may be too much effort for little gain if existing serializations will do what you need. However, it is as customizable as you like ;)
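
To give a flavour of what hand-written ByteBuffer serialization looks like, here is a minimal sketch. It is not the serializer described above: the `Quote` class and its wire layout are invented for illustration, and it handles one fixed class only, with no object graphs or cyclic references.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical value class with a hand-rolled binary layout:
// int id | double price | unsigned short length | UTF-8 bytes of symbol
final class Quote {
    final int id;
    final double price;
    final String symbol;

    Quote(int id, double price, String symbol) {
        this.id = id;
        this.price = price;
        this.symbol = symbol;
    }

    // Append this object to the buffer at its current position.
    void writeTo(ByteBuffer buf) {
        byte[] sym = symbol.getBytes(StandardCharsets.UTF_8);
        if (sym.length > 0xFFFF) {
            throw new IllegalArgumentException("symbol too long for a 16-bit length");
        }
        buf.putInt(id)
           .putDouble(price)
           .putShort((short) sym.length)
           .put(sym);
    }

    // Read one object from the buffer's current position.
    static Quote readFrom(ByteBuffer buf) {
        int id = buf.getInt();
        double price = buf.getDouble();
        byte[] sym = new byte[buf.getShort() & 0xFFFF]; // undo the signed cast
        buf.get(sym);
        return new Quote(id, price, new String(sym, StandardCharsets.UTF_8));
    }
}

class QuoteDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        new Quote(7, 12.5, "ACME").writeTo(buf);
        System.out.println("bytes written: " + buf.position());

        buf.flip(); // switch the buffer from writing to reading
        Quote copy = Quote.readFrom(buf);
        System.out.println(copy.id + " " + copy.price + " " + copy.symbol);
    }
}
```

No class names or field names go on the wire, which is where most of the size saving comes from. The price is that reader and writer must agree on the layout, and every class needs its own code.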

Peter Lawrey
What kind of a serializer did you write? Does it work for any objects, or do you need to write custom serialization code for each class? Are cyclic object references allowed?
Esko Luontola
It is very much like Hessian. It can serialize any object except those which model real resources outside Java, like Threads, Sockets etc. You can write custom serializers, but as it uses some smart on-the-fly compression, custom serializers tend to be slower!
Peter Lawrey
"Are cyclic object references allowed?" - I have an open source version which doesn't support this, and another version I wrote for work which does. ;)
Peter Lawrey
A: 

For small objects, the Java serialised form is likely to be dominated by the description of the serialised classes.
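
One way to see this for yourself is to compare the size of a stream holding one small object with the extra bytes a second object of the same class adds. This is a sketch only; `Tiny` and the helper are invented names, and the exact byte counts depend on class and field names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class holding just 4 bytes of real data.
class Tiny implements Serializable {
    private static final long serialVersionUID = 1L;

    final int value;

    Tiny(int value) {
        this.value = value;
    }
}

class DescriptorOverheadDemo {
    // Size in bytes of one stream containing `count` distinct Tiny objects.
    static int streamSize(int count) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeObject(new Tiny(i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        int one = streamSize(1);
        int two = streamSize(2);
        System.out.println("stream with one object: " + one + " bytes");
        System.out.println("cost of a second object: " + (two - one) + " bytes");
    }
}
```

Within one stream the class description is written once and later objects refer back to it, so the overhead hurts most when each small object goes into its own stream.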

You may be able to write out serialised data for commonly used classes, and then use that as a common prefix for a series of serialised streams. Note that this is very fragile, and you'll probably want to recompute and check it for each class loader instance.

Tom Hawtin - tackline
A: 

Would http://jserial.sourceforge.net/ suit your needs?

Esko Luontola
From their benchmark results it appears "Bubble" deserialization is *slower* than plain Java 1.4.2 serialization.
Peter Lawrey
In two of the benchmarks it's slower, in others it's faster. It depends on what is being serialized, and 1.4.2 is ancient anyway, so you should benchmark it with your own application and environment to see if it suits you.
Esko Luontola
A: 

I second the note about the usefulness of compression: all formats compress to about the same size, i.e. bigger output compresses more.

Beyond that and the other recommendations, JSON with Jackson works quite well: it is much faster than XML (competitive with PB and Hessian) and a bit more compact, much more flexible than PB, easy to integrate with client-side JS (if that matters), and easy to troubleshoot.

StaxMan