Lately I've been trying to learn more about, and generally test, Java's serialization for both work and personal projects, and I must say that the more I know about it, the less I like it. This may be caused by misinformation, though, so I'm asking you all these two things:
1: At the byte level, how does serialization know how to match serialized values with a class?
One of my problems right here is that I made a small test with an ArrayList containing the values "one", "two", "three". After serialization the byte array was 78 bytes long, which seems like an awful lot for such a small amount of information (19+3+3+5 bytes for the class name plus the three strings). Granted, there's bound to be some overhead, but this leads to my second question:
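For reference, here's a minimal sketch of the kind of test I mean (the class name is just a placeholder):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class SerializationSizeTest {
    public static void main(String[] args) throws IOException {
        List<String> list = new ArrayList<String>();
        list.add("one");
        list.add("two");
        list.add("three");

        // serialize the list into an in-memory buffer instead of a file
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buffer);
        out.writeObject(list);
        out.close();

        // prints 78 bytes for me
        System.out.println(buffer.size() + " bytes");
    }
}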
2: Can serialization be considered a good method for persisting objects at all? Obviously, if I used some homemade XML format, the persisted data would look something like this:
<object class="java.util.ArrayList">
  <!-- the Object[] inside ArrayList is called elementData -->
  <field name="elementData">
    <value>one</value>
    <value>two</value>
    <value>three</value>
  </field>
</object>
which, like XML in general, is a bit bloated and takes 138 bytes (without the whitespace, that is). The same in JSON could be
{
  "java.util.ArrayList": {
    "elementData": [
      "one",
      "two",
      "three"
    ]
  }
}
which is 75 bytes, so already slightly smaller than Java's serialization. With these text-based formats there is, of course, the obvious requirement that your data can be represented as text, numbers, or some combination of the two.
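To see where those 78 bytes go, the serialized stream can be dumped as hex/ASCII. If I understand the format correctly, the stream magic, the class name "java.util.ArrayList", the serialVersionUID and the field metadata are all written before the actual values, which seems to be where most of the overhead comes from. A quick sketch (class name again just a placeholder):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class SerializedStreamDump {
    public static void main(String[] args) throws IOException {
        List<String> list = new ArrayList<String>();
        list.add("one");
        list.add("two");
        list.add("three");

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buffer);
        out.writeObject(list);
        out.close();

        // print each byte as hex plus its printable ASCII character;
        // the class name and the three strings show up as plain text,
        // the rest is stream metadata (magic number, type tags, serialVersionUID, ...)
        for (byte b : buffer.toByteArray()) {
            char c = (b >= 32 && b < 127) ? (char) b : '.';
            System.out.printf("%02x %c%n", b & 0xff, c);
        }
    }
}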
So, to recap: how does serialization work at the byte/bit level, when should it be used and when shouldn't it be, and what are the real benefits of serialization besides the fact that it comes standard in Java?