I have a large amount of data stored in a Collection.
I would like to save this data to a file. Is it a good idea to use Serialization?
Or should I use a custom format to save the data, or save it as XML for example?
(The elements in the Collection are custom classes. Do I need to implement a method that serializes the objects?)

+1  A: 

Either approach works. I would prefer to save the data as XML, because an XML file is less likely to suffer data corruption. If you do want to save your custom classes to a data file using serialization, those classes must implement the `Serializable` interface.
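
As a rough illustration of the serialization route, here is a minimal sketch that writes a list of custom objects to a file and reads it back. The `Person` and `CollectionStore` classes are hypothetical (the question does not show its element type), and the explicit `serialVersionUID` is an assumption in line with the advice in the comments further down.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical element type; the question does not show its classes.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical helper that stores the whole collection as one object graph.
class CollectionStore {
    static void save(List<Person> people, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            // Copy into an ArrayList, which is itself Serializable.
            out.writeObject(new ArrayList<>(people));
        }
    }

    @SuppressWarnings("unchecked")
    static List<Person> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<Person>) in.readObject();
        }
    }
}
```

Every class reachable from the collection has to be `Serializable` for this to work; otherwise `writeObject` throws a `NotSerializableException`.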

nuwan
Andrew Thompson
Why would using serialisation make it more likely to lead to data corruption?? In fact XML limits text to a subset of characters, unless you base64 encode, or similar.
Tom Hawtin - tackline
@Andrew Thompson You are thinking of Swing. Serialisation in AWT wasn't well thought out. `XMLEncoder` and `XMLDecoder` are a hack (look in the source code, and see all the hacks to handle standard classes in other packages).
Tom Hawtin - tackline
@Tom Hawtin. I never mentioned data corruption. My understanding is that the data might be serialized in different forms according to different JVMs. And how did all the discussion of AWT/Swing come into it? I was referring to POJOs.
Andrew Thompson
@Andrew Thompson My original comment refers to the answer. AWT/Swing is, as far as I am aware, the only place in the Java library where serial compatibility is not standardised. However, the computed `serialVersionUID` may differ between compilers (unfortunately the specification for automatic generation includes synthetic methods), hence the recommendation and warnings to declare it explicitly.
Tom Hawtin - tackline
A: 

Do I need to implement a method that serializes the objects?

To serialize an object, its class must implement the `Serializable` interface. In addition, if special handling is required during serialization and deserialization, provide implementations of the following methods:

- private void writeObject(ObjectOutputStream out) throws IOException;
- private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException;
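
A minimal sketch of how these two hooks might be used, assuming a hypothetical `Label` class whose derived field is marked `transient` and rebuilt on read. The `roundTrip` helper is only there to exercise the hooks in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: cachedLength is derived from text, so it is not written out.
class Label implements Serializable {
    private static final long serialVersionUID = 1L;

    private String text;
    private transient int cachedLength; // skipped by default serialization

    Label(String text) {
        this.text = text;
        this.cachedLength = text.length();
    }

    String text() { return text; }
    int cachedLength() { return cachedLength; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient fields
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();       // restores the non-transient fields
        cachedLength = text.length(); // rebuild the derived state
    }

    // Round-trips an object through an in-memory byte array.
    static Label roundTrip(Label original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Label) in.readObject();
        }
    }
}
```

Both methods are `private`; the serialization runtime finds them reflectively, so they are not part of the class's public API.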

You can find more details on serialization on the Oracle site. A good place to start is http://java.sun.com/developer/technicalArticles/Programming/serialization/.

andreas
A: 

I would not write a serialized class to disk. Once the JVM or any libraries change it might be useless. This means a file created on one system may not work on another!

Instead, I'd write a text version of the data. If your collection includes other collections or classes, I'd use XML, as it handles nesting well. If the data is simple, I'd probably just write a comma-separated file with a header line that includes a version number and a description of the data set, a line listing the column names, the data lines, and an EOF line.
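
The comma-separated layout described above could be sketched roughly as follows. The `SimpleCsvWriter` class, the `#version=1` header spelling, and the `#EOF` marker are all assumptions for illustration, and the sketch assumes that values contain no commas, quotes, or newlines.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical writer for the layout: header, column names, data lines, EOF line.
class SimpleCsvWriter {
    static void write(Path file, String description,
                      List<String> columns, List<List<String>> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writeLine(out, "#version=1," + description); // assumed header format
            writeLine(out, String.join(",", columns));   // column names
            for (List<String> row : rows) {
                writeLine(out, String.join(",", row));   // one data line per row
            }
            writeLine(out, "#EOF");                      // assumed end-of-file marker
        }
    }

    private static void writeLine(BufferedWriter out, String line) throws IOException {
        out.write(line);
        out.newLine();
    }
}
```

A reader can then check the version in the first line before parsing, and treat a missing EOF line as a sign that the file was truncated.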

Tony Ennis
"This means a file created on one system may not work on another!" What??
Tom Hawtin - tackline
That's right. Serialized data (not XML'ed or what have you, but serialized) depends on the individual jar versions on each system. Suppose you serialize a class that contains an int. Then it gets deserialized on a system where that class has two ints. What happens? I ran into this very thing years ago when I was using RMI to pass serialized objects from system to system.
Tony Ennis
Here's an SO thread on this very thing. I can't remember the details now, it's been 10 years or more, but 'serialVersionUID' rings a bell. http://stackoverflow.com/questions/1576703/java-rmi-marshalexception-failed-to-communicate
Tony Ennis
@Tom Hawtin I just learned if you "@" someone's nickname they get a notification :-)
Tony Ennis
You do need to read the Object Versioning section of the Object Serialization Specification. It covers all this and much more. Serialization depends on the `serialVersionUID` of every affected class: you need to control these, and also control what kinds of versioning changes happen to them over time. What happens in the specific case cited here is that the new field takes on its default value, which is perfectly reasonable.
EJP
@Tony Ennis Yes, you need a `serialVersionUID`. Your compiler will warn you; your editor will warn you; your colleagues will warn you. Like XML formats, serialisation should be designed. It's not a matter of slapping `implements java.io.Serializable` on a class and calling it done.
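
For illustration only, a minimal sketch of an explicitly declared `serialVersionUID`, with a check of the value the serialization runtime will use. The `Account` and `VersionCheck` classes and the value `42L` are hypothetical.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with an explicitly declared version identifier.
class Account implements Serializable {
    private static final long serialVersionUID = 42L;

    String owner;
    int balance;
}

class VersionCheck {
    // Returns the serialVersionUID the serialization runtime uses for Account.
    static long declaredVersion() {
        return ObjectStreamClass.lookup(Account.class).getSerialVersionUID();
    }
}
```

With the field declared, the value no longer depends on how a particular compiler generates the class, so compatible changes such as adding a field do not change it.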
Tom Hawtin - tackline
Fair enough, there's always something more to learn. Oh, and my colleagues did NOT warn me, sadly, heh.
Tony Ennis
@Tony Ennis Make sure your colleagues' other colleague warns them. :)
Tom Hawtin - tackline
@Tom Hawtin - sage advice!
Tony Ennis