How can I store an object to disk in all its glory? My object is derived from TObjectList, so it holds other objects.

Which is the fastest and easiest way? Which is the compatible way?

Serialization IS NOT a solution since I also want to save non-public properties and the list of objects it holds!

At the moment I am trying to save every object independently as a binary file; the files are then packed together. This is a lengthy process, but it lets a newer version of the program load old versions of the object (compatibility with previously saved projects). However, the complexity has started to grow and it no longer looks good.

A: 

You need to come up with some way of encoding that object as a string so that when you construct a new object with that string you get the same object. This is called serialization, and some languages provide that functionality for you.

For example if you have this object:

class serialize_me {
 private:
   int a;
   float b;
 public:
   double c;
};

and a = 5, b = 3.2, and c = 67.5

your string might look like this:

a5b3.2c67.5

Then you could parse that string and assign the appropriate values to all the members.

I think it goes without saying that strings are easy to store on disk.
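Since the question is about Delphi, here is a rough sketch of the same round trip in Delphi. The class name TSerializeMe, its constructor, and the name=value text layout are all assumptions made for illustration (the C++ float is stored as a Double here to keep the printed value tidy). FloatToStr and StrToFloat follow the current locale, so a real file format would want a fixed decimal separator.

```pascal
program StringRoundTrip;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical Delphi counterpart of serialize_me above.
  TSerializeMe = class
  private
    FA: Integer;
    FB: Double;
  public
    C: Double;
    constructor Create(A: Integer; B, AC: Double);
    function ToText: string;
    procedure FromText(const S: string);
  end;

constructor TSerializeMe.Create(A: Integer; B, AC: Double);
begin
  inherited Create;
  FA := A;
  FB := B;
  C := AC;
end;

// Encodes every member, private ones included, as name=value lines.
function TSerializeMe.ToText: string;
var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    L.Values['a'] := IntToStr(FA);
    L.Values['b'] := FloatToStr(FB);
    L.Values['c'] := FloatToStr(C);
    Result := L.Text;
  finally
    L.Free;
  end;
end;

// Parses the text produced by ToText and assigns the members back.
procedure TSerializeMe.FromText(const S: string);
var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    L.Text := S;
    FA := StrToInt(L.Values['a']);
    FB := StrToFloat(L.Values['b']);
    C := StrToFloat(L.Values['c']);
  finally
    L.Free;
  end;
end;

var
  Original, Restored: TSerializeMe;
begin
  Original := TSerializeMe.Create(5, 3.2, 67.5);
  Restored := TSerializeMe.Create(0, 0, 0);
  try
    Restored.FromText(Original.ToText);
    Writeln(Restored.ToText); // same text as Original.ToText
  finally
    Restored.Free;
    Original.Free;
  end;
end.
```

Because the class writes itself, it can reach its own private fields; nothing has to be made public for this to work.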

EDIT: This is a very basic example, but I think you can understand the concept easily enough.

EDIT: Delphi specific serialization. At the bottom of the page there is a link to a full-fledged XML serializer class.

just_wes
What do you do when your object contains other objects?
Altar
@Altar: Recursion.
Larry Lustig
"Delphi specific serialization" is limited to public/published properties only.
Altar
Any external serialization solution must rely on some form of property publication mechanism to get values from the serialized object. If you want those properties saved, simply make them public. Alternatively, you will have to roll your own mechanism with an implementation inside each specific class that knows how to serialize itself. This is more work, but of course you have access to everything in the class.
Larry Lustig
There is no requirement whatsoever to serialize to a string. A string format may have benefits with regards to byte order differences and ease of debugging, but binary formats can be smaller and result in higher performance when loading and saving. The early Delphi DFM format was binary for these reasons.
mghie
@mghie: so you suggest that my solution (writing the object to disk in binary) is still the best option?
Altar
@Altar: that is still serialization, as "serialization" refers to the concept of representing an object without having the object. Note: binary is still a string even if it's not a char string. Note 2: the link I found was meant to help you come up with ideas, and it is certainly not the be-all and end-all of Delphi serialization.
just_wes
Not sure if I deserved a down vote for this post.
just_wes
@just_wes: Agreed with the down vote being rude, no reason for that. But your idea of what a string is looks a little weird to me... at least for Delphi programmers it has a certain meaning.
mghie
@just_wes: I'm sure you didn't deserve a downvote. I was dissuaded from offering my own attempt at helpful advice because I saw that your and others' attempts were being downvoted.
Herbert Sitz
@Altar: Only you can say whether binary format is the best option. There are pros and cons, you need to weigh them. The problem of resolving dependencies between objects remains the same, whether text or binary representation is being used - you can't simply serialize pointers in either case. For a tree the dependencies can be part of the serialization data, for graphs you need something like object ids that can be resolved to addresses after deserializing the complete structure.
mghie
@Larry: Delphi seems to be the only language which can only serialize public (or published) properties. In C# and Java, binary serialization does not care about access modifiers such as private - all nontransient fields are considered part of an object's persistent state and are eligible for persistence. And exposing private fields by adding public or published properties is a bad idea. They are private for a reason. And for deserialization, a setter is needed. So everybody can set the private variable now?
mjustin
@mjustin: I'm no expert on the internals of C# and Java, but those are managed languages and I'm pretty sure that nothing about Objects in those languages is considered private from the virtual machine that manages them. Thus, Delphi is not "alone" in its ability to serialize only accessible object fields. Delphi will behave similarly to other non-managed languages while managed languages inherently make all object members accessible to the code that performs the serialization.
Larry Lustig
Or, to rephrase what I just said, the management of objects by a virtual machine constitutes a publication mechanism allowing access to the object's internals.
Larry Lustig
@Larry: managed or not does not make the difference: starting with Delphi 2010, extended RTTI allows serializing the object state without making everything public or published.
mjustin
A: 

There are lots of ways of doing this, I guess. In the past I have used INI files for this purpose; TMemIniFile is useful because it lets you do a lot of the grunt work in memory before you save out to disk. Since you have a hierarchy to save, you might want to consider something like XML, but I haven't done this personally so I cannot advise on that one.
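As a rough illustration of the INI approach, the sketch below saves a TObjectList to one section per item. The TItem class, the section and key names, and the file location are all made up for the example; only the TMemIniFile calls come from the IniFiles unit.

```pascal
program IniListDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, Contnrs, IniFiles;

type
  // Hypothetical item type; stands in for whatever the list holds.
  TItem = class
  public
    Name: string;
    Count: Integer;
  end;

procedure SaveItems(List: TObjectList; const FileName: string);
var
  Ini: TMemIniFile;
  I: Integer;
  Section: string;
begin
  Ini := TMemIniFile.Create(FileName);
  try
    Ini.Clear; // drop anything loaded from an existing file
    Ini.WriteInteger('List', 'Count', List.Count);
    for I := 0 to List.Count - 1 do
    begin
      Section := 'Item' + IntToStr(I);
      Ini.WriteString(Section, 'Name', TItem(List[I]).Name);
      Ini.WriteInteger(Section, 'Count', TItem(List[I]).Count);
    end;
    Ini.UpdateFile; // nothing is written to disk until this call
  finally
    Ini.Free;
  end;
end;

procedure LoadItems(List: TObjectList; const FileName: string);
var
  Ini: TMemIniFile;
  I: Integer;
  Section: string;
  Item: TItem;
begin
  Ini := TMemIniFile.Create(FileName);
  try
    List.Clear;
    for I := 0 to Ini.ReadInteger('List', 'Count', 0) - 1 do
    begin
      Section := 'Item' + IntToStr(I);
      Item := TItem.Create;
      Item.Name := Ini.ReadString(Section, 'Name', '');
      Item.Count := Ini.ReadInteger(Section, 'Count', 0);
      List.Add(Item);
    end;
  finally
    Ini.Free;
  end;
end;

var
  Source, Loaded: TObjectList;
  Item: TItem;
  FileName: string;
begin
  FileName := ExtractFilePath(ParamStr(0)) + 'items.ini'; // hypothetical location
  Source := TObjectList.Create(True);
  Loaded := TObjectList.Create(True);
  try
    Item := TItem.Create;
    Item.Name := 'bolt';
    Item.Count := 12;
    Source.Add(Item);
    SaveItems(Source, FileName);
    LoadItems(Loaded, FileName);
    Writeln(Loaded.Count, ' item(s), first is ', TItem(Loaded[0]).Name);
  finally
    Loaded.Free;
    Source.Free;
  end;
end.
```

The format is flat, so nested objects need a naming scheme for their sections, which is where XML starts to look more attractive.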

Drew Gibson
A: 

If the object is derived from TPersistent, XML or JSON serialization are easy to do with open source libraries:

One way to make versioning easier is an anti-corruption layer between the domain model and the persistence layer. (maybe using data transfer objects which will not change with every change of the domain model).

For automatic versioning see this article: Migrate Serialized Java Objects with XStream and XMT

XMT introduces class VersionedDocument to version serialized XMLs and handle the migration. The same design could easily be implemented with Delphi.

mjustin
OmniXML is as limited as any other serialization method I have found until now. They say: "note: class should be derived from TPersistent and properties should be published". Serialization is not a real solution to store an ENTIRE object to disk but only some of its parts (the published properties).
Altar
Serialization via published properties works for the whole VCL; there's no reason not to use it. If you have control over the makeup of your objects, consider adding published properties to expose what you need to serialize. The published properties don't have to be of the same types as the underlying private members, e.g. you may have a private TDateTime field, but expose it as a string through a published prop. If not that, you could add a read/write .AsString property to each object class, so that each object reads and writes its own state. Either way, you'll have to do *something*.
moodforaday
@Altar: +1, you are right, serialization should include all fields of the object regardless of visibility (because this is the whole purpose of serialization: save the full object state). Delphi has been limited to published and (using $M) public fields in the past. Hopefully, with the new extended RTTI, the libraries will close this gap.
mjustin
@moodforaday: exposing private fields by adding public or published properties is a bad idea. They are private for a reason. And for deserialization, a setter is needed. So everybody can set the private variable now?
mjustin
@mjustin: *someone* has to set them. It's either going to be the object itself, or some "reader". If the former, you protect private fields at the cost of implanting persistence logic (and maybe even format) into each class that needs it. If the latter, you separate concerns, but you expose the persisted fields of your classes. It is what it is. It doesn't have to be "everybody" though: you can declare the reader class in the same unit as the persisted object, and get/set the object's protected fields. Only the reader will be allowed to do it, and if the object changes, so does the reader.
moodforaday
...one more thought: maybe private fields by definition should not need persistence? The two concepts are somewhat contradictory: if a private value is never affected by the outside world (so no setter method), then it is probably computed by the object itself - hence no need to store it. Storing a value does presume that the value is "mutable" and depends on something other than just the object. So: *how do your private fields get their values in the first place*? An answer to this question might help you decide whether you need persistence and whose responsibility it should be.
moodforaday
@moodforaday: starting with Delphi 2010, extended RTTI allows serializing the object state without making everything public or published.
mjustin
@moodforaday: to restore the value of a private field which has been computed depending on 'external' values, you have to know all these values, and serialize them too, in exactly the same sequence they have been set, from object creation to the current point in time.
mjustin
"...one more thought: maybe private fields by definition should not need persistence?" --- This would involve complex calls to all kinds of procedures and functions in a very specific order to recalculate all internal fields. You would also need to load the external "triggers". This is way more complicated and may result in an object that, after loading from disk, is not perfectly identical to the original one.
Altar
+4  A: 

If you are using Delphi 2010, things will be much easier because of the new RTTI unit. Robert Love has written a nice unit called XMLSerial that serializes objects to XML.

You can read about it at his blog: Xml Serialization - Basic Usage
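To give an idea of what the extended RTTI makes possible, here is a minimal sketch (not Robert Love's code) that walks every field of an object, private ones included, and copies the values to and from name=value strings. It assumes Delphi 2010 or later with the default RTTI settings; the TSecretive class and both helper procedures are invented for the example, and only integer and string fields are handled.

```pascal
program RttiFieldDump;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, TypInfo, Rtti;

type
  // Hypothetical class with nothing published and no properties at all.
  TSecretive = class
  private
    FId: Integer;
    FTitle: string;
  public
    constructor Create(AId: Integer; const ATitle: string);
  end;

constructor TSecretive.Create(AId: Integer; const ATitle: string);
begin
  inherited Create;
  FId := AId;
  FTitle := ATitle;
end;

// Writes every field that has RTTI, whatever its visibility, as Name=Value.
procedure FieldsToStrings(Obj: TObject; Dest: TStrings);
var
  Ctx: TRttiContext;
  Field: TRttiField;
begin
  Ctx := TRttiContext.Create;
  try
    for Field in Ctx.GetType(Obj.ClassType).GetFields do
      if Field.FieldType <> nil then
        Dest.Values[Field.Name] := Field.GetValue(Obj).ToString;
  finally
    Ctx.Free;
  end;
end;

// Reads the values back; kinds other than integer and string are skipped.
procedure StringsToFields(Obj: TObject; Src: TStrings);
var
  Ctx: TRttiContext;
  Field: TRttiField;
  S: string;
begin
  Ctx := TRttiContext.Create;
  try
    for Field in Ctx.GetType(Obj.ClassType).GetFields do
    begin
      if (Field.FieldType = nil) or (Src.IndexOfName(Field.Name) < 0) then
        Continue;
      S := Src.Values[Field.Name];
      case Field.FieldType.TypeKind of
        tkInteger: Field.SetValue(Obj, StrToInt(S));
        tkUString: Field.SetValue(Obj, S);
      end; // objects, records and arrays would need their own handling
    end;
  finally
    Ctx.Free;
  end;
end;

var
  Original, Restored: TSecretive;
  Data: TStringList;
begin
  Original := TSecretive.Create(7, 'hello');
  Restored := TSecretive.Create(0, '');
  Data := TStringList.Create;
  try
    FieldsToStrings(Original, Data);
    StringsToFields(Restored, Data);
    Data.Clear;
    FieldsToStrings(Restored, Data);
    Write(Data.Text); // lists FId and FTitle as read back from Restored
  finally
    Data.Free;
    Restored.Free;
    Original.Free;
  end;
end.
```

A list-derived object would still need extra code to walk its items, and versioning is left entirely to the caller.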

Mohammed Nasman
This is as limited as any other serialization method I have found until now because properties should be published. Serialization is not a real solution to store an ENTIRE object to disk but only some of its parts (the published properties).
Altar
Did you actually check the D2010 angle? RTTI generation changed there!
Marco van de Voort
@Mohammed: jvAppXMLFileStorage can also help, but only for published fields. @Marco: in Delphi 2010, GetPropList also returns the published properties only.
Issam Ali
@Marco: you are right, and the question is: "can Delphi extended RTTI read and write values of private fields?" - afaik yes.
mjustin
Issam Ali: One probably needs to call a different function, since it is no longer limited to object properties, published methods and enums. Try googling for "2010" and "rtti" and look in the code.
Marco van de Voort
@Issam, with extended RTTI in Delphi 2010 you can access members other than published ones; take a look at the RTTI unit instead of TypInfo.
Mohammed Nasman
+1  A: 

I also mostly use handcrafted serialization for my own data structures. The multiple version angle is one of the main reasons.

However, in your case that is difficult, since not all your objects (TObjectList) derive from a hierarchy of your own that contains virtual abstract methods to load/store.

D2010 serialization (which afaik exposes nearly everything to RTTI) could be a solution, but it probably requires a new Delphi version and, worse, it spells the end of dealing with versioning manually (e.g. copying values from old fields into new ones when the format changes).

If the manual streaming is getting out of hand, a different approach could be to have abstract definitions for the data part of your objects, and generate source code (the field declarations and the streaming code) from these abstract definitions. The advantage is that you might be able to slip in some custom code here and there when you need to, or patch your generator for versioning issues.

I did this once for a business object to SQL mapping with over 800 objects. Since it was the time before generics in Delphi, I generated a typesafe container type for each object too, as well as other helper and converter objects/routines.

It is a lot of work to set up though, and it is only worth it if you have a project with really a lot of objects and fields (hundreds if not thousands) and are sure that you will need to maintain it with significant mutations for quite some time to come.

Marco van de Voort
"However in your case that is difficult, since not all your objects (tobjectlist) derive from an own hierarchy that contain virtual abstract methods to load/store." That's where interfaces come into play. Serialization has nothing to do with implementation inheritance. Have all objects implement a `ISerializable` and you're good to go.
mghie
Sure, possible (though you will have to implement the IUnknown methods yourself, and possibly several times for differently rooted hierarchies, since TObjectList doesn't inherit from TInterfacedObject).
Marco van de Voort
+2  A: 

You state that serialization is not a solution, but I ask: why not? I've done something like this in the past; here is what I did.

I created a component class that did nothing but serialize a non-TPersistent based object so that I could stream it in and out using VCL streaming capabilities.

For example:

// Please forgive any errors that exist, as I'm typing this off the top of my head. As well, this is not going to be functionally complete.

unit streamlist1;

interface

uses MyListObjectUnit;

procedure SaveList(filename: string; data: TMyListObject);
procedure LoadList(filename: string; var data: TMyListObject);

implementation

uses
  SysUtils, Classes; // fmOpenRead; TComponent, TFiler, TReader, TWriter, TFileStream

type
  TMyListStreamer = class(TComponent)
  private
    fMyList: TMyListObject;
    procedure ReadList(Reader: TReader);  // This is where the magic happens
    procedure WriteList(Writer: TWriter); // This is where the magic happens (x2)
  protected
    procedure DefineProperties(Filer: TFiler); override; // declared in TPersistent
  public
    procedure AssignMyList(data: TMyListObject);
    procedure PopulateData(var data: TMyListObject);
  end;

procedure TMyListStreamer.DefineProperties(Filer: TFiler);
begin
  inherited DefineProperties(Filer);
  Filer.DefineProperty('MyObjList', ReadList, WriteList, True);
  // Filer.DefineBinaryProperty is the other choice; its callbacks take a TStream
  // instead of a TReader/TWriter.
end;

procedure TMyListStreamer.ReadList(Reader: TReader); // This is where the magic happens
begin
  // Use the reader class to read in anything you want...
end;

procedure TMyListStreamer.WriteList(Writer: TWriter); // This is where the magic happens (x2)
begin
  // Use the writer class to write out anything you want...
end;

procedure TMyListStreamer.AssignMyList(data: TMyListObject);
begin
  fMyList := data; // keeps a reference only; the streamer does not own the list
end;

procedure TMyListStreamer.PopulateData(var data: TMyListObject);
begin
  // Copy whatever ReadList loaded into data...
end;

procedure SaveList(filename: string; data: TMyListObject);
var
  wFile: TFileStream;
  wList: TMyListStreamer;
begin
  RegisterClass(TMyListStreamer);
  try
    wFile := TFileStream.Create(filename, fmCreate);
    try
      wList := TMyListStreamer.Create(nil);
      try
        wList.AssignMyList(data);
        wFile.WriteComponent(wList);
      finally
        wList.Free;
      end;
    finally
      wFile.Free;
    end;
  finally
    UnRegisterClass(TMyListStreamer);
  end;
end;

procedure LoadList(filename: string; var data: TMyListObject);
var
  wFile: TFileStream;
  wList: TMyListStreamer;
begin
  RegisterClass(TMyListStreamer);
  try
    wFile := TFileStream.Create(filename, fmOpenRead);
    try
      wList := TMyListStreamer(wFile.ReadComponent(nil));
      try
        if Assigned(data) and Assigned(wList) then
          wList.PopulateData(data);
      finally
        wList.Free; // Free is safe to call on nil
      end;
    finally
      wFile.Free;
    end;
  finally
    UnRegisterClass(TMyListStreamer);
  end;
end;

end.

Using this method you can stream (serialize) anything out of the VCL or custom data. It takes a little to set up, but the strength is that you can control everything going in and out of the data file. You can even, with a little forethought, create a version flag and process different data by ignoring or massaging specific data in newer versions of the program/component.

You can even stream other VCL objects out of your streaming component as long as you already know the type of object (i.e. TComponent/TPersistent based objects) by using the existing methods of the TReader/TWriter.
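For what the two "magic" methods might contain, here is a small self-contained sketch with a version flag. It is only one possible layout: the TItemsStreamer class, the 'Items' property name and the FormatVersion constant are invented for the example, and a plain TStringList stands in for the list object. The TReader/TWriter list calls are the same ones TStrings uses for its own streaming.

```pascal
program DefinePropsDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  FormatVersion = 1; // hypothetical version flag stored in front of the data

type
  TItemsStreamer = class(TComponent)
  private
    FItems: TStringList;
    procedure ReadItems(Reader: TReader);
    procedure WriteItems(Writer: TWriter);
  protected
    procedure DefineProperties(Filer: TFiler); override;
  public
    constructor Create(AOwner: TComponent); override;
    destructor Destroy; override;
    property Items: TStringList read FItems;
  end;

constructor TItemsStreamer.Create(AOwner: TComponent);
begin
  inherited Create(AOwner);
  FItems := TStringList.Create;
end;

destructor TItemsStreamer.Destroy;
begin
  FItems.Free;
  inherited Destroy;
end;

procedure TItemsStreamer.DefineProperties(Filer: TFiler);
begin
  inherited DefineProperties(Filer);
  Filer.DefineProperty('Items', ReadItems, WriteItems, True);
end;

procedure TItemsStreamer.WriteItems(Writer: TWriter);
var
  I: Integer;
begin
  Writer.WriteListBegin;
  Writer.WriteInteger(FormatVersion);
  for I := 0 to FItems.Count - 1 do
    Writer.WriteString(FItems[I]);
  Writer.WriteListEnd;
end;

procedure TItemsStreamer.ReadItems(Reader: TReader);
var
  Version: Integer;
begin
  FItems.Clear;
  Reader.ReadListBegin;
  Version := Reader.ReadInteger;
  if Version > FormatVersion then
    raise Exception.CreateFmt('Unsupported format version %d', [Version]);
  // A newer program would branch on Version here to massage older data.
  while not Reader.EndOfList do
    FItems.Add(Reader.ReadString);
  Reader.ReadListEnd;
end;

var
  Source, Loaded: TItemsStreamer;
  Stream: TMemoryStream;
begin
  Source := TItemsStreamer.Create(nil);
  Loaded := TItemsStreamer.Create(nil);
  Stream := TMemoryStream.Create;
  try
    Source.Items.Add('first');
    Source.Items.Add('second');
    Stream.WriteComponent(Source);
    Stream.Position := 0;
    Stream.ReadComponent(Loaded); // reading into an existing instance
    Writeln(Loaded.Items.CommaText);
  finally
    Stream.Free;
    Loaded.Free;
    Source.Free;
  end;
end.
```

Because the streamer decides what goes between WriteListBegin and WriteListEnd, it can write private state of objects it has access to, not just published properties.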

Not a full solution but it should get you where you want to go with a little more work.

Ryan J. Mills
> You state that serialization is not a solution but I ask why not? ------ Until now, all streaming solutions I have found were able to save ONLY published properties. Some people have trouble understanding that saving SOME of the attributes of an object is not the same as saving the entire object (so that later you can restore its original state) :)
Altar
Even though I haven't provided the entire solution, using this method will allow you to save anything you want. I think that most of the other answers lean toward the same thing, but different answers to the same problem is one of the great things about programming. I love this hobby^H^H^H^H^H job.
Ryan J. Mills
>"Even though I haven't provided the entire solution, using this method will allow you to save anything you want". --- Yes, I know. I have seen your code. This is why I used "Until now". I am still weighing your solution and the one proposed by GrandmasterB.
Altar
And I gave you (also) a vote up.
Altar
Thanks for the vote. While my solution may not be the answer to your question it's always good to have options. That's a big part of what I like about this site.
Ryan J. Mills
+1  A: 

Your current solution is probably the best if you're going to have separate versions of the object over the years.

What I do is create SaveToStream() and LoadFromStream() methods, and manually write the object's properties to a TStream in a fixed order, prefixing it with a version number of the structure. The benefit of this, as you mention, is that you can better adapt to older versions of the stream. For example, if you have 5 versions, but you need to initialize something in a certain way for version 3 files, it's easily done. You then wrap a SaveToFile() function around it that creates a TFileStream and calls SaveToStream().

I believe there's a TWriter class that lets you write various datatypes to the stream more easily... or you can create your own simply enough. (I made my own filestream descendant to handle this.)
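A minimal sketch of that pattern might look like the following. The TPart class, its two fields, and the idea that the weight was added in version 2 are all assumptions made up for the example.

```pascal
program VersionedStreamDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  CurrentVersion = 2;

type
  // Hypothetical object: FWeight is assumed to be new in version 2.
  TPart = class
  private
    FId: Integer;
    FWeight: Double;
  public
    procedure SaveToStream(S: TStream);
    procedure LoadFromStream(S: TStream);
    procedure SaveToFile(const FileName: string);
  end;

procedure TPart.SaveToStream(S: TStream);
var
  Version: Integer;
begin
  Version := CurrentVersion;
  S.WriteBuffer(Version, SizeOf(Version)); // version prefix comes first
  S.WriteBuffer(FId, SizeOf(FId));
  S.WriteBuffer(FWeight, SizeOf(FWeight));
end;

procedure TPart.LoadFromStream(S: TStream);
var
  Version: Integer;
begin
  S.ReadBuffer(Version, SizeOf(Version));
  if (Version < 1) or (Version > CurrentVersion) then
    raise Exception.CreateFmt('Unknown part version %d', [Version]);
  S.ReadBuffer(FId, SizeOf(FId));
  if Version >= 2 then
    S.ReadBuffer(FWeight, SizeOf(FWeight))
  else
    FWeight := 0; // version 1 streams carry no weight, so pick a default
end;

procedure TPart.SaveToFile(const FileName: string);
var
  F: TFileStream;
begin
  F := TFileStream.Create(FileName, fmCreate);
  try
    SaveToStream(F);
  finally
    F.Free;
  end;
end;

var
  Original, Restored: TPart;
  M: TMemoryStream;
begin
  Original := TPart.Create;
  Restored := TPart.Create;
  M := TMemoryStream.Create;
  try
    Original.FId := 42;
    Original.FWeight := 1.5;
    Original.SaveToStream(M);
    M.Position := 0;
    Restored.LoadFromStream(M);
    Writeln(Restored.FId, ' ', Restored.FWeight:0:1);
  finally
    M.Free;
    Restored.Free;
    Original.Free;
  end;
end.
```

Writing raw field bytes ties the file to the field sizes and byte order of the platform, which is usually acceptable for a single-platform desktop program.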

If saving multiple objects to a single stream, you may want to note the position before each object gets written, and then go back and mark a length, so that you (or someone accessing the files) can skip ahead through the file without reading it.
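The length marking can be sketched roughly as follows. WriteBlock and SkipBlock are hypothetical helper names, the Int64 length field is an arbitrary choice, and a byte array stands in for whatever an object's SaveToStream would write.

```pascal
program LengthPrefixDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Writes one block as <Int64 length><payload>, patching the length afterwards.
procedure WriteBlock(S: TStream; const Payload: array of Byte);
var
  LenPos, EndPos, Len: Int64;
begin
  LenPos := S.Position;                 // note the position before the object
  Len := 0;
  S.WriteBuffer(Len, SizeOf(Len));      // placeholder for the length
  if Length(Payload) > 0 then
    S.WriteBuffer(Payload[0], Length(Payload)); // stands in for Obj.SaveToStream(S)
  EndPos := S.Position;
  Len := EndPos - LenPos - SizeOf(Len);
  S.Position := LenPos;
  S.WriteBuffer(Len, SizeOf(Len));      // go back and mark the real length
  S.Position := EndPos;
end;

// Skips the block at the current position without reading its payload.
procedure SkipBlock(S: TStream);
var
  Len: Int64;
begin
  S.ReadBuffer(Len, SizeOf(Len));
  S.Position := S.Position + Len;
end;

var
  M: TMemoryStream;
  SecondLen: Int64;
begin
  M := TMemoryStream.Create;
  try
    WriteBlock(M, [1, 2, 3]);
    WriteBlock(M, [9]);
    M.Position := 0;
    SkipBlock(M);                       // jump over the first object
    M.ReadBuffer(SecondLen, SizeOf(SecondLen));
    Writeln('second block length: ', SecondLen);
  finally
    M.Free;
  end;
end.
```

Patching the length afterwards means the writer never has to know an object's size in advance, at the cost of requiring a seekable stream.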

Also, if you have a hierarchy of classes that you want to save, 'bottom load' the ancestor class with all the properties you want to save to a file. This way you only need one implementation of the save routine. It's a little less efficient since you're carrying around variables you don't necessarily need in all the objects, but it's far simpler to manage.

GrandmasterB
Hi Grandmaster. I think I will continue on this road. Thanks.
Altar