TinyXML

I have an XML file that stores a bunch of data that is loaded into objects. Right now, I have one giant method that parses the XML file and creates the appropriate objects depending on its contents. This function is very large and pulls in lots of class definitions.

Would it be better for each class type to do its own loading from XML? That way the XML code is dispersed throughout my files rather than kept in one location. The problem is that I need to pass each class the exact node inside the XML file where it should read from. Is this feasible? I'm using TinyXML, so I imagine each class can be passed the XML stream (actually an array containing the XML data), and I'd also pass the root element for that object so it knows what it should be reading.

Then the saving would work the same way.

Which approach is better and more widely used?

A: 

I don't know anything about TinyXML, but I have been using that kind of class design with libxml2 for several years now and it has been working fine for me.

Remy Lebeau - TeamB
A: 

Serialization functions should be friends of the classes they serialize. If you want to serialize to and deserialize from XML, you should write friend functions that do this. You could even write custom ostream& operator<<() functions, but this becomes problematic if you want to aggregate objects. A better strategy is to define a mechanism that turns individual objects into nodes in a DOM document.
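As a rough sketch of that last idea, assuming a hypothetical minimal Node type (a real library such as TinyXML or libxml2 has its own node class) and made-up Point and Segment classes, friend functions can build one node per object, and aggregates compose by nesting the nodes of their members:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal DOM node, for illustration only.
struct Node {
    std::string tag;
    std::string text;
    std::vector<Node> children;
};

class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_, y_;
    // Friends get direct access to the private state they serialize.
    friend Node toNode(const Point& p);
    friend Point pointFromNode(const Node& n);
};

Node toNode(const Point& p) {
    Node n;
    n.tag = "point";
    n.children.push_back(Node{"x", std::to_string(p.x_), {}});
    n.children.push_back(Node{"y", std::to_string(p.y_), {}});
    return n;
}

Point pointFromNode(const Node& n) {
    assert(n.tag == "point" && n.children.size() == 2);
    Point p(0, 0);
    p.x_ = std::stoi(n.children[0].text);
    p.y_ = std::stoi(n.children[1].text);
    return p;
}

class Segment {
public:
    Segment(Point a, Point b) : a_(a), b_(b) {}
private:
    Point a_, b_;
    friend Node toNode(const Segment& s);
};

// An aggregate only nests the nodes produced by its members.
Node toNode(const Segment& s) {
    Node n;
    n.tag = "segment";
    n.children.push_back(toNode(s.a_));
    n.children.push_back(toNode(s.b_));
    return n;
}
```

Calling toNode(Segment(Point(1, 2), Point(3, 4))) yields a "segment" node with two "point" children, which is the aggregation that operator<< makes awkward.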

Jherico
A: 

I can think of an approach based on a factory that serves up objects according to their tag.

The difficulty here is not really how to decouple the deserialization of each object content, but rather to decouple the association of a tag and an object.

For example, let's say you have the following XML:

<my_xml>
  <bird> ... </bird>
</my_xml>

How do you know that you should build a Bird object from the content of the <bird> tag?

There are two cases here:

  1. One-to-one mapping, e.g. <my_xml> represents a single object and thus knows how to deserialize itself.
  2. Collection: <my_xml> is nothing more than a loose collection of objects.

The first is quite obvious: you know what to expect and can use a regular constructor.
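For the one-to-one case, a minimal sketch could look like the following. XmlNode here is a hypothetical stand-in for an XML element with attributes, and Settings with its width and height attributes is invented purely for illustration:

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for an XML element; not a TinyXML type.
struct XmlNode {
    std::string tag;
    std::map<std::string, std::string> attributes;
};

class Settings {
public:
    // The caller already knows the node describes a Settings object,
    // so a plain constructor is enough.
    explicit Settings(const XmlNode& node)
        : width_(readInt(node, "width")), height_(readInt(node, "height")) {
        assert(node.tag == "settings");
    }
    int width() const { return width_; }
    int height() const { return height_; }

private:
    // Missing attributes fall back to 0 in this sketch.
    static int readInt(const XmlNode& node, const std::string& key) {
        std::map<std::string, std::string>::const_iterator it =
            node.attributes.find(key);
        return it == node.attributes.end() ? 0 : std::atoi(it->second.c_str());
    }
    int width_;
    int height_;
};
```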

The problem in C++ is that you have static typing, and that makes the second case more difficult, since you need virtual construction there.

Virtual construction can be achieved using prototypes though.

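 #include <map>
 #include <memory>

 // XmlNode and Tag stand in for whatever your XML library provides.
 // Tag is assumed to be string-like (ordered, constructible from "bird").

 // Base class
 class Serializable
 {
 public:
   virtual ~Serializable() {}
   virtual std::unique_ptr<XmlNode> serialize() const = 0;
   virtual std::unique_ptr<Serializable> deserialize(const XmlNode&) const = 0;
 };

 // Collection of prototypes
 class Deserializer
 {
 public:
   static void Register(const Tag& tag, const Serializable* item)
   {
     GetMap()[tag] = item;
   }

   static std::unique_ptr<Serializable> Create(const XmlNode& node)
   {
     // I wish I could write GetConstMap()[node.tag()] ;)
     // but operator[] is not available on a const map, so use find().
     prototypes_t::const_iterator it = GetConstMap().find(node.tag());
     if (it == GetConstMap().end()) return std::unique_ptr<Serializable>();
     return it->second->deserialize(node);
   }

 private:
   typedef std::map<Tag, const Serializable*> prototypes_t;

   static prototypes_t& GetMap()
   {
     static prototypes_t map;  // built on first use
     return map;
   }

   static prototypes_t const& GetConstMap() { return GetMap(); }
 };

 // Example
 class Bird: public Serializable
 {
 public:
   virtual std::unique_ptr<XmlNode> serialize() const;
   // Smart pointers are not covariant: the return type must match the base.
   virtual std::unique_ptr<Serializable> deserialize(const XmlNode& node) const;
 };

 // In some cpp (bird.cpp is indicated)
 const Bird myBirdPrototype{};
 // A bare call is not allowed at namespace scope, so register through an initializer.
 const bool birdRegistered = (Deserializer::Register("bird", &myBirdPrototype), true);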
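To make the prototype idea concrete, here is a self-contained sketch that compiles on its own. It is only an illustration under some assumptions: XmlNode is a hypothetical two-field stand-in for a real parsed element, tags are plain std::string, serialize() returns the node by value, and std::unique_ptr (C++11) is used for ownership. Bird's species field is made up for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed XML element.
struct XmlNode {
    std::string name;
    std::string text;
    const std::string& tag() const { return name; }
};

class Serializable {
public:
    virtual ~Serializable() = default;
    virtual XmlNode serialize() const = 0;
    // "Virtual constructor": the prototype builds a new object of its own type.
    virtual std::unique_ptr<Serializable> deserialize(const XmlNode& node) const = 0;
};

class Deserializer {
public:
    // Returns true so it can be called from a static initializer.
    static bool Register(const std::string& tag, const Serializable* prototype) {
        assert(prototype != nullptr);
        GetMap()[tag] = prototype;
        return true;
    }

    // Returns an empty pointer when no prototype is registered for the tag.
    static std::unique_ptr<Serializable> Create(const XmlNode& node) {
        std::map<std::string, const Serializable*>::const_iterator it =
            GetMap().find(node.tag());
        if (it == GetMap().end()) return nullptr;
        return it->second->deserialize(node);
    }

private:
    // Function-local static: constructed on first use, before any Register call.
    static std::map<std::string, const Serializable*>& GetMap() {
        static std::map<std::string, const Serializable*> prototypes;
        return prototypes;
    }
};

class Bird : public Serializable {
public:
    Bird() = default;
    explicit Bird(std::string species) : species_(std::move(species)) {}

    XmlNode serialize() const override { return XmlNode{"bird", species_}; }

    std::unique_ptr<Serializable> deserialize(const XmlNode& node) const override {
        return std::unique_ptr<Serializable>(new Bird(node.text));
    }

private:
    std::string species_;
};

namespace {
// Would live in bird.cpp: the prototype registers itself during static initialization.
const Bird birdPrototype{};
const bool birdRegistered = Deserializer::Register("bird", &birdPrototype);
}
```

With this in place, Deserializer::Create(XmlNode{"bird", "robin"}) hands back a Bird behind a Serializable pointer, while an unregistered tag such as "fish" gives an empty pointer that the caller has to check.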

Deserialization is always a bit messy in C++; dynamic typing really helps there :)

Note: it also works with streaming, but that is a bit more complicated to put in place safely. The problem with streaming is that you have to make sure you neither read past your own data nor leave any of it unread, so that the stream is in a 'good' state for the next object :)
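One way to get that guarantee, sketched here under the assumption of a made-up framing ("<tag> <byte count>" on one line, followed by exactly that many payload bytes), is to let a reader slice out each object's bytes first and have the object parse only its own copy. The Record type and the function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// One framed object: its tag plus exactly the bytes that belong to it.
struct Record {
    std::string tag;
    std::string payload;
};

void writeRecord(std::ostream& out, const Record& r) {
    // The tag must be a single token, since the reader extracts it with >>.
    assert(!r.tag.empty() && r.tag.find_first_of(" \t\n") == std::string::npos);
    out << r.tag << ' ' << r.payload.size() << '\n' << r.payload;
}

// Reads exactly one record. Returns false at end of stream or if the
// header is malformed or the payload is truncated.
bool readRecord(std::istream& in, Record& out) {
    std::size_t size = 0;
    if (!(in >> out.tag >> size)) return false;
    if (in.get() != '\n') return false;
    out.payload.assign(size, '\0');
    if (size == 0) return true;
    in.read(&out.payload[0], static_cast<std::streamsize>(size));
    return in.gcount() == static_cast<std::streamsize>(size);
}

// The object parses its private copy, so over-reading or under-reading
// here cannot disturb the position of the shared stream.
int parseAge(const std::string& payload) {
    std::istringstream local(payload);
    int age = 0;
    local >> age;
    return age;
}
```

Because the outer stream is advanced by the reader and not by the objects, a buggy deserialize for one type cannot corrupt the records that follow it.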

Matthieu M.