Using boost::serialization, what's the "best" way to serialize an object that contains cached, derived values in mutable members, such that the cached members aren't serialized but, on deserialization, are reset to their appropriate defaults? A definition of "best" follows later, but first an example:

#include <cmath>  // std::sqrt

class Example
{
public:
    // The default argument lets this double as a default constructor.
    Example(float n = 0.0f) :
        num(n),
        sqrt_num(-1.0f)
    {}

    float get_num() const { return num; }

    // compute and cache sqrt on first read
    float get_sqrt() const
    {
        if(sqrt_num < 0)
            sqrt_num = std::sqrt(num);
        return sqrt_num;
    }

    template <class Archive> 
    void serialize(Archive& ar, unsigned int version)
    { ... }
private:
    float num;
    mutable float sqrt_num;
};

On serialization, only the "num" member should be saved. On deserialization, the sqrt_num member must be initialized to its sentinel value indicating it needs to be computed. What is the most elegant way to implement this? In my mind, an elegant solution would avoid splitting serialize() into separate save() and load() methods (which introduces maintenance problems).

One possible implementation of serialize:

    template <class Archive> 
    void serialize(Archive& ar, unsigned int version)
    {
        ar & num;
        sqrt_num = -1.0;
    }

This handles the deserialization case, but in the serialization case the cached value is discarded and must be recomputed on the next read. Also, I've never seen a boost::serialization example that explicitly sets members inside serialize(), so I wonder whether this is generally discouraged.

Some might suggest that the default constructor handles this, for example:

int main()
{
    Example e;
    {
        std::ifstream ifs("filename");
        boost::archive::text_iarchive ia(ifs);
        ia >> e;
    }
    cout << e.get_sqrt() << endl;

    return 0;
}

which works in this case, but I think fails if the object receiving the deserialized data has already been initialized, as in the example below:

int main()
{
    Example ex1(4);
    Example ex2(9);

    cout << ex1.get_sqrt() << endl; // outputs 2;
    cout << ex2.get_sqrt() << endl; // outputs 3;


    // the following two blocks should implement ex2 = ex1;


    // save ex1 to archive
    {
        std::ofstream ofs("filename");
        boost::archive::text_oarchive oa(ofs);
        oa << ex1;
    }

    // read it back into ex2
    {
        std::ifstream ifs("filename");
        boost::archive::text_iarchive ia(ifs);
        ia >> ex2;
    }


    // these should be equal now, but aren't,
    // since Example::serialize() doesn't modify sqrt_num
    cout << ex1.get_sqrt() << endl;  // outputs 2;
    cout << ex2.get_sqrt() << endl;  // outputs 3;

    return 0;
}

I'm sure this issue has come up with others, but I have struggled to find any documentation on this particular scenario.

Thanks!

+1  A: 

Splitting your saving and loading methods doesn't mean you have to maintain two copies of your serialization code. You can split them and then join them back again with a common function.

private:
  friend class boost::serialization::access;

  BOOST_SERIALIZATION_SPLIT_MEMBER()

  template <class Archive>
  void save(Archive& ar, const unsigned int version) const {
      const_cast<Example*>(this)->common_serialize(ar, version);
  }

  template <class Archive>
  void load(Archive& ar, const unsigned int version) {
      common_serialize(ar, version);
      sqrt_num = -1;
  }

  template <class Archive>
  void common_serialize(Archive& ar, const unsigned int version) {
      ar & num;
  }

You probably noticed the const_cast. That's an unfortunate caveat to this idea. Although the serialize member function is non-const for saving operations, the save member function needs to be const. As long as the object you're serializing wasn't originally declared const, though, it's safe to cast the constness away as shown above, and on the saving path common_serialize only reads num anyway. The documentation briefly mentions the need to cast for const members; this is similar.
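If the cast bothers you, one possible variation is to make the shared function a static template that takes the object as a parameter, so the same body is instantiated once for a const object (saving) and once for a non-const one (loading). The sketch below is only an illustration under assumptions: ToySaver and ToyLoader are hypothetical stand-ins for Boost's archives so the example compiles without linking Boost, save() and load() are public so the demo can call them directly (with Boost they would stay private behind the friend declaration and BOOST_SERIALIZATION_SPLIT_MEMBER()), and demo_split_round_trip() is a made-up helper. I believe Boost's output archives also accept const operands through operator&, but treat that as something to verify.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>

// Hypothetical stand-ins for Boost archives (not part of Boost).
struct ToySaver {
    std::ostream& os;
    explicit ToySaver(std::ostream& o) : os(o) {}
    template <class T>
    ToySaver& operator&(const T& t) { os << t << ' '; return *this; }
};

struct ToyLoader {
    std::istream& is;
    explicit ToyLoader(std::istream& i) : is(i) {}
    template <class T>
    ToyLoader& operator&(T& t) { is >> t; return *this; }
};

class Example {
public:
    explicit Example(float n = 0.0f) : num(n), sqrt_num(-1.0f) {}

    float get_sqrt() const {
        if (sqrt_num < 0)
            sqrt_num = std::sqrt(num);
        return sqrt_num;
    }

    template <class Archive>
    void save(Archive& ar, unsigned int version) const {
        common_serialize(ar, *this, version);  // Self deduces to const Example
    }

    template <class Archive>
    void load(Archive& ar, unsigned int version) {
        common_serialize(ar, *this, version);  // Self deduces to Example
        sqrt_num = -1.0f;                      // invalidate the cache on load only
    }

private:
    // Single copy of the member list, shared by save() and load().
    template <class Archive, class Self>
    static void common_serialize(Archive& ar, Self& self, unsigned int /*version*/) {
        ar & self.num;
    }

    float num;
    mutable float sqrt_num;
};

// Hypothetical demo: copies ex1 into ex2 through a string stream.
inline float demo_split_round_trip() {
    const Example ex1(4);  // a genuinely const object can be saved, no cast needed
    Example ex2(9);
    ex2.get_sqrt();        // ex2 now caches 3
    std::stringstream ss;
    ToySaver out(ss);
    ex1.save(out, 0);
    ToyLoader in(ss);
    ex2.load(in, 0);
    assert(ex2.get_sqrt() == ex1.get_sqrt());  // cache was recomputed from the loaded num
    return ex2.get_sqrt();
}
```

Calling demo_split_round_trip() from main() exercises both paths. The trade-off is an extra template parameter in exchange for never casting away const.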

With the changes above, your code will correctly print "2" for both ex1 and ex2, and you only have to maintain one copy of the serialization code. The load code only contains code specific to re-initializing the object's internal cache; the save function doesn't touch the cache.
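If you would still rather keep a single serialize(), another route that may be worth trying is to branch on the archive's direction. Boost archives expose is_loading and is_saving as compile-time boolean types, so the cache reset can be limited to loading. The following is a minimal sketch, not a definitive implementation: ToySaver and ToyLoader are hypothetical stand-ins that mimic just operator& and is_loading so the example compiles without linking Boost, and cache_is_valid() and demo_round_trip() are made-up helpers for the demonstration. With real Boost, text_oarchive and text_iarchive would take the place of the toy types and serialize() would stay as written.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <type_traits>

// Hypothetical stand-ins for Boost archives (not part of Boost).
// Like Boost's archives, each exposes an is_loading type and operator&.
struct ToySaver {
    typedef std::false_type is_loading;
    std::ostream& os;
    explicit ToySaver(std::ostream& o) : os(o) {}
    template <class T>
    ToySaver& operator&(const T& t) { os << t << ' '; return *this; }
};

struct ToyLoader {
    typedef std::true_type is_loading;
    std::istream& is;
    explicit ToyLoader(std::istream& i) : is(i) {}
    template <class T>
    ToyLoader& operator&(T& t) { is >> t; return *this; }
};

class Example {
public:
    explicit Example(float n = 0.0f) : num(n), sqrt_num(-1.0f) {}

    float get_sqrt() const {
        if (sqrt_num < 0)
            sqrt_num = std::sqrt(num);
        return sqrt_num;
    }

    // Hypothetical helper, only here so the demo can inspect the cache.
    bool cache_is_valid() const { return sqrt_num >= 0; }

    template <class Archive>
    void serialize(Archive& ar, unsigned int /*version*/) {
        ar & num;
        // Reset the cache only when reading; saving leaves it untouched.
        if (Archive::is_loading::value)
            sqrt_num = -1.0f;
    }

private:
    float num;
    mutable float sqrt_num;
};

// Hypothetical demo: copies ex1 into ex2 through a string stream.
inline float demo_round_trip() {
    Example ex1(4), ex2(9);
    ex1.get_sqrt();  // ex1 now caches 2
    ex2.get_sqrt();  // ex2 now caches 3
    std::stringstream ss;
    ToySaver out(ss);
    ex1.serialize(out, 0);
    assert(ex1.cache_is_valid());  // saving did not discard ex1's cache
    ToyLoader in(ss);
    ex2.serialize(in, 0);
    return ex2.get_sqrt();         // recomputed from the loaded num
}
```

Calling demo_round_trip() from main() runs the save and load in sequence. This keeps the member list in one place and needs neither a split nor a const_cast, at the cost of a direction check inside serialize().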

Rob Kennedy