I'm personally committed to .NET distributed caching solutions, but I think this question is interesting across all platforms.

Is there a distributed caching solution (or generic strategy) that lets me store objects in the cache while maintaining the integrity of the references between them?

To exemplify: suppose I have an object Foo foo that references an object Bar bar, and also an object Foo foo2 that references that same Bar bar. If I load foo into the cache, a copy of bar is stored along with it. If I also load foo2 into the cache, a separate copy of bar is stored along with that. If I change foo.bar in the cache, the change does not affect foo2.bar :(
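The scenario can be reproduced without any cache at all, since a cache put followed by a get amounts to a serialization round trip. The sketch below is only an illustration under that assumption: System.Text.Json stands in for whatever serializer a real cache client uses, and the names RoundTrip, CacheCopyDemo and Value are made up for the example.

```csharp
using System;
using System.Text.Json;

public class Bar { public string Value { get; set; } }
public class Foo { public Bar Bar { get; set; } }

public static class CacheCopyDemo {
    // Stand-in for a cache put followed by a get: the object graph is
    // serialized on the way in and rebuilt from scratch on the way out.
    private static T RoundTrip<T>(T value) {
        return JsonSerializer.Deserialize<T>(JsonSerializer.Serialize(value));
    }

    public static (bool SameInstance, string Foo2BarValue) Run() {
        var bar = new Bar { Value = "original" };
        var foo = new Foo { Bar = bar };
        var foo2 = new Foo { Bar = bar };   // shares bar with foo

        var cachedFoo = RoundTrip(foo);     // carries its own copy of bar
        var cachedFoo2 = RoundTrip(foo2);   // carries another copy of bar

        // Changing one copy leaves the other untouched.
        cachedFoo.Bar.Value = "changed";

        return (ReferenceEquals(cachedFoo.Bar, cachedFoo2.Bar), cachedFoo2.Bar.Value);
    }
}
```

Calling CacheCopyDemo.Run() reports that the two Bar instances are not the same object and that the second one still holds "original", which is exactly the loss of shared identity described above.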

Is there an existing distributed cache solution that will let me load foo, foo2 and bar into the cache while maintaining the foo.bar and foo2.bar references?

+3  A: 

First and foremost

I do not know of any distributed system that does this, and I do not pretend to build one. This post explains how you can simulate the behavior in .NET and C# by using the IObjectReference interface with serializable objects.

Now, let's get on with the show

You can achieve this fairly easily in .NET with the IObjectReference interface. Your implementation of ISerializable.GetObjectData calls SerializationInfo.SetType to name a proxy class that implements IObjectReference. On deserialization the formatter creates the proxy instead of your type, and the proxy (with help from the data your GetObjectData method stored) returns a reference to the real object that should be used.

Example code:

using System;
using System.Runtime.Serialization;

[Serializable]
internal sealed class SerializationProxy<TOwner, TKey> : ISerializable, IObjectReference {
    private const string KeyName = "Key";
    private const string InstantiatorName = "Instantiator";
    private static readonly Type thisType = typeof(SerializationProxy<TOwner, TKey>);
    private static readonly Type keyType = typeof(TKey);

    private static readonly Type instantiatorType = typeof(Func<TKey, TOwner>);
    private readonly Func<TKey, TOwner> _instantiator;
    private readonly TKey _key;

    private SerializationProxy() {
    }

    private SerializationProxy(SerializationInfo info, StreamingContext context) {
        if (info == null) throw new ArgumentNullException("info");

        _key = (TKey)info.GetValue(KeyName, keyType);
        _instantiator = (Func<TKey, TOwner>)info.GetValue(InstantiatorName, instantiatorType);
    }

    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        throw new NotSupportedException("This type should never be serialized.");
    }

    object IObjectReference.GetRealObject(StreamingContext context) {
        return _instantiator(_key);
    }

    internal static void PrepareSerialization(SerializationInfo info, TKey key, Func<TKey, TOwner> instantiator) {
        if (info == null) throw new ArgumentNullException("info");
        if (instantiator == null) throw new ArgumentNullException("instantiator");

        info.SetType(thisType);
        info.AddValue(KeyName, key, keyType);
        info.AddValue(InstantiatorName, instantiator, instantiatorType);
    }
}

You would call this from your GetObjectData method as SerializationProxy<TOwner, TKey>.PrepareSerialization(info, myKey, key => LoadedInstances.GetById(key)). Your LoadedInstances.GetById should return the instance from a Dictionary<TKey, WeakReference>, or load it from the cache/database if it isn't already loaded.

EDIT:

I've written some example code to show what I mean.

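using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

public static class Program {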
 public static void Main() {
  // Create an item and serialize it.
  // Pretend that the bytes are stored in some magical
  // domain where everyone lives happily ever after.
  var item = new Item { Name = "Bleh" };
  var bytes = Serialize(item);

  {
   // Deserialize those bytes back into the cruel world.
   var loadedItem1 = Deserialize<Item>(bytes);
   var loadedItem2 = Deserialize<Item>(bytes);

   // This should work since we've deserialized identical
   // data twice.
   Debug.Assert(loadedItem1.Id == loadedItem2.Id);
   Debug.Assert(loadedItem1.Name == loadedItem2.Name);

   // Notice that both variables refer to the same object.
   Debug.Assert(ReferenceEquals(loadedItem1, loadedItem2));

   loadedItem1.Name = "Bluh";
   Debug.Assert(loadedItem1.Name == loadedItem2.Name);
  }

  {
   // Deserialize those bytes back into the cruel world. (Once again.)
   var loadedItem1 = Deserialize<Item>(bytes);

   // Notice that we got the same item that we messed
   // around with earlier.
   Debug.Assert(loadedItem1.Name == "Bluh");

   // Once again, force the peaceful object to hide its
   // identity, and take on a fake name.
   loadedItem1.Name = "Blargh";

   var loadedItem2 = Deserialize<Item>(bytes);
   Debug.Assert(loadedItem1.Name == loadedItem2.Name);
  }
 }

 #region Serialization helpers
 private static readonly IFormatter _formatter
  = new BinaryFormatter();

 public static byte[] Serialize(ISerializable item) {
  using (var stream = new MemoryStream()) {
   _formatter.Serialize(stream, item);
   return stream.ToArray();
  }
 }

 public static T Deserialize<T>(Byte[] bytes) {
  using (var stream = new MemoryStream(bytes)) {
   return (T)_formatter.Deserialize(stream);
  }
 }
 #endregion
}

// Supercalifragilisticexpialidocious interface.
public interface IDomainObject {
 Guid Id { get; }
}

// Holds all loaded instances using weak references, allowing
// the almighty garbage collector to grab our stuff at any time.
// I have no real data to lean on here, but I _presume_ that this
// won't be too evil since we use weak references.
public static class LoadedInstances<T>
 where T : class, IDomainObject {

 private static readonly Dictionary<Guid, WeakReference> _items
  = new Dictionary<Guid, WeakReference>();

 public static void Set(T item) {
  // The indexer adds the entry or overwrites an existing one.
  _items[item.Id] = new WeakReference(item);
 }

 public static T Get(Guid id) {
  // A collected target yields null, the same as a missing entry.
  WeakReference itemRef;
  return _items.TryGetValue(id, out itemRef) ? (T)itemRef.Target : null;
 }
}

[DebuggerDisplay("{Id} {Name}")]
[Serializable]
public class Item : IDomainObject, ISerializable {
 public Guid Id { get; private set; }
 public String Name { get; set; }

 // This constructor can be avoided if you have a 
 // static Create method that creates and saves new items.
 public Item() {
  Id = Guid.NewGuid();
  LoadedInstances<Item>.Set(this);
 }

 #region ISerializable Members
 public void GetObjectData(SerializationInfo info, StreamingContext context) {
  // Redirect serialization to SerializationProxy, which calls
  // GetById(this.Id) when this item is deserialized. Notice that we
  // have no deserialization constructor. FxCop will hate us for that.
  SerializationProxy<Item, Guid>.PrepareSerialization(info, Id, GetById);
 }
 #endregion

 public static Item GetById(Guid id) {
  var alreadyLoaded = LoadedInstances<Item>.Get(id);
  if (alreadyLoaded != null)
   return alreadyLoaded;

  // TODO: Load from storage container (database, cache).
  // TODO: The item we load should be passed to LoadedInstances<Item>.Set
  return null;
 }
}
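
The two TODOs in GetById could be filled in roughly as follows. This is only a sketch under assumptions: IdentityMap and the loadFromStore delegate are hypothetical names, and the delegate stands in for whatever call fetches and deserializes an item from your cache or database. It also takes a lock, which the sample above does not, so that two threads cannot load the same id at once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical identity map: at most one live instance per id in this process.
public sealed class IdentityMap<T> where T : class {
    private readonly Dictionary<Guid, WeakReference<T>> _items
        = new Dictionary<Guid, WeakReference<T>>();
    private readonly Func<Guid, T> _loadFromStore; // assumed cache/database fetch
    private readonly object _gate = new object();

    public IdentityMap(Func<Guid, T> loadFromStore) {
        if (loadFromStore == null) throw new ArgumentNullException("loadFromStore");
        _loadFromStore = loadFromStore;
    }

    public T GetById(Guid id) {
        // Coarse lock: simple, but it also serializes loads of different ids.
        lock (_gate) {
            WeakReference<T> weak;
            T live;
            if (_items.TryGetValue(id, out weak) && weak.TryGetTarget(out live))
                return live; // already loaded and still alive

            // Not loaded (or already collected): fetch and remember it.
            var loaded = _loadFromStore(id);
            if (loaded != null)
                _items[id] = new WeakReference<T>(loaded);
            return loaded;
        }
    }
}
```

With something like this in place, Item.GetById would simply forward to a shared IdentityMap<Item>, and every deserialization of the same id within one process resolves to the same instance for as long as something still references it.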
Simon Svensson
Simon, thank you for the elaborate reply. I'm afraid it went a bit over my head. Could you explain how the serialization proxy relates to the distributed cache where I intend to store the objects?
urig