I have a working ChangeTrackingList implementation which does its job just fine, but I'd like to "filter" its contents when sending it from the client back to the server so that it only includes the changes. Getting the changes is easy, as my list exposes a GetChanges method for just that purpose. How can I interrupt the DataContractSerializer and substitute List.GetChanges() where List used to be?

More detail: Consider a parent/child relationship in which I have a single parent with multiple children, each of which has a reference back to the parent, such as Customer/Orders. Serializing the entire child list over to the client app is fine, since I need to show all the orders. I don't want to bring all the orders back over to the server when I save, however, just the changes.

Difficulty: I have already looked at implementing ISerializable, and implementing my own GetObjectData, which wouldn't be very hard if not for the fact that I need to preserve object references as well. If I point a DataContractSerializer at my graph, and enable PreserveObjectReferences (either by adding a behavior, or explicitly through the constructor), I'll get a very nice graph with no duplications, but it will want to include my entire ChangeTrackingList. If I implement ISerializable, I can write out my ChangeTrackingList manually, and include only the changes, but those child objects won't know their parent references anymore.

Clarification: This is a highly simplified example meant to illustrate the problem. I'm not looking for alternate solutions to this particular problem. My real-world problem does not involve customers, orders, or line items in any way. Alternate solutions to the Customer/Order problem are not the answers I'm looking for.

I am, quite simply, looking for a way to serialize only the "interesting" parts of an object graph. The mechanism for identifying and filtering down to "interesting" is already done, I just need a way to substitute those parts into the object graph during serialization.

Another Example: Let's say we have a "Person" entity with a collection of "Phone" entities below it. Phones are invalid without a parent, and the business rules say a Person is invalid without at least one phone. I cannot simply save the person, and then the phones in two separate calls, because each call would be an error. I have to save them as a graph in one single call. Later, if I update the Person to change their address, which is stored on Person, and add a new phone number as well, I need to send the updated person as well as the changes to the list of phones. I don't want to send the original phone because it hasn't changed.

It's a contrived example again, but one that more closely resembles the real-life problem. I have parents which are invalid without at least one child, and no child is valid without a parent.

Update: It looks like the "substitution" I want to perform can be done by way of a DataContractSurrogate class. I've seen a few examples of this, and they are relatively straightforward, but that's because their examples are as well. They are usually of the "Swap EmployeeSurrogate for Employee" variety, where "Employee" is some non-serializable class. In my case things get weirder because the class I want to swap is a generic type.

So, a simpler version of the problem might be this. Let's say I have a totally non-serializable MyList class. (And before someone suggests it, replacing the MyList class is not a valid solution, either. Remember folks, it's just an example.) I want to set up a DataContractSurrogate so that whenever a MyList appears in my object graph, it is converted to a simple array for serialization.

Does THIS sound like a valid direction to anyone? Has anyone ever tried to surrogate a generic type? Am I insane for even considering it?

A: 

You are approaching this the wrong way. You want to have one set of serialization behavior when you send the list to the client, and another set of behavior when you send it back to the server.

You should have explicit code on the client which will call GetChanges and then send that trimmed list back to the server.


Since you seem to have one root with only a fraction of children changed, and you only want to send back those children that have changed, you will need to create a new type that has a list of the children that you want changed, instead of the whole object graph.

In other words, you need to create a List<T> (or some other appropriate container) and serialize THAT back to the server. The thing is, you won't have the fully rehydrated object, only the children that were changed.

If you did need the fully rehydrated object, then you should send back the entire object graph regardless.

casperOne
I can explicitly send back changes, no problem. My problem is in sending back only the interesting parts of an entire graph. For example, the user has added several line items to an existing order, resulting in changes to the parent, as well as the addition of several new children, and possibly the modification of existing children. This save needs to be atomic, so while I can explicitly send the parent changes, as well as the changes to the children, I need to do it all in one shot. This example is highly simplified, of course. The real-world problem is a bit more... complicated.
Mel
@Mel: But aren't all the objects that are changed connected to each other in the object graph? If so, I don't see the problem with sending only the objects which have changes in the graph among a list of objects. If you are talking about trying to send back only the parent and/or child, that's a different story; you will need a new container to send that information back in.
casperOne
Yes, the children are connected to their parent, but not ALL the children have changed. I am saving the parent, in this case, and only want to serialize the children that have changed. I'll expand the question to include another example.
Mel
A: 

I've been wrestling with a similar issue. I think @casperOne is on the right track. The communication from the server to the client should contain the full object graph in one big chunk. But changes to the full object graph should be done with small helper methods that exchange a limited subset of data.

For instance, you'd get the full Customer/Orders graph using a ReadFullCustomer() operation, then call back to the server with an AddOrder() method, passing just the order information needed for that one task. Let your server's domain logic handle the bit about keeping the parent record consistent with that of its children.

Rory Primrose has an article about WCF service contract design that talks about chunky vs. chatty interfaces. In your case, it sounds like you need a combination of both: a chunky read operation with chatty modification operations.


EDIT: I'm not sure I've grasped your scenario fully, @Mel. It sounds like you want to change the object graph during serialization. The only way I know to do that would be to have the classes in your graph implement IXmlSerializable and roll your own WriteXml and ReadXml methods. The DataContractSerializer understands classes that implement IXmlSerializable, so it might be worth a try.

dthrasher
I agree, and in most cases, this is what I have done. As I pointed out, the Customer/Order example is simple and contrived. I really do have a real-world situation in which I actually do need to send a graph, only 5% of which will have changed, and I'm trying to create the general purpose solution to this problem so that I don't have to solve it again later. I could also "walk" the tree and make a copy that only includes the changes, but I'd much rather solve it via the serialization so that it will apply everywhere.
Mel
A: 

The solution is to recognize that you have two different graphs: the graph of the original objects, and the change-graph. These are not of the same type. The original will have an object of type Customer, but the other will have an object of type ChangedCustomer. Customer may have a collection of Orders, but ChangedCustomer would have a collection of ChangedOrders, etc.

Your GetChanges operation will return a list of ChangedCustomer, not a list of Customer.

John Saunders
I think this would fall under the heading of creating an entire alternate object graph, which I am trying to avoid. I could create something like a visitor to walk the graph and do the pruning, and then pass the new root object to the service. I'm just trying to create a much more "hands off" solution, so that we don't have as many moving parts. I can make ChangeTrackingList implement ISerializable, and do the pruning during custom serialization, which would be THE solution if not for the reference preservation problem. I have two pieces of the answer, but they don't fit together.
Mel
They don't fit because you are trying to make this simpler than possible. Remember - as simple as possible, but no more than that.
John Saunders
By "two pieces" I am referring to custom serialization, in which case I can decide exactly how a ChangeTrackingList should be serialized, and write out only the changes easily. The second piece is DataContractSerializer, which knows how to preserve object references to avoid circular reference problems. I could derive from or re-implement DataContractSerializer, for example, and solve the problem that way, but I was really hoping someone knew a way to inject code into the DCS "pipeline". It's a much simpler solution, I would think.
Mel
Strike that, DCS is sealed, brain-fart.
Mel
+1  A: 

Okay, at last I have something that works, and wanted to share the answer. Unfortunately I can't simply share the code since it was written on the client's time.

The key to the solution is the DataContractSurrogate class, which you can read about here and here. Ordinarily, you would use this to provide a stand in for an unserializable class in an object graph.

Pretend for purposes of illustration that we have a MyList class that is not serializable. We need to create a surrogate class to act as its "stand-in". The naming conventions here are pretty horrible, since the one called MyListSurrogate is really more of a factory, and the one called MyListSurrogated is what I would consider to be the actual surrogate. Anyway, the "surrogated" class exposes a plain array or List, and is marked as a DataContract. This is the class that will go over the wire. The MyListSurrogate class implements IDataContractSurrogate, and implements four important methods.

The GetDataContractType method returns the stand-in type given the entity type. When given the type MyList, it should return MyListSurrogated. Any other type should just return the original type. This method involves some messy reflection, and so I'll include the code for that rather than just explain it.

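public Type GetDataContractType(Type type)
{
    // Swap any closed MyList<T> for the matching MyListSurrogated<T>.
    if (type.IsGenericType
        && type.GetGenericTypeDefinition() == typeof(MyList<>))
    {
        var itemType = type.GetGenericArguments()[0];
        // typeof(MyListSurrogated<>) is already the open generic definition,
        // so it can be closed over the item type directly.
        return typeof(MyListSurrogated<>).MakeGenericType(itemType);
    }
    // Every other type is serialized as itself.
    return type;
}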

In a similar fashion, GetObjectToSerialize turns a MyList instance into a MyListSurrogated instance. And GetDeserializedObject turns a MyListSurrogated instance back into a MyList instance. IDataContractSurrogate includes several other methods that we will not be using, and I've just returned null for most of them.
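Since the original code can't be shared, here is a rough sketch of what those two conversions might look like. Everything in it is an assumption made for illustration: MyList<T> and MyListSurrogated<T> are minimal stand-ins, and MyListConversions, ToSurrogate, FromSurrogate, Wrap and Unwrap are hypothetical names. The awkward part is the same as in GetDataContractType: the item type is only known at run time, so the generic helpers have to be closed via reflection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical stand-in for the non-serializable list type.
public class MyList<T>
{
    private readonly List<T> items = new List<T>();
    public void Add(T item) { items.Add(item); }
    public IEnumerable<T> Items { get { return items; } }
}

// The stand-in that actually goes over the wire: a plain array in a DataContract.
[DataContract]
public class MyListSurrogated<T>
{
    [DataMember]
    public T[] Items { get; set; }
}

public static class MyListConversions
{
    // Body for GetObjectToSerialize: MyList<T> becomes MyListSurrogated<T>,
    // anything else passes through untouched.
    public static object ToSurrogate(object obj)
    {
        return Convert(obj, typeof(MyList<>), "Wrap");
    }

    // Body for GetDeserializedObject: MyListSurrogated<T> becomes MyList<T>.
    public static object FromSurrogate(object obj)
    {
        return Convert(obj, typeof(MyListSurrogated<>), "Unwrap");
    }

    private static object Convert(object obj, Type openType, string helperName)
    {
        if (obj == null) return null;
        var type = obj.GetType();
        if (!type.IsGenericType || type.GetGenericTypeDefinition() != openType)
            return obj;

        // Close the generic helper over the run-time item type and call it.
        var itemType = type.GetGenericArguments()[0];
        return typeof(MyListConversions)
            .GetMethod(helperName, BindingFlags.NonPublic | BindingFlags.Static)
            .MakeGenericMethod(itemType)
            .Invoke(null, new[] { obj });
    }

    private static MyListSurrogated<T> Wrap<T>(MyList<T> list)
    {
        return new MyListSurrogated<T> { Items = list.Items.ToArray() };
    }

    private static MyList<T> Unwrap<T>(MyListSurrogated<T> surrogate)
    {
        var list = new MyList<T>();
        foreach (var item in surrogate.Items ?? new T[0]) list.Add(item);
        return list;
    }
}
```

In the surrogate class itself, GetObjectToSerialize(object obj, Type targetType) would then just return MyListConversions.ToSurrogate(obj), and GetDeserializedObject(object obj, Type targetType) would return MyListConversions.FromSurrogate(obj).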

An instance of the MyListSurrogate class can then be passed to the DataContractSerializer constructor, and it will be called during serialization and deserialization to substitute types as needed on the fly. Unfortunately, there is no simple way to specify a surrogate class through configuration, so if you're not instantiating your own DataContractSerializer, you'll have to implement a DataContractSerializerOperationBehavior, which you can read about here. Applying the operation behavior is similar to the method described here.
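For anyone wiring this up, a sketch of such an operation behavior might look like the following. It targets the .NET Framework WCF assemblies (System.ServiceModel and System.Runtime.Serialization). SurrogateOperationBehavior and SurrogateBehaviorInstaller are names made up for this example, and this is not the original code; it simply builds the serializer with reference preservation switched on and the surrogate plugged in.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel.Description;
using System.Xml;

// Hypothetical behavior: creates serializers that preserve object references
// and consult the supplied surrogate.
public class SurrogateOperationBehavior : DataContractSerializerOperationBehavior
{
    private readonly IDataContractSurrogate surrogate;

    public SurrogateOperationBehavior(OperationDescription operation,
                                      IDataContractSurrogate surrogate)
        : base(operation)
    {
        this.surrogate = surrogate;
    }

    public override XmlObjectSerializer CreateSerializer(
        Type type, string name, string ns, IList<Type> knownTypes)
    {
        return new DataContractSerializer(
            type, name, ns, knownTypes,
            MaxItemsInObjectGraph,
            IgnoreExtensionDataObject,
            true,        // preserveObjectReferences
            surrogate);
    }

    public override XmlObjectSerializer CreateSerializer(
        Type type, XmlDictionaryString name, XmlDictionaryString ns,
        IList<Type> knownTypes)
    {
        return new DataContractSerializer(
            type, name, ns, knownTypes,
            MaxItemsInObjectGraph,
            IgnoreExtensionDataObject,
            true,        // preserveObjectReferences
            surrogate);
    }
}

public static class SurrogateBehaviorInstaller
{
    // Swap the default serializer behavior on every operation of a contract.
    public static void Apply(ContractDescription contract,
                             IDataContractSurrogate surrogate)
    {
        foreach (OperationDescription operation in contract.Operations)
        {
            var existing = operation.Behaviors
                .Find<DataContractSerializerOperationBehavior>();
            if (existing != null) operation.Behaviors.Remove(existing);
            operation.Behaviors.Add(
                new SurrogateOperationBehavior(operation, surrogate));
        }
    }
}
```

The same Apply call has to run on both the client endpoint's contract and the service endpoint's contract, otherwise one side will not know how to read what the other side wrote.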

To use surrogates, you will have to deal with some known type problems. In my example, the MyListSurrogated type would have to be added to the list of known types for the service you are trying to call. You can either do this manually, or in my case as part of a code generation template.

One last thing. My original goal was to create a ChangeTrackingList that would carry its entire contents from server to client, but only the changes from client to server. This is actually really simple once the surrogate has been mapped. My ChangeTrackingList has a boolean property called "IncludeOriginalsWhenSerializing" that defaults to true. The surrogate class' GetObjectToSerialize method looks at this flag to decide whether or not to copy the original items into the stand-in. On the other end of the line, the GetDeserializedObject method recreates the original ChangeTrackingList, and then sets the flag to false during deserialization. So, a list with the flag set to true goes in one end, and comes out the other end with the flag set to false. It's simple, it's automatic, it's finished.
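The flag logic can be sketched roughly as follows. This is a deliberately simplified stand-in, not the real ChangeTrackingList: it only tracks additions, and AddOriginal, All, ItemsForWire and Rebuild are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in: tracks only newly added items as "changes".
public class ChangeTrackingList<T>
{
    private readonly List<T> originals = new List<T>();
    private readonly List<T> added = new List<T>();

    public ChangeTrackingList() { IncludeOriginalsWhenSerializing = true; }

    // Defaults to true, so a freshly built list sends everything.
    public bool IncludeOriginalsWhenSerializing { get; set; }

    public void AddOriginal(T item) { originals.Add(item); }
    public void Add(T item) { added.Add(item); }

    public IEnumerable<T> GetChanges() { return added; }
    public IEnumerable<T> All { get { return originals.Concat(added); } }
}

public static class ChangeTrackingSurrogateLogic
{
    // What GetObjectToSerialize would copy into the stand-in.
    public static T[] ItemsForWire<T>(ChangeTrackingList<T> list)
    {
        var items = list.IncludeOriginalsWhenSerializing
            ? list.All
            : list.GetChanges();
        return items.ToArray();
    }

    // What GetDeserializedObject would rebuild on the receiving side:
    // the received items become the baseline, and the flag is cleared so
    // the return trip carries only changes.
    public static ChangeTrackingList<T> Rebuild<T>(T[] items)
    {
        var list = new ChangeTrackingList<T>();
        foreach (var item in items) list.AddOriginal(item);
        list.IncludeOriginalsWhenSerializing = false;
        return list;
    }
}
```

With this shape, a server-side list holding two phones sends both to the client. The client's rebuilt list has the flag cleared, so after adding a third phone only that one item is copied into the stand-in on the way back.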

Mel