Hello,

Soliciting feedback/opinions/comments regarding a "best" pattern to use for reference data in my services.

What do I mean by reference data?

Let's use Northwind as an example. An Order is related to a Customer in the database. When I implement my Orders service, in some cases I'll want to reference a "full" Customer from an Order, and in other cases I'll just want a reference to the Customer (for example, a Key/Value pair).

For example, if I were doing a GetAllOrders(), I wouldn't want to return a fully filled out Order, I'd want to return a lightweight version of an Order with only reference data for each order's Customer. If I did a GetOrder() method, though, I'd probably want to fill in the Customer details because chances are a consumer of this method might need it. There might be other situations where I might want to ask that the Customer details be filled in during certain method calls, but left out for others.

Here is what I've come up with:

using System.Runtime.Serialization;

[DataContract]
public class OrderDTO
{
    [DataMember(IsRequired = true)]
    public CustomerDTO Customer;

    //etc..
}

[DataContract]
public class CustomerDTO
{
    //always present: enough to look the customer up later
    [DataMember(IsRequired = true)]
    public ReferenceInfo ReferenceInfo;

    //only filled in when the full customer is requested
    [DataMember(IsRequired = false)]
    public CustomerInfo CustomerInfo;
}

[DataContract]
public class ReferenceInfo
{
    [DataMember(IsRequired = true)]
    public string Key;

    [DataMember(IsRequired = true)]
    public string Value;
}

[DataContract]
public class CustomerInfo
{
    [DataMember(IsRequired = true)]
    public string CustomerID;

    [DataMember(IsRequired = true)]
    public string Name;

    //etc....
}

The thinking here is that since ReferenceInfo (which is a generic Key/Value pair) is always required in CustomerDTO, I'll always have ReferenceInfo. It gives me enough information to obtain the Customer details later if needed. The downside to having CustomerDTO require ReferenceInfo is that it might be overkill when I am getting the full CustomerDTO (i.e. with CustomerInfo filled in), but at least I am guaranteed the reference info.

Is there some other pattern or framework piece I can use to make this scenario/implementation "cleaner"?

I ask because, although always returning a full CustomerDTO might work fine in the simplistic Northwind situation, in my case I have an object with 25-50 fields that are reference/lookup-type data. Some are more important to load than others in different situations, but I'd like to have as few definitions of these reference types as possible (so that I don't get into "DTO maintenance hell").

Opinions? Feedback? Comments?

Thanks!

+1  A: 

It seems like a complicated solution to me. Why not just have a customer ID field in the OrderDTO class and let the application decide at runtime whether it needs the customer data? Since it has the customer ID, it can pull the data down whenever it decides to.

sipwiz
I will expose the ability to get Customer data given a CustomerID, but I'm trying to avoid a "chatty" interface.
Brian
Elaborating on what I mean by a "chatty" interface: suppose I display the results of GetAllOrders() in a grid. One column in the grid is "Customer Name". I wouldn't want to have to call GetCustomer() for every order I display in the grid. In addition, I don't want to break my contract if a new requirement states that, besides Customer Name, we also need to display, for example, Customer Phone Number. I'd just want to be able to switch my method call to something like GetAllOrders(expandReference: true). Note: I'm not sure whether Phone No is actually in Northwind.
Brian
So you would have logic that returns partial customer details based on the key-value pairs in CustomerDTO? It's taken me a while to grasp your mechanism, which could be a sign it's overly complex or that I'm slow. If it were me, I'd definitely try to do it with only two classes: Order and Customer. Adding an extra method to your interface, say GetAllOrdersPartial(), seems a much lesser evil than creating additional, somewhat obscure classes. If I had to maintain your code, I'd prefer a few extra method calls over working out some tricky optimisation for the same effect.
sipwiz
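For anyone following the comment thread above, here is a rough sketch of the flag-based variant Brian describes. It is only an illustration under assumptions: OrderSummary, CustomerDetails, the in-memory Rows array and its sample values are all made up, and a real WCF service would add [DataContract]/[DataMember] attributes and a real data source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical detail object: only populated when the caller asks for it.
public class CustomerDetails
{
    public string Name;
    public string Phone;
}

// Hypothetical lightweight order returned by the list operation.
public class OrderSummary
{
    public int OrderId;
    public string CustomerId;        // always present: enough to fetch the customer later
    public string CustomerName;      // always present: covers the "grid column" case
    public CustomerDetails Customer; // null unless expandReference is true
}

public class OrderService
{
    // Made-up rows standing in for the database.
    private static readonly (int OrderId, string CustomerId, string Name, string Phone)[] Rows =
    {
        (1, "C1", "Customer One", "555-0100"),
        (2, "C2", "Customer Two", "555-0101"),
    };

    // One operation, two shapes: the flag decides whether details are expanded.
    public List<OrderSummary> GetAllOrders(bool expandReference = false)
    {
        return Rows.Select(r => new OrderSummary
        {
            OrderId = r.OrderId,
            CustomerId = r.CustomerId,
            CustomerName = r.Name,
            Customer = expandReference
                ? new CustomerDetails { Name = r.Name, Phone = r.Phone }
                : null,
        }).ToList();
    }
}
```

Calling GetAllOrders() gives the light version for a grid; GetAllOrders(expandReference: true) fills in Customer without a second round trip. sipwiz's alternative would be the same body split into two methods, GetAllOrders() and GetAllOrdersPartial(), which trades the flag for a slightly wider interface.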
+1  A: 

I've decided against the approach I was going to take. I think many of my initial concerns were the result of a lack of requirements. I sort of expected this to be the case, but was curious to see how others might have tackled the issue of determining when to load certain data and when not to.

I am flattening my Data Contract to contain the most used fields of reference data elements. This should work for a majority of consumers. If the supplied data is not enough for a given consumer, they'll have the option to query a separate service to pull back the full details for a particular reference entity (for example a Currency, State, etc). For simple lookups that really are basically Key/Value pairs, we'll be handling them with a generic Key/Value pair Data Contract. I might even use the KnownType attribute for my more specialized Key/Value pairs.

using System;
using System.Runtime.Serialization;

[DataContract]
public class OrderDTO
{
    [DataMember(IsRequired = true)]
    public CustomerDTO Customer;

    //in this case, I think consumers will need currency data, 
    //so I pass back a full currency item
    [DataMember(IsRequired = true)]
    public Currency Currency; 

    //in this case, I think consumers are not likely to need full StateRegion data, 
    //so I pass back a "reference" to it
    //Users can call a separate service method to get full details if needed, or 
    [DataMember(IsRequired = true)]
    public KeyValuePair ShipToStateRegion;

    //etc..
}


//all possible reference types
public enum ReferenceType
{
    Currency,
    StateRegion,
    Country
    //etc..
}


[DataContract]
[KnownType(typeof(Currency))]
public class KeyValuePair
{
    [DataMember(IsRequired = true)]
    public string Key;

    [DataMember(IsRequired = true)]
    public string Value;

    //enum consisting of all possible reference types, 
    //such as "Currency", "StateRegion", "Country", etc.
    [DataMember(IsRequired = true)]
    public ReferenceType ReferenceType; 
}


[DataContract]
public class Currency : KeyValuePair
{
    [DataMember(IsRequired = true)]
    public decimal ExchangeRate;

    [DataMember(IsRequired = true)]
    public DateTime ExchangeRateAsOfDate;
}


[DataContract]
public class CustomerDTO 
{
    [DataMember(IsRequired = true)]
    public string CustomerID;

    [DataMember(IsRequired = true)]
    public string Name;

    //etc....
}
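To show what the KnownType attribute buys here, a minimal sketch follows. It trims the contracts down to a few fields and adds a hypothetical KnownTypeDemo.RoundTrip helper (not part of the design above) that serializes a value through the base KeyValuePair contract with DataContractSerializer.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
[KnownType(typeof(Currency))] // lets a Currency travel wherever a KeyValuePair is declared
public class KeyValuePair
{
    [DataMember(IsRequired = true)] public string Key;
    [DataMember(IsRequired = true)] public string Value;
}

[DataContract]
public class Currency : KeyValuePair
{
    [DataMember(IsRequired = true)] public decimal ExchangeRate;
}

// Hypothetical helper, only here to exercise the contracts.
public static class KnownTypeDemo
{
    // Serializes the item as a KeyValuePair and reads it back.
    public static KeyValuePair RoundTrip(KeyValuePair item)
    {
        var serializer = new DataContractSerializer(typeof(KeyValuePair));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, item);
            stream.Position = 0;
            return (KeyValuePair)serializer.ReadObject(stream);
        }
    }
}
```

Passing a Currency to RoundTrip should hand back a Currency instance with ExchangeRate intact. Without the KnownType attribute on the base class, the serializer would reject the derived type when it appears in a KeyValuePair slot.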

Thoughts? Opinions? Comments?

Brian
This sounds like a reasonable compromise, but be prepared for quite a bit of tuning and tweaking until you figure out what the "common subset" of properties should be. And if this project is primarily for reporting or analysis purposes, there's always the possibility that there won't be a common subset at all. And that leads you right back to needing multiple serializations again.
dthrasher
@Brian- Even though this is an old question I've added an answer regarding how Amazon does it, as you may find it useful.
RichardOD
+1  A: 

We're at the same decision point on our project. As of right now, we've decided to create three levels of DTOs to handle a Thing: SimpleThing, ComplexThing, and FullThing. We don't know how it'll work out for us, though, so this is not yet an answer grounded in reality.

One thing I'm wondering is whether we might learn that our services are designed at the "wrong" level. For example, is there ever an instance where we should bust a FullThing apart and only pass a SimpleThing? If we do, does that imply we've inappropriately put some business logic at too high a level?

John Deters
+1  A: 

We've faced this problem in object-relational mapping as well. There are situations where we want the full object and others where we want a reference to it.

The difficulty is that by baking the serialization into the classes themselves, the DataContract pattern enforces the idea that there's only one right way to serialize an object. But there are lots of scenarios where you might want to partially serialize a class and/or its child objects.

This usually means that you have to have multiple DTOs for each class. For example, a FullCustomerDTO and a CustomerReferenceDTO. Then you have to create ways to map the different DTOs back to the Customer domain object.

As you can imagine, it's a ton of work, most of it very tedious.
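To give a feel for that tedium, here is a small sketch of the mapping layer for just one class. The Customer domain class, its fields, and CustomerMapper are assumptions made up for illustration; only the two DTO names come from the text above.

```csharp
using System;

// Hypothetical domain object.
public class Customer
{
    public string CustomerId;
    public string Name;
    public string Phone;
}

// Reference-only serialization: identity plus a display name.
public class CustomerReferenceDTO
{
    public string CustomerId;
    public string Name;
}

// Full serialization: every field of the domain object.
public class FullCustomerDTO
{
    public string CustomerId;
    public string Name;
    public string Phone;
}

public static class CustomerMapper
{
    public static CustomerReferenceDTO ToReference(Customer c) =>
        new CustomerReferenceDTO { CustomerId = c.CustomerId, Name = c.Name };

    public static FullCustomerDTO ToFull(Customer c) =>
        new FullCustomerDTO { CustomerId = c.CustomerId, Name = c.Name, Phone = c.Phone };

    public static Customer FromFull(FullCustomerDTO dto) =>
        new Customer { CustomerId = dto.CustomerId, Name = dto.Name, Phone = dto.Phone };

    // A reference carries only identity, so the rest must be loaded from somewhere;
    // the caller supplies the lookup (for example a repository call).
    public static Customer FromReference(CustomerReferenceDTO dto, Func<string, Customer> load) =>
        load(dto.CustomerId);
}
```

Four mapping methods for one three-field class; multiply by every entity and every "level" of DTO and the maintenance cost becomes clear.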

dthrasher
+1  A: 

One other possibility is to treat the objects as property bags. Specify the properties you want when querying, and get back exactly the properties you need.

Changing the properties to show in the "short" version then won't require multiple round trips, you can get all of the properties for a set at one time (avoiding chatty interfaces), and you don't have to modify your data or operation contracts if you decide you need different properties for the "short" version.
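A minimal sketch of the property-bag idea, under assumptions: the class name PropertyBagOrderService, the dotted property names such as "Customer.Name", and the in-memory sample data are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PropertyBagOrderService
{
    // Made-up data store: each order is a bag of named values.
    private readonly List<Dictionary<string, object>> orders = new List<Dictionary<string, object>>
    {
        new Dictionary<string, object>
        {
            ["OrderId"] = 1,
            ["Customer.Id"] = "C1",
            ["Customer.Name"] = "Customer One",
            ["Customer.Phone"] = "555-0100",
        },
    };

    // Returns, for every order, only the properties the caller asked for.
    // Unknown property names are silently skipped in this sketch.
    public List<Dictionary<string, object>> GetAllOrders(params string[] properties)
    {
        return orders
            .Select(order => properties
                .Distinct()
                .Where(name => order.ContainsKey(name))
                .ToDictionary(name => name, name => order[name]))
            .ToList();
    }
}
```

A grid that shows only the customer name would call GetAllOrders("OrderId", "Customer.Name"); adding a phone column later means adding "Customer.Phone" to that call, with no change to the data or operation contract.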

kyoryu
Treating properties as objects themselves is an idea I had been pushing. I think this would have allowed the flexibility to pull whichever properties I wanted. That said, it introduces a level of complexity into the system, as we are no longer dealing with objects that have native-typed fields, but rather with a collection of "field objects". The abstraction gives you flexibility, but at the cost of complexity (and possibly maintainability). In my particular situation, the majority of the other developers found this solution too complex to grasp.
Brian
The two aren't necessarily exclusive. Your public-facing APIs could have statically typed fields, while using a property bag as the underlying implementation. You could also expose (directly or indirectly) the bag to allow for dynamic addition of "fields" while still maintaining static properties for the known static stuff.
kyoryu
To be more specific, there's no reason that the WCF contract has to match the API used by consumers of the object, and in fact, I'd generally suggest that you'll be better off in the long run if you make that explicit up front.
kyoryu
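As a rough illustration of the comments above, here is one way a statically typed wrapper could sit on top of a property bag. OrderView, its property names, and the generic Get<T> accessor are hypothetical; the bag is just a dictionary in this sketch.

```csharp
using System;
using System.Collections.Generic;

// Typed facade for consumers; the wire format underneath stays a plain bag.
public class OrderView
{
    private readonly IDictionary<string, object> bag;

    public OrderView(IDictionary<string, object> bag)
    {
        this.bag = bag;
    }

    // Static properties for the fields everyone knows about.
    public int OrderId => (int)bag["OrderId"];
    public string CustomerName => Get<string>("Customer.Name");

    // Escape hatch for "fields" added later; returns default(T) when absent.
    public T Get<T>(string name) =>
        bag.TryGetValue(name, out var value) ? (T)value : default(T);
}
```

Most callers only ever touch OrderId and CustomerName and never see the bag, which may address the "too complex to grasp" concern, while Get<T> still allows new properties without a contract change.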
+1  A: 

The Amazon Product Advertising API web service is a good example of the same problem you are experiencing.

They use different DTOs to provide callers with more or less detail depending on their circumstances. For example, there is a Small response group, a Large response group, and a Medium response group in between.

Having different DTOs is a good technique if, as you say, you don't want a chatty interface.
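Applied to the Customer example from the question, the response-group idea could look roughly like the sketch below. This is not Amazon's actual API; the enum, the three DTO classes, CustomerService and the sample values are all assumptions for illustration.

```csharp
using System;

public enum ResponseGroup { Small, Medium, Large }

// Each level adds fields to the one below it.
public class SmallCustomer
{
    public string CustomerId;
    public string Name;
}

public class MediumCustomer : SmallCustomer
{
    public string Phone;
}

public class LargeCustomer : MediumCustomer
{
    public string Address;
}

public static class CustomerService
{
    // The caller picks how much detail it wants in a single call.
    public static SmallCustomer GetCustomer(string id, ResponseGroup group)
    {
        // Made-up values standing in for a database lookup.
        const string name = "Customer One", phone = "555-0100", address = "1 Main St";

        switch (group)
        {
            case ResponseGroup.Large:
                return new LargeCustomer { CustomerId = id, Name = name, Phone = phone, Address = address };
            case ResponseGroup.Medium:
                return new MediumCustomer { CustomerId = id, Name = name, Phone = phone };
            default:
                return new SmallCustomer { CustomerId = id, Name = name };
        }
    }
}
```

In a WCF contract the derived DTOs would also need to be declared as known types of SmallCustomer so the serializer accepts them.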

RichardOD
+1  A: 

I typically build lazy loading into my complex web services (i.e. web services that send/receive entities). If a Person has a Father property (also a Person), I send just an identifier for the Father instead of the nested object, and I make sure my web service has an operation that can accept an identifier and respond with the corresponding Person entity. The client can then call the web service back if it wants to use the Father property.

I've also expanded on this so that batching can occur. If an operation sends back 5 Persons, then if the Father property is accessed on any one of those Persons, then a request is made for all 5 Fathers with their identifiers. This helps reduce the chattiness of the web service.
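A client-side sketch of how that batching might be wired up. This is a guess at the shape, not the actual implementation: PersonBatch, the fetchByIds delegate (standing in for the web service operation that accepts identifiers), and the field names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id;
    public string Name;
    public int? FatherId;        // sent over the wire instead of a nested Person

    internal PersonBatch Batch;  // the result set this Person arrived in, if any

    // Lazy: the first access triggers a single fetch for the whole batch.
    public Person Father =>
        FatherId.HasValue && Batch != null ? Batch.Resolve(FatherId.Value) : null;
}

public class PersonBatch
{
    private readonly List<Person> people;
    private readonly Func<IReadOnlyCollection<int>, IEnumerable<Person>> fetchByIds;
    private Dictionary<int, Person> fathers; // null until a Father is first accessed

    public PersonBatch(IEnumerable<Person> people,
                       Func<IReadOnlyCollection<int>, IEnumerable<Person>> fetchByIds)
    {
        this.people = people.ToList();
        this.fetchByIds = fetchByIds;
        foreach (var person in this.people) person.Batch = this;
    }

    public IReadOnlyList<Person> People => people;

    internal Person Resolve(int fatherId)
    {
        if (fathers == null)
        {
            // One round trip for every distinct father referenced in the batch.
            var ids = people.Where(p => p.FatherId.HasValue)
                            .Select(p => p.FatherId.Value)
                            .Distinct()
                            .ToList();
            fathers = fetchByIds(ids).ToDictionary(f => f.Id);
        }
        return fathers.TryGetValue(fatherId, out var father) ? father : null;
    }
}
```

With five Persons in a batch, touching Father on any one of them issues a single fetchByIds call, and the other four are then served from the dictionary. Fathers fetched this way are not attached to a batch in this sketch, so their own Father property returns null rather than triggering another fetch.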

Travis Heseman
Note that using this strategy requires acute awareness of entity identity. You may not want multiple instances of the same entity floating around, so you may want to keep a client-side cache of the entities you've received so far and use those rather than going back for another copy.
Travis Heseman