ansaurus

Question

What is the proper object relationship? (C#)

Answer 1

+1 A:

If I understand your question correctly, I think a solution to your problem might be an OR mapper. Microsoft provides two OR mappers at the moment, LINQ to SQL and Entity Framework. If you are using .NET 3.5, I recommend using LINQ to SQL, but if you are able to experiment with .NET 4.0, I would highly recommend looking into Entity Framework. (I discourage the use of Entity Framework in .NET 3.5, as it was released very prematurely and has a LOT of problems.)

Both of these OR mappers provide visual modeling tools that allow you to build a conceptual entity model. With LINQ to SQL, you can generate a model from your database, which will provide you with entity classes, as well as associations between those classes (representing the foreign keys from your DB schema). The LINQ to SQL framework will handle generating SQL queries for you, and will automatically map database query results into object graphs. Relationships such as the one you described, with multiple customers in a set referencing the same single department, are handled automatically for you; you don't need to worry about them at all. You also have the ability to query your database using LINQ, and can avoid having to write a significant number of stored procedures and a lot of plumbing/mapping code.

If you use .NET 4.0, Entity Framework is literally LINQ to SQL on steroids. It supports everything LINQ to SQL does, and a hell of a lot more. It supports model-driven design, allowing you to build a conceptual model from which code AND database schema are generated. It supports a much wider variety of mappings, providing a much more flexible platform. It also provides Entity SQL (eSQL), which is a text-based query language that can be used to query the model in addition to LINQ to Entities. Like LINQ to SQL, it will solve the scenario you used as an example, as well as many others.

OR mappers can be a huge time, money, and effort saver, greatly reducing the amount of effort required to interact with a relational database. They provide both dynamic querying as well as dynamic, optimistic updates/inserts/deletes with conflict resolution.
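To make "handled automatically" a little more concrete: an OR mapper typically keeps an identity map while it turns rows into objects, so every customer row that points at the same department ID ends up referencing one shared object. The following is a hand-rolled sketch of that idea only. It is not how LINQ to SQL or Entity Framework are implemented, and Department, Customer, CustomerRow and GraphBuilder are hypothetical names invented for the illustration.

```csharp
using System.Collections.Generic;

// Hypothetical entity classes; not generated by any real OR mapper.
public class Department
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public Department Department { get; set; }
}

// A flat row as it might come back from a joined query (assumed shape).
public class CustomerRow
{
    public string CustomerName { get; set; }
    public int DepartmentId { get; set; }
    public string DepartmentName { get; set; }
}

public static class GraphBuilder
{
    // Turns flat rows into an object graph, reusing one Department per ID.
    public static List<Customer> Build(IEnumerable<CustomerRow> rows)
    {
        // The identity map: department ID -> the single in-memory instance.
        var departmentsById = new Dictionary<int, Department>();
        var customers = new List<Customer>();

        foreach (CustomerRow row in rows)
        {
            Department department;
            if (!departmentsById.TryGetValue(row.DepartmentId, out department))
            {
                department = new Department { Id = row.DepartmentId, Name = row.DepartmentName };
                departmentsById.Add(department.Id, department);
            }
            customers.Add(new Customer { Name = row.CustomerName, Department = department });
        }
        return customers;
    }
}
```

Two customers built from rows with the same DepartmentId share one Department instance, so a change made through one customer is visible through the other.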

jrista 2009-11-25 01:56:08

Linq to SQL should only be recommended when you know SQL is in use. Additionally it doesn't seem to answer the fundamental question of "How do I track it all in memory so that I don't have to go through the customer object to get to depots [and vice versa]?"

Jason D 2009-12-01 05:45:25

@Jason D: L2S was only one of the options. The general gist of the post was to use an OR mapper, which absolutely solves the problem of the question posed, which was tracking object instances such that each customer object did not reference its own copy of the depot entity, using a shared instance. There are a variety of manual solutions to the problem, but they require CONSIDERABLE effort to implement and maintain. An ORM is the most cost effective answer that won't break single responsibility or separation of concerns. L2S and EF are readily available, free, and easy to use.

jrista 2009-12-01 06:00:17

It does not necessarily have to be L2S or EF either. I probably should have mentioned NHibernate, SubSonic, Telerik OpenAccess. There are other ORM options that generally do the same as L2S (or EF in the future), although not quite as simply and powerfully. An ORM takes care of mapping, separation of domain and persistence concerns, lazy loading (or explicit load on demand if you prefer), state tracking, and SQL generation for both retrieval and updates. Memory is utilized efficiently without duplicating instances of objects. In the case of L2S and EF v4, efficiency of SQL is very high.

jrista 2009-12-01 06:04:04

Answer 2

+1 A:

This sounds like you've got a Many-to-many relationship going on. (Customers know about their Depots, and vice versa)

Ideally this seems best suited for a database application where you define a weak-entity table ... Of course using a database is overkill if we're talking about 10 Customers and 10 Depots...

Assuming a database is overkill, this can be modeled in code with some Dictionaries. Assuming you're using int for the unique identifiers of both Depot and Customer, you could create something like the following:

// creating a derived class for readability.
public class DepotIDToListOfCustomerIDs : Dictionary<int,List<int>> {}
public class CustomerIDToListOfDepotIDs : Dictionary<int,List<int>> {}
public class DepotIDToDepotObject : Dictionary<int,Depot>{}
public class CustomerIDToCustomerObject : Dictionary<int, Customer>{}
//...
// class scope for a class that manages all these objects...
DepotIDToListOfCustomerIDs _d2cl = new DepotIDToListOfCustomerIDs();
CustomerIDToListOfDepotIDs _c2dl = new CustomerIDToListOfDepotIDs();
DepotIDToDepotObject _d2do = new DepotIDToDepotObject();
CustomerIDToCustomerObject _c2co = new CustomerIDToCustomerObject();
//...
// Populate all the lists with the cross referenced info.
//...
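// in a method that needs to build a list of depots for a given customer
// param: Customer c
// ds is declared outside the if block so it is still usable afterwards
List<Depot> ds = new List<Depot>();
List<int> dids;
if (_c2dl.TryGetValue(c.ID, out dids))
{
    foreach (int did in dids)
    {
        // TryGetValue does the existence check and the lookup in one call
        Depot d;
        if (_d2do.TryGetValue(did, out d))
            ds.Add(d);
    }
}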
// building the list of customers for a Depot would be similar to the above code.
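Since both directions follow the same pattern, the lookup can be written once and reused. The helper below is only a sketch of that idea; CrossReference and Resolve are hypothetical names that do not appear in the code above, and it assumes the same Dictionary<int, List<int>> and Dictionary<int, T> shapes.

```csharp
using System.Collections.Generic;

public static class CrossReference
{
    // Looks up the IDs linked to 'id', then resolves each one to its object.
    // IDs with no matching object are skipped, mirroring the checks above.
    public static List<T> Resolve<T>(
        Dictionary<int, List<int>> links,
        Dictionary<int, T> objectsById,
        int id)
    {
        var result = new List<T>();

        List<int> linkedIds;
        if (!links.TryGetValue(id, out linkedIds))
            return result; // nothing linked to this ID

        foreach (int linkedId in linkedIds)
        {
            T item;
            if (objectsById.TryGetValue(linkedId, out item))
                result.Add(item);
        }
        return result;
    }
}
```

With the fields declared earlier, the customers for a depot would then be something like CrossReference.Resolve<Customer>(_d2cl, _c2co, depotId), and the depots for a customer CrossReference.Resolve<Depot>(_c2dl, _d2do, customerId).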

EDIT 1: note that I've crafted the code above to avoid circular references. Having a customer reference a depot that also references that same customer will prevent these from being quickly garbage collected. If these objects will persist for the entirety of the application's lifespan, a simpler approach certainly could be taken. In that approach you'd have two lists: one of Customer instances, the other of Depot instances. The Customer and Depot would contain lists of Depots and Customers respectively. However, you will still need two dictionaries in order to resolve the Depot IDs for the customers, and vice versa. The resulting code would be 99% the same as the above.

EDIT 2: As is outlined in other replies, you can (and should) have an object broker model that makes the relationships and answers questions about the relationships. For those who have misread my code, it is by no means intended to craft the absolute and full object model for this situation. However, it is intended to illustrate how the object broker would manage these relationships in a manner that prevents circular references. You have my apologies for the confusion it caused on the first go-around. And my thanks for illustrating a good OO presentation that would be readily consumed by others.

Jason D 2009-11-25 03:43:47

OO systems have no trouble implementing many-many relationships. Your answer is good up until the code.

Robert Paulson 2009-11-25 20:49:18

Robert, could you be a bit more precise please?

Jason D 2009-11-26 01:52:33

You can change it such that Customers and Depots are constructed through static methods only, which can guarantee singleton semantics about the unique ID. That way you don't need strange sets of Dictionaries, which is what @Robert may be pointing out.

sixlettervariables 2009-11-26 02:06:09

Sixlet: So, internally, what would the static methods be using to map the Depot and Customer IDs to a specific instance of their respective objects -- thereby meeting the requirement that multiple instances aren't created?

Jason D 2009-11-26 02:48:29

Additionally sixlet, how would you prevent circular object references with the technique you're proposing?

Jason D 2009-11-26 02:51:34

Your answer makes it seem m:n relationships are hard in OO. They're not. My point RE code was that the (premature) optimisations you have attempted are tangential to the discussion at hand. OP's question was not "What are possible techniques to avoid circular references and prevent premature garbage collection". I realize SO is terrible for more than 5 lines of code, but: 1) the code quality isn't great 2) your code really detracted from your other comments and 3) what the code was doing didn't flow logically from the text you had already written.

Robert Paulson 2009-11-26 23:56:56

Answer 3

A:

In reply to @Jason D, and for the sake of @Nitax: I'm really skimming the surface, because while it's basically easy, it also can get complicated. There's no way I'm going to re-write it better than Martin Fowler either (certainly not in 10 minutes).

You first have to sort out the issue of only 1 object in memory that refers to a specific depot. We'll achieve that with something called a Repository. CustomerRepository has a GetCustomer() method, and the DepotRepository has a GetDepot() method. I'm going to wave my hands and pretend that just happens.

Second, you need to write some tests that indicate how you want the code to work. I can't know that, but bear with me anyways.

// sample code for how we access customers and depots
Customer customer = Repositories.CustomerRepository.GetCustomer("Bob");
Depot depot = Repositories.DepotRepository.GetDepot("Texas SW 17");
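For anyone wondering what the hand-waving hides, here is one possible minimal shape for the depot side. It is a sketch under assumptions: the Depot constructor, the name-keyed dictionary, and creating the object with 'new' instead of loading it from storage are all invented for illustration, and it is not thread-safe.

```csharp
using System.Collections.Generic;

public class Depot
{
    public Depot(string name) { Name = name; }
    public string Name { get; private set; }
}

public class DepotRepository
{
    // One entry per depot name, so repeated requests return the same instance.
    private readonly Dictionary<string, Depot> _depotsByName =
        new Dictionary<string, Depot>();

    // Returns the single in-memory Depot for this name.
    // Assumption: a real repository would load it from storage on a cache miss.
    public Depot GetDepot(string name)
    {
        Depot depot;
        if (!_depotsByName.TryGetValue(name, out depot))
        {
            depot = new Depot(name);
            _depotsByName.Add(name, depot);
        }
        return depot;
    }
}

public static class Repositories
{
    public static readonly DepotRepository DepotRepository = new DepotRepository();
}
```

Calling Repositories.DepotRepository.GetDepot("Texas SW 17") twice hands back the same object both times, which is the "only 1 object in memory" guarantee. A CustomerRepository would follow the same pattern.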

Now the hard part here is: How do you want to model the relationship? In OO systems you don't really have to do anything. In C# I could just do the following.

Customers keep a list of the depots they are with

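class Customer
{
    // backing list of the depots this customer is with
    private readonly IList<Depot> _depotList = new List<Depot>();
    public IList<Depot> Depots { get { return _depotList; } }
}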

alternatively, Depots keep a list of the customers they are with

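class Depot
{
    // backing list of the customers this depot is with
    private readonly IList<Customer> _customerList = new List<Customer>();
    public IList<Customer> Customers { get { return _customerList; } }
}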
// * code is very brief to illustrate.

In its most basic form, any number of Customers can refer to any number of Depots. m:n solved. References are cheap in OO.

Mind you, the problem we hit is that while the Customer can keep a list of references to all the depots it cares about (first example), there's not an easy way for the Depot to enumerate all the Customers.

To get a list of all Customers for a Depot (first example) we have to write code that iterates over all customers and checks the customer.Depots property:

List<Customer> CustomersForDepot(Depot depot)
{
    List<Customer> allCustomers = Repositories.CustomerRepository.AllCustomers();
    List<Customer> customersForDepot = new List<Customer>();

    foreach( Customer customer in allCustomers )
    {
        if( customer.Depots.Contains(depot) )
        {
            customersForDepot.Add(customer);
        }
    }
    return customersForDepot;
}

If we were using Linq, we could write it as

var depotQuery = from o in allCustomers
                 where o.Depots.Contains(depot)
                 select o;

return depotQuery.ToList();

Have 10,000,000 Customers stored in a database? Ouch! You really don't want to have to load all 10,000,000 customers each time a Depot needs to determine its customers. On the other hand, if you only have 10 Depots, a query loading all Depots once in a while isn't a big deal. You should always think about your data and your data access strategy.

We could have the list in both Customer and Depot. When we do that we have to be careful about the implementation. When adding or removing an association, we need to make the change to both lists at once. Otherwise we have customers thinking they are associated with a depot, but the depot doesn't know anything about the customer.
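One way to be careful is to make each side update the other, so callers can never change only one list. This is just a sketch: the AddDepot/AddCustomer and RemoveDepot/RemoveCustomer methods are names made up for the example, and the lists are exposed read-only so that all changes go through them.

```csharp
using System.Collections.Generic;

public class Customer
{
    private readonly List<Depot> _depots = new List<Depot>();

    public IEnumerable<Depot> Depots { get { return _depots; } }

    public void AddDepot(Depot depot)
    {
        if (_depots.Contains(depot)) return; // already linked; also ends the mutual call
        _depots.Add(depot);
        depot.AddCustomer(this);             // keep the other side in step
    }

    public void RemoveDepot(Depot depot)
    {
        if (!_depots.Remove(depot)) return;  // was not linked; also ends the mutual call
        depot.RemoveCustomer(this);
    }
}

public class Depot
{
    private readonly List<Customer> _customers = new List<Customer>();

    public IEnumerable<Customer> Customers { get { return _customers; } }

    public void AddCustomer(Customer customer)
    {
        if (_customers.Contains(customer)) return;
        _customers.Add(customer);
        customer.AddDepot(this);
    }

    public void RemoveCustomer(Customer customer)
    {
        if (!_customers.Remove(customer)) return;
        customer.RemoveDepot(this);
    }
}
```

It does not matter which side you start from: customer.AddDepot(depot) and depot.AddCustomer(customer) both leave the two lists agreeing with each other.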

If we don't like that, and decide we don't really need to couple the objects so tightly, we can remove the explicit Lists and introduce a third object that is just the relationship (and also include another repository).

class CustomerDepotAssociation
{
    // each association pairs exactly one customer with one depot
    public Customer Customer { get; }
    public Depot Depot { get; }
}

class CustomerDepotAssociationRepository
{
    IList<Customer> GetCustomersFor(Depot depot) ...
    IList<Depot> GetDepotsFor(Customer customer) ...
    void Associate(Depot depot, Customer customer) ...
    void DeAssociate(Depot depot, Customer customer) ...
}

It's yet another alternative. The repository for the association doesn't need to expose how it associates Customers to Depots (and by the way, from what I can tell, this is what @Jason D's code is attempting to do)

I might prefer the separate object in this instance because what we're saying is the association of Customer and Depot is an entity unto itself.
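To give a feel for what could sit behind that repository, here is a minimal in-memory sketch. The two dictionaries of hash sets are my assumption about one reasonable internal layout (the whole point is that callers never see it), the Customer and Depot classes are stubs, and nothing here is persisted.

```csharp
using System.Collections.Generic;

public class Customer { public string Name { get; set; } }
public class Depot { public string Name { get; set; } }

public class CustomerDepotAssociationRepository
{
    // Two indexes over the same set of associations, one per lookup direction.
    private readonly Dictionary<Customer, HashSet<Depot>> _depotsByCustomer =
        new Dictionary<Customer, HashSet<Depot>>();
    private readonly Dictionary<Depot, HashSet<Customer>> _customersByDepot =
        new Dictionary<Depot, HashSet<Customer>>();

    public IList<Customer> GetCustomersFor(Depot depot)
    {
        HashSet<Customer> customers;
        return _customersByDepot.TryGetValue(depot, out customers)
            ? new List<Customer>(customers)   // copy, so callers cannot edit the index
            : new List<Customer>();
    }

    public IList<Depot> GetDepotsFor(Customer customer)
    {
        HashSet<Depot> depots;
        return _depotsByCustomer.TryGetValue(customer, out depots)
            ? new List<Depot>(depots)
            : new List<Depot>();
    }

    public void Associate(Depot depot, Customer customer)
    {
        // Both indexes are updated together, so they cannot drift apart.
        GetOrAdd(_depotsByCustomer, customer).Add(depot);
        GetOrAdd(_customersByDepot, depot).Add(customer);
    }

    public void DeAssociate(Depot depot, Customer customer)
    {
        HashSet<Depot> depots;
        if (_depotsByCustomer.TryGetValue(customer, out depots)) depots.Remove(depot);
        HashSet<Customer> customers;
        if (_customersByDepot.TryGetValue(depot, out customers)) customers.Remove(customer);
    }

    private static HashSet<TValue> GetOrAdd<TKey, TValue>(
        Dictionary<TKey, HashSet<TValue>> map, TKey key)
    {
        HashSet<TValue> set;
        if (!map.TryGetValue(key, out set))
        {
            set = new HashSet<TValue>();
            map.Add(key, set);
        }
        return set;
    }
}
```

Neither Customer nor Depot holds a reference to the other here, and lookups in both directions avoid scanning every customer. If the data lives in a database, the same interface could instead be backed by a query against the join table.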

So go ahead and read some Domain-Driven Design books, and also buy Martin Fowler's PoEAA (Patterns of Enterprise Application Architecture).

Robert Paulson 2009-11-27 00:56:00

At its core you'll have a dictionary (or hashtable) managing the relationships. But in light of what the OP was asking it seemed to me to be more a question of "what objects are good for managing m:n relationships;" not "how do I veil the internals of the relationship manager so that they're easy to use." (But it is a good question and you gave a good answer...)

Jason D 2009-11-27 01:26:44

One note about using linq: it generally has performance issues. Many people have posted articles about it. I initially thought to post a linq answer but quickly realized I'd be telling people to do what's currently biting us in the hind end at my job...

Jason D 2009-11-27 01:32:01

Yes Linq (to Sql) can have performance issues, but I wouldn't dismiss it out of hand. We use it all the time, and yes you have to be careful because, just like sql, you can write some shockingly performance poor queries. Linq itself though is fantastic.

Robert Paulson 2009-11-27 01:47:16

@Robert, For the situation described, unless the dataset is small I would advise against using it... Learn SQL and learn to write it correctly. You'll do more to improve performance that way. Auto-generated high-level language code has always suffered from performance issues compared to hand-written code. The same is true of assembly. The reason we don't code in assembly is the orders-of-magnitude increase in processor power. Until network throughput and processor power increase a few orders of magnitude, auto-generated high-level code will suffer too.

Jason D 2009-12-12 14:08:25

@Jason, you're prematurely optimising, and while your concerns are valid, they are at this stage jumping the gun. OP was asking how to model relationships in code, which is what I answered. I attempted to answer in such a way that OP thinks about how they implement, because data access pattern is usually the driver. That said, nothing I have written, aside from showing the linq query, necessarily uses autogenerated code, so I don't know why that is a sore point with you. By all means OP should write unit tests for performance and optimise where necessary.

Robert Paulson 2009-12-13 02:02:04

@Robert, It's a sore point because autogenerated code has bitten me in the arse too many times. It's always been done where someone didn't bother to find out what the potential limitations were, nor did they investigate the complexity of such code. When questioned these people cite help websites such as this one that did it and assumed it was safe under all circumstances because it didn't come with a disclaimer. It's a sore point because those who understand its pitfalls usually are the ones to clean up after those who don't.

Jason D 2009-12-13 21:03:20

@Jason. I agree with your comments RE copy/paste code and the tautological statement of people who do things without understanding what they do regularly leads to inefficiencies. However it still has little bearing on my answer! You have your answer, I have mine, and this ongoing dialog isn't productive in the slightest. If you want to express your ideas re autogenerated code and your issues with linq, create a new answer (or comment to @jrista instead).

Robert Paulson 2009-12-13 22:09:02

Answer 4

A:

OO: (diagram image, not available)

ER: (diagram image, not available)

Damir Sudarevic 2009-11-28 19:35:33
