I have models corresponding to database tables. For example, the House class has "color", "price", "square_feet", "real_estate_agent_id" columns.

It is very common for me to want to display the agent name when I display information about a house. As a result, my House class has the following fields:

class House {
  String color;
  Double price;
  Integer squareFeet;
  Integer realEstateAgentId;

  String realEstateAgentName;
}

I've been referring to realEstateAgentName as a virtual field, as it is pulled from a foreign table (join on real_estate_agent_id).

This doesn't feel right to me, as it mixes actual database columns with foreign object's properties. But it's quick, and in many cases it really works out well.

Other times I find myself doing something like this:

class House {
  String color;
  Double price;
  Integer squareFeet;
  Integer realEstateAgentId;

  RealEstateAgent realEstateAgent;
}

As you can see, I'm storing the actual object corresponding to the ID that is stored in the House table.

I tend to make the decision to store the entire object vs some key information associated with the ID (e.g. Name) depending on the likelihood I see of needing to access other information about the object it represents.

I have a few questions:

Of the two methods I've been mixing and matching, which is best? I'm leaning towards storing the ID plus the object, rather than pulling out just the properties of the foreign object that I think I may need. Of the two, this seems more "correct." But it's not perfect: in many cases I have no need to hydrate the entire foreign object, and doing so would waste resources or be infeasible because of the amount of data or the number of joins required. That makes it feel like a poor design choice, because I end up with lots of fields that are null in memory but not in the database, simply because there was no need to populate them, and now I have to keep track of which ones I populated.

But is it best practice to store an ID alongside the object it represents? Should I even be storing the object as a property, or should it live externally in some map, with the ID being the key?

In an Object world it seems like the ID shouldn't even be stored as a property, with the foreign Object it represents being the logical replacement. But with everything being tightly coupled with a relational database it doesn't seem very feasible.

Is this frustrating impurity of my models/classes something I just have to live with, or are there patterns out there that address this by having some kind of fork or parent/child subclassing going on where one is a "pure" object while the other is flat like the database?

EDIT: I am looking for design suggestions here rather than specific ORM frameworks like Hibernate/nHibernate/etc. The particular language I'm working in does not have an ORM solution for my language version that I am satisfied with, and the examples were Java-esque but that's not what my source code is written in.

A: 

LINQ to SQL uses ID + Object and it works out well. I prefer that model as it's most flexible. Hibernate can do the same. One issue you will face is deep loading: when do you actually load the object and not just the ID? Both LINQ to SQL and Hibernate have lazy loading and give you control over this issue.
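Since the question rules out ORM frameworks, here is a minimal hand-rolled sketch of the ID + object idea with lazy loading, in plain Java to match the question's examples. `LazyRef`, its `loader` function and the stand-in lookup are hypothetical names for illustration, not part of LINQ to SQL or Hibernate:

```java
import java.util.function.Function;

public class LazyRefDemo {
    /** Hypothetical holder: keeps the foreign-key ID and loads the object on first use. */
    static final class LazyRef<T> {
        private final int id;
        private final Function<Integer, T> loader; // assumed to wrap a database lookup
        private T value;
        private boolean loaded;

        LazyRef(int id, Function<Integer, T> loader) {
            this.id = id;
            this.loader = loader;
        }

        int id() { return id; }               // always available, no query needed
        boolean isLoaded() { return loaded; } // tells you whether the object was populated

        T get() {                             // the first call triggers the load
            if (!loaded) {
                value = loader.apply(id);
                loaded = true;
            }
            return value;
        }
    }

    static final class RealEstateAgent {
        final int id;
        final String name;
        RealEstateAgent(int id, String name) { this.id = id; this.name = name; }
    }

    static final class House {
        String color;
        Double price;
        Integer squareFeet;
        LazyRef<RealEstateAgent> realEstateAgent; // replaces the separate ID + object fields
    }

    public static void main(String[] args) {
        House house = new House();
        house.color = "blue";
        // Stand-in for a real database lookup.
        house.realEstateAgent = new LazyRef<>(7, id -> new RealEstateAgent(id, "Agent #" + id));

        System.out.println(house.realEstateAgent.id());       // ID only, nothing loaded
        System.out.println(house.realEstateAgent.isLoaded());
        System.out.println(house.realEstateAgent.get().name); // load happens here
    }
}
```

With a holder like this, "not loaded" is tracked explicitly by the reference rather than by a null field, which addresses the bookkeeping concern raised in the question.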

The Entity Framework, on the other hand, aims to give you complete control, letting you decide how the data appears regardless of the physical underpinnings. It has not been fully realized yet, though.

There's really no impurity going on here. The problem is that you're trying to represent relational data in an object-oriented fashion. To get around the pains of developing like this, larger-scale projects are moving to Domain Driven Design, where the underlying data is abstracted out into logical groupings of Repositories. Thinking of tables as classes can be problematic for large-scale solutions.
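As a rough illustration of that separation, the sketch below keeps flat row types that mirror the tables and lets a repository hand out a purpose-built read model for screens that need the agent name. All names here (`HouseRow`, `AgentRow`, `HouseListing`, `HouseRepository`) are hypothetical, the in-memory maps merely stand in for the tables and the join, and records assume Java 16 or later:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class RepositoryDemo {
    // Flat types mirroring the table rows: columns only, no foreign objects.
    record HouseRow(int id, String color, double price, int squareFeet, int realEstateAgentId) {}
    record AgentRow(int id, String name) {}

    // Read model for a view that shows the agent name next to the house.
    record HouseListing(String color, double price, String agentName) {}

    /** Hypothetical repository; a real one would run SQL instead of map lookups. */
    static final class HouseRepository {
        private final Map<Integer, HouseRow> houses = new HashMap<>();
        private final Map<Integer, AgentRow> agents = new HashMap<>();

        void save(AgentRow agent) { agents.put(agent.id(), agent); }
        void save(HouseRow house) { houses.put(house.id(), house); }

        // Flat view: exactly what is stored in the house table.
        Optional<HouseRow> findRow(int houseId) {
            return Optional.ofNullable(houses.get(houseId));
        }

        // Joined view: the equivalent of a join on real_estate_agent_id.
        Optional<HouseListing> findListing(int houseId) {
            return findRow(houseId).map(h -> {
                AgentRow agent = agents.get(h.realEstateAgentId());
                return new HouseListing(h.color(), h.price(), agent == null ? null : agent.name());
            });
        }
    }

    public static void main(String[] args) {
        HouseRepository repo = new HouseRepository();
        repo.save(new AgentRow(7, "Jane Doe"));
        repo.save(new HouseRow(1, "blue", 250000.0, 1800, 7));

        System.out.println(repo.findRow(1).get().realEstateAgentId());
        System.out.println(repo.findListing(1).get().agentName());
    }
}
```

The row type never carries a "virtual" field, and each caller asks the repository for the shape it actually needs.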

Just my 2 cents.

Nissan Fan
A: 

I can speak about Hibernate, because it is the ORM tool I am most familiar with. I believe other ORM tools support similar behaviour to some extent.

Hibernate solves your problem with lazy loading. You add your agent as a property to the house, and by default, when the house object is loaded, the agent is represented by a proxy object generated by Hibernate, which contains only the ID. If you query some other property of the agent, Hibernate loads the full object in the background:

class House {
    Long id;
    String color;
    Double price;
    Integer squareFeet;
    RealEstateAgent realEstateAgent;
    // getters, setters,...
}

House house = (House) session.load(House.class, 123L);
// at this point, house refers to a proxy object created by Hibernate
// in the background - no house or agent data has been loaded from DB
house.getId();
// house still refers to the proxy object
RealEstateAgent agent = house.getRealEstateAgent();
// house is now loaded, but agent not - it refers to a proxy object
String name = agent.getName(); // Now the agent data is loaded from DB

OTOH if you are sure that for a specific class you (almost) always need a specific property, you can specify eager loading in the ORM mapping for that property, in which case the property is loaded together with the containing object. In the mapping you can also specify whether you want a join query or a subselect query.

Péter Török
I've used Hibernate before but it seemed like overkill in many situations. I'm just wondering if there is some simple design that, while not as sophisticated as a full-fledged ORM solution, can bring some logical separation between these two views of the table/object structure.
RenderIn
A: 

Hibernate, the most popular ORM tool in the Java ecosystem, usually allows you to do this:

class House {
 String color;
 Double price;
 Integer squareFeet;
 RealEstateAgent realEstateAgent;
}

This translates to a DB table that looks like this: house(id, color, price, square_feet, real_estate_agent_id)

If you need to print the name of the agent, you just traverse the object graph:

house.getRealEstateAgent().getName()

Through lazy loading, this is done quite efficiently. I wouldn't worry about the fact that an extra query trip to the database may have to be done until your stress tests prove this to be a problem.
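If the extra trips ever do show up in measurements and no ORM is available, one simple mitigation is a small map keyed by ID, so that each agent is fetched at most once per unit of work. The sketch below is only an illustration under that assumption; `IdentityMap` and its `loader` are hypothetical names, and the lambda stands in for a real query:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class IdentityMapDemo {
    /** Hypothetical per-request cache: each ID is looked up at most once. */
    static final class IdentityMap<T> {
        private final Map<Integer, T> cache = new HashMap<>();
        private final Function<Integer, T> loader; // assumed to wrap a database query
        private int loads;                         // counts how often the loader ran

        IdentityMap(Function<Integer, T> loader) { this.loader = loader; }

        T get(int id) {
            T cached = cache.get(id);
            if (cached == null) {        // not seen yet: load and remember it
                cached = loader.apply(id);
                cache.put(id, cached);
                loads++;
            }
            return cached;
        }

        int loads() { return loads; }
    }

    public static void main(String[] args) {
        // Stand-in for a real agent lookup.
        IdentityMap<String> agentNames = new IdentityMap<>(id -> "Agent #" + id);

        // Four houses, but only two distinct agents.
        int[] agentIdsOfHouses = {7, 7, 9, 7};
        for (int agentId : agentIdsOfHouses) {
            System.out.println(agentNames.get(agentId));
        }
        System.out.println("lookups: " + agentNames.loads()); // prints "lookups: 2"
    }
}
```

This also speaks to the question of keeping the foreign object in an external map keyed by ID: the house keeps only the ID, and the map decides when a load is needed.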

Edit after your edit: All the solutions out there have dealt with the paradigm mismatch (between the OO and Relational worlds) in a similar fashion. The designs have been made, the problem is solved. And yes, it remains a pain in the butt to deal with as an application developer but I suppose it is just the way it is as long as we want to use relational databases and object oriented persistence together.

Hans Westerbeek