views:

559

answers:

10

Along the same lines as database normalization: is there an approach to object normalization? Not a design pattern, but the same kind of mathematical approach to normalizing object creation. For example, first normal form: no repeating fields. Here are some links on DB normalization:

http://en.wikipedia.org/wiki/Database_normalization http://databases.about.com/od/specificproducts/a/normalization.htm

Would this make object creation and self-documentation better?

Here's a link to a book about class normalization (guess we're really talking about classes) http://www.agiledata.org/essays/classNormalization.html

+5  A: 

I guess the Single Responsibility Principle is at least related to this. Or at least, violation of the SRP is similar to a lack of normalization in some ways.

(It's possible I'm talking rubbish. I'm pretty tired.)
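A hedged sketch of the analogy (class names are purely illustrative): a class with two responsibilities resembles an unnormalized table mixing two kinds of facts, and "normalizing" it means splitting it.

```python
# SRP violation: one class both formats a report AND sends mail,
# like a table storing two unrelated kinds of facts.
class ReportAndMailer:
    def build(self, rows):
        return "\n".join(rows)

    def send(self, text, addr):
        ...  # pretend this talks to an SMTP server

# "Normalized": one responsibility per class.
class Report:
    def build(self, rows):
        return "\n".join(rows)

class Mailer:
    def send(self, text, addr):
        ...  # same pretend SMTP logic, now isolated
```

Each class can now change for exactly one reason, which is the usual statement of the SRP.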

Jon Skeet
Tired? I thought Jon Skeet does not sleep, he waits.;)
driAn
Oh man, that's dirty
Mark Brittingham
I don't sleep - at least not enough. That's why I'm tired! (Still, it's better than the few months directly after the twins were born...)
Jon Skeet
You have my sympathies; I know how that can be – my two-year-old had a touch of colic when he was an infant. Car rides were the only thing that helped.
Booji Boy
I probably shouldn't admit this but I fell asleep reading C# in Depth this afternoon...
Mark Brittingham
@Mark: I *hope* that's due to excessive tiredness rather than the writing, but apologies otherwise. If it was in the type inference section of chapter 9, I sympathise :)
Jon Skeet
+2  A: 

At first glance, I'd say that the objectives of Code Refactoring are similar in an abstract way to the objectives of normalization. But that's pretty abstract.

Update: I almost wrote earlier that "we need to get Jon Skeet in on this one." I posted my answer and who beat me? You guessed it...

Mark Brittingham
+3  A: 

Perhaps you're taking this from a relational point-of-view, but I would posit that the principles of interfaces and inheritance correspond to normalization in the world of OOP.

For example, a Person abstract class containing FirstName, LastName, Gender and BirthDate can be used by classes such as Employee, User, Member etc. as a valid base class, without a need to repeat the definitions of those attributes in such subclasses.
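The Person example above can be sketched directly; the field names follow the answer, while the subclass-specific fields are my own illustrative additions:

```python
from dataclasses import dataclass
from datetime import date

# Base class holds the shared attributes exactly once.
@dataclass
class Person:
    first_name: str
    last_name: str
    gender: str
    birth_date: date

# Subclasses reuse those definitions instead of repeating them.
@dataclass
class Employee(Person):
    employee_id: int

@dataclass
class Member(Person):
    member_since: date

e = Employee("Ada", "Lovelace", "F", date(1815, 12, 10), employee_id=1)
```

`Employee` never redeclares `first_name` etc., which is the "no repeated definitions" point the answer is making.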

The DRY principle (a core principle of Andy Hunt and Dave Thomas's book The Pragmatic Programmer) and object-oriented programming's constant emphasis on reuse also correspond to the efficiencies offered by normalization in relational databases.

Jon Limjap
I agree with what you are saying about DRY, although maybe a better example could be used. In my opinion it is not a good idea to subclass on the basis of "instance variables" even if there are properties wrapping the internals. The behavior of a class should be extended, not the internal data.
Raymond Roestenburg
+1  A: 

Object Role Modeling (not to be confused with Object Relational Mapping) is the closest thing I know of to normalization for objects. It doesn't have as mathematical a foundation as normalization, but it's a start.

Bill the Lizard
A: 

In a fairly ad-hoc and untutored fashion that will probably cause purists to scoff, and perhaps rightly, I think of a database table as a set of objects of a particular type, and vice versa; then I take my thoughts from there. Viewed this way, there doesn't seem to be anything particularly special you have to do to use normal form in your everyday programming. Each object's identity will do for starters as its primary key, and references (pointers, etc.) will do by way of foreign keys. Then just follow the same rules.

(My objects usually end up in 3NF, or some approximation thereof. I treat this all more as guidelines, and, like I said, "untutored".)
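The identity-as-primary-key, reference-as-foreign-key idea can be sketched as follows (class names are illustrative):

```python
# Each fact lives in exactly one place; other objects hold a
# reference ("foreign key") rather than a copy of the data.
class Department:
    def __init__(self, name):
        self.name = name  # stored once, here only

class Employee:
    def __init__(self, name, department):
        self.name = name
        self.department = department  # reference, not a duplicated name

d = Department("Engineering")
e = Employee("Sam", d)

# Renaming the department is a single update, as in a normalized
# schema; the employee sees the change through the reference.
d.name = "Platform"
```

No duplicated department name means no opportunity for the two copies to disagree, which is exactly the inconsistency normalization guards against.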

If the rules are followed properly, each bit of information then ends up in one place, the interrelationships are clear, and everything is structured such that introducing inconsistencies takes some work. One could say that this approach produces good results on this basis, and I would agree.

One downside is that the end result can feel a bit like a tangle of spaghetti, particularly after some time away, and it's hard to shake the constant lingering sensation, even though it's usually false, that surely a few of all these links could be removed...

brone
A: 

Object-oriented design is rational, but it does not have the same mathematically well-defined basis as the relational model. There is nothing exactly equivalent to the well-defined normal forms of database design.

Whether this is a strength or a weakness of object-oriented design is a matter of interpretation.

Joe Soul-bringer
+5  A: 

Normalization has a mathematical foundation in predicate logic, and a clear and specific goal: that the same piece of information never be represented twice in a single model. The purpose of this goal is to eliminate the possibility of inconsistent information in a data model. It can be shown via mathematical proof that if a data model has certain specific properties (that it passes the tests for 1st Normal Form (1NF), 2NF, 3NF, etc.), it is free from redundant data representation, i.e. it is normalized.

Object orientation has no such underlying mathematical basis, and indeed, no clear and specific goal. It is simply a design idea for introducing more abstraction. The DRY principle, Command-Query Separation, Liskov Substitution Principle, Open-Closed Principle, Tell-Don't-Ask, Dependency Inversion Principle, and other heuristics for improving quality of code (many of which apply to code in general, not just object oriented programs) are not absolute in nature; they are guidelines that programmers have found useful in improving understandability, maintainability, and testability of their code.

With a relational data model, you can say with absolute certainty whether it is "normalized" or not, because it must pass ALL the tests for normal form, and they are quite specific. With an object model, on the other hand, because the goal of "understandable, maintainable, testable, etc" is rather vague, you cannot say with any certainty whether you have met that goal. With many of the design heuristics, you cannot even say for sure whether you have followed them. Have you followed the DRY principle if you're applying patterns to your design? Surely repeated use of a pattern isn't DRY? Furthermore, some of these heuristics or principles aren't always even necessarily good advice all the time. I do try to follow Command-Query Separation, but such useful things as a Stack or a Queue violate that concept in order to give us a rather elegant and useful result.
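The Stack example is easy to make concrete: `pop` is both a command and a query in one call, which violates Command-Query Separation, yet the result is the familiar, useful interface.

```python
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        """Pure command: mutates, returns nothing."""
        self._items.append(item)

    def pop(self):
        """Command AND query: mutates the stack and returns a value,
        violating CQS in exchange for an elegant interface."""
        return self._items.pop()

s = Stack()
s.push(1)
s.push(2)
top = s.pop()  # removes 2 and returns it in a single call
```

A strictly CQS-compliant stack would need a separate `peek()` query and a void `pop()` command, and callers would pay for the split with race-prone two-step usage.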

Chris Teixeira
+2  A: 

Interesting.

You may also be interested in looking at the Law of Demeter.
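A minimal sketch of the Law of Demeter, using a hypothetical Customer/Wallet example (not from the answer): the commented-out chain "reaches through" an intermediate object, while the `pay` method delegates instead.

```python
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def deduct(self, amount):
        self.balance -= amount

class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def pay(self, amount):
        # Demeter-friendly: talk only to your direct collaborator.
        self._wallet.deduct(amount)

c = Customer(Wallet(100))
# c._wallet.deduct(30)   # "train wreck": reaches into Customer's internals
c.pay(30)                # delegates, keeping Wallet an implementation detail
```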

Another thing you may be interested in is c2's FearOfAddingClasses, as, arguably, the same reasoning that led programmers to denormalise databases also leads to god classes and other code smells. For both OO and DB normalisation, we want to decompose everything: for databases this means more tables; for OO, more classes.

Now, it is worth bearing in mind the object-relational impedance mismatch; that is, not everything will necessarily translate cleanly.

Object-relational mappers, or 'persistence layers', usually have one-to-one mappings between object attributes and database fields. So can we normalise? Say we have a Department object with employee1, employee2, ... attributes. Obviously that should be replaced with a list of employees, so we can say 1NF works.
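The 1NF step can be sketched directly (class names are illustrative):

```python
# "Un-normalized": a fixed, repeating group of employeeN fields,
# the object-world analogue of repeating columns in a table.
class DepartmentDenormalized:
    def __init__(self):
        self.employee1 = None
        self.employee2 = None  # ...and so on, up to some arbitrary limit

# "1NF": the repeating group collapses into a single collection.
class Department:
    def __init__(self):
        self.employees = []

d = Department()
d.employees.extend(["Alice", "Bob", "Carol"])  # no arbitrary limit
```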

With that in mind, let's go straight for the kill and look at 6NF database design. A good example is anchor modelling (ignore the naming convention). Anchor modelling/6NF provides highly decomposed and flexible database schemas; how does this translate to OO 'normalisation'?

Anchor modelling has these kinds of relationships:

  • Anchors - unique object IDs.
  • Attributes, which translate to object attributes: (Anchor, value, metadata).
  • Ties - relationships between two or more objects (themselves anchors): (Anchor, Anchor, ..., metadata).
  • Knots - attributed Ties.

Attribute metadata can be anything - who changed an attribute, when, why, etc.

The OO translation of this looks extremely flexible:

  • Anchors suggest attribute-less placeholders, like a proxy which knows how to deal with the attribute composition.
  • Attributes suggest classes representing attributes and what they belong to. This suggests applying reuse to how attributes are looked up and dealt with, e.g. automatic constraint checking. From this we have a basis to generically implement the GoF-style structural patterns.
  • Ties and Knots suggest classes representing relationships between objects. A basis for generic implementation of the behavioural design patterns?

Interesting and desirable properties of anchor modelling that also translate across are:

  • All this requires replacing inheritance with composition (good) in the exposed objects.
  • Attributes have owners, rather than owners having attributes. Although this makes attribute lookup more complex, it neatly solves certain aliasing problems, as there can only ever be one owner.
  • No need for Null. This translates to clearer null handling: empty-case attribute classes could provide methods for handling the lack of a particular attribute, instead of performing null-checking everywhere.
  • Attribute metadata. Attribute-level full historisation and blaming: 'play' objects back in time, see what changed, when and why, etc. (if required - metadata is entirely optional).

There would probably be a lot of very simple classes (which is good), and a very declarative programming style (also good).
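A loose sketch of the anchor-modelling vocabulary as objects; all class and field names here are my own illustration of the (Anchor, value, metadata) shapes described above, not part of any anchor-modelling standard:

```python
from dataclasses import dataclass, field
import itertools

_ids = itertools.count(1)  # stand-in for a surrogate-key generator

@dataclass(frozen=True)
class Anchor:
    """Just an identity: no attributes of its own."""
    id: int = field(default_factory=lambda: next(_ids))

@dataclass
class Attribute:
    """(Anchor, value, metadata): the attribute owns the link to its
    owner, not the other way around."""
    anchor: Anchor
    value: object
    metadata: dict

@dataclass
class Tie:
    """A relationship between two or more anchors, plus metadata."""
    anchors: tuple
    metadata: dict

dept = Anchor()
emp = Anchor()
name = Attribute(emp, "Alice", {"changed_by": "import", "when": "2009-01-01"})
works_in = Tie((emp, dept), {"since": "2009"})
```

Note how historisation falls out for free: recording a new `Attribute` for the same anchor, rather than overwriting `value`, gives the "play objects back in time" behaviour mentioned above.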

Thanks for such a thought-provoking question; I hope this is useful for you.

A: 

I second the SRP. The Open-Closed Principle applies to "normalization" as well, although that may stretch the meaning of the word: it should be possible to extend the system by adding new implementations, without modifying the existing code. objectmentor about OCP
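A minimal Open-Closed sketch (names are illustrative): new behaviour arrives as a new subclass; the existing classes and the calling code are never edited.

```python
from abc import ABC, abstractmethod
import json

class Exporter(ABC):
    @abstractmethod
    def export(self, data) -> str: ...

class CsvExporter(Exporter):
    def export(self, data):
        return ",".join(map(str, data))

# Extension: JSON support is added without touching the classes above.
class JsonExporter(Exporter):
    def export(self, data):
        return json.dumps(data)

def run(exporter: Exporter, data):
    # Closed for modification: works with any future Exporter subclass.
    return exporter.export(data)
```

The `run` function is the "closed" part; the `Exporter` hierarchy is the "open" part.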

Raymond Roestenburg
A: 

Good question; sorry I can't answer in depth.

I've been working on object normalization off and on for over 20 years. It's deep, complicated, and beautiful, and is the subject of my second planned book, Object Mechanics II. ONF = Object Normal Form: you heard it here first! ;-)

Since potentially patentable technology lurks within, I am not at liberty to say more, except that normalizing the data is the really easy part ;-)

Steven A. Lowe