views:

363

answers:

9

How would one structure a table for an entity that can have a one to many relationship to itself? Specifically, I'm working on an app to track animal breeding. Each animal has an ID; it's also got a sire ID and a dame ID. So it's possible to have a one to many from the sire or dame to its offspring. I would be inclined to something like this:

ID INT NOT NULL PRIMARY KEY
SIRE_ID INT 
DAME_ID INT

and record a null value for those animals which were purchased and added to the breeding stock and an ID in the table for the rest.

So:

  1. Can someone point me to an article/web page that discusses modeling this sort of relationship?
  2. Should the ID be an INT or some sort of String? A NULL in the INT would indicate that the animal has no parents in the database but a String with special flag values could be used to indicate the same thing.
  3. Would this possibly be best modeled via two tables? I mean one table for the animals and a separate table solely indicating kinship e. g.:

    Animal

    ID INT NOT NULL PRIMARY KEY

    Kinship

    ID INT NOT NULL PRIMARY KEY FOREIGN KEY

    SIRE_ID INT PRIMARY KEY FOREIGN KEY

    DAME_ID INT PRIMARY KEY FOREIGN KEY

I apologize for the above: my SQL is rusty. I hope it sort of conveys what I'm thinking about.

+4  A: 

Well, this is a "normal" one-to-many relationship and the method you suggest is the classical one for solving it.

Note that two tables are denormalized (I can't point out exactly where the superkey-is-not-well-should-be-subset-of-other-key-fsck-I-forgot part is, but I'm pretty sure it's there somewhere); the intuitive reason is that a tuple in the first one matches at most a tuple in the second one, so unless you have lots of animals with null sire and dame IDs, it's not a good solution in any prospect (it worsens performance -- need a join -- and does not reduce storage requirements).

millenomi
Thanks so much--this is exactly what I was hoping to get.
Onorio Catenacci
normalized forms, at least 3 first of them, talk about using the correct table and the correct pk.a) avoiding merging a subset of columns with their own functional dependency into a bigger table (2NF) b) avoiding transitive relationship with pk. note that here it's not the case so it's perfect.
helios
I'm pretty sure this counts as a degenerate case of transitivity.
millenomi
+1  A: 

INT is the better choice for the ID column and better suited if you should use a sequence to generate the unique IDs.

I don't see any benefit in splitting the design into two tables.

Lost in Alabama
The benefit in splitting the design into two tables is simply to normalise the data structure. It usually allows optimal use of SQL functionalities.
Philippe Grondier
+3  A: 

I asked a similar question a number of months ago on the MySQL website. I would recommend that you take a look at the response that I received from Peter Brawley regarding this type of relationship: http://forums.mysql.com/read.php?135,187196,187196#msg-187196

If you want to research the topic further then I would recommend that you look into Tree Hierarchies on Wikipedia.

An alternate suggested architecture (that would be fully normalized) would look something like the following:

Table: animal

ID | Name | Breed

Table: pedigree

animal_id | parent_id | parentType (either sire or dame)

Noah Goodrich
@gabriel1836, Thanks for the link--that's exactly the sort of thing I was looking for.
Onorio Catenacci
Table pedigree can also old other interesting data such as inseminationDate, etc ...
Philippe Grondier
A: 

Use the "connect by" clause with SQL to tell it which hierarchy to follow.

dacracot
A: 

It's not really a one to many relationship, unless an animal can have many parents.

I would leave it as a single table with the unique key ID for the animal, one int field for each of the parents, and probably a text field to use for general notes about the animal, like where it was purchased if that's the case.

You're looking at it the wrong way. One parent to many offspring. :-)
Onorio Catenacci
A: 

I think that since it is clear that an animal only has one sire and one dam, that using a single table would make the most sense. My preference is to use int or bigint as the row identifier, with a null value signifying no relationship. I would probably, then, to use some other method to uniquely identify animals so they don't end up in the table twice and create a unique index on that column as well.

tvanfosson
+2  A: 

I think your layout using just one table is fine. You definitely want to keep SIRE_ID and DAME_ID in the same data type as ID. You also want to declare them as FOREIGN KEYs (it is possible to have a foreign key point back to the same table, and a foreign key can also be null).

ID INT NOT NULL PRIMARY KEY
SIRE_ID INT REFERENCES TABLENAME (ID)
DAME_ID INT REFERENCES TABLENAME (ID)

Using this layout, you can easily look up the parent animals, and you could also build an offspring tree for a given animal (for Oracle there is CONNECT BY)

Thilo
A: 

Seems like you want to build something like a tree.

What about something like?:

 ID          Primary Key,
 Parent_ID   Foreing_Key
 ( data )

There are some functionality for doing querys in tables with relations to themselves. See the syntax of Connect By: http://www.adp-gmbh.ch/ora/sql/connect_by.html

borjab
note that there are two parents, though, DAME and SIRE
Thilo
+1  A: 

I don't know about animal breeding, but it sounds like your Sire_ID is the father and Dame_ID is the mother? No problem. One row per animal, null sire_ and dame_ID's for purchased animals, I don't forsee any problems.

[ID],[Sire_ID],[Dame_ID];
0,null,null  (male)
1,null,null  (female)
2,null,null  (female)
3,0,1 (male)
4,0,2 (male)
5,null,null  (female)
6,3,5
7,4,5

and so forth. You would likely populate a TreeView or XmlNodeList in a while loop...

While (myAnimal.HasChildren) {
 Animal[] children = GetChildren(Animal.ID)
 for (int x=0; x<children.length; x++) 
  myAnimal.Children.Add(children[x]);
}

In this case, Animal.Children is a Collection of Animals. Therefore, myAnimal.Children[0].Father would return myAnimal. .Parent[] could be a collection of its two parents, which should work as long as [0] is always one parent (father) and [1] is always the other (mother).

Make ID an Autonumber PK and assign Sire_ID and Dame_ID programatically by returning the IDs of its parents. No foreign key relationships should be neccessary though both parent IDs could reference back to ID if you really want to.

tsilb