I've got a table of people - an ID primary key and a name. In my application, people can have 0 or more real-world relationships with other people, so Jack might "work for" Jane and Tom might "replace" Tony and Bob might "be an employee of" Rob and Bob might also "be married to" Mary.

What's the best way to represent this in the database? A many to many intersect table? A series of self joins? A relationship table with one row per relationship pair and type, where I insert records for the relationship in both directions?

+2  A: 

Create a separate many-to-many table for each type of relationship.

If you try to represent multiple types of relationships in a single many-to-many table, that's a violation of Fourth Normal Form.
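As a rough illustration of one table per relationship type, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`replaces`, `marriage`, `spouse1_id`, and so on) are assumptions made up for this example, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

-- Directed, many-to-many: one person may replace several others.
CREATE TABLE replaces (
    replacement_id INTEGER NOT NULL REFERENCES person(id),
    replaced_id    INTEGER NOT NULL REFERENCES person(id),
    PRIMARY KEY (replacement_id, replaced_id)
);

-- Symmetric: one row per couple, lower id first (hypothetical convention).
-- UNIQUE stops a person appearing twice in the same column; it does not by
-- itself stop a person appearing once in each column (that needs a trigger).
CREATE TABLE marriage (
    spouse1_id INTEGER NOT NULL UNIQUE REFERENCES person(id),
    spouse2_id INTEGER NOT NULL UNIQUE REFERENCES person(id),
    CHECK (spouse1_id < spouse2_id)
);
""")

conn.executemany("INSERT INTO person (id, name) VALUES (?, ?)",
                 [(1, "Bob"), (2, "Mary"), (3, "Tom"), (4, "Tony")])
conn.execute("INSERT INTO marriage VALUES (1, 2)")   # Bob is married to Mary
conn.execute("INSERT INTO replaces VALUES (3, 4)")   # Tom replaces Tony
conn.execute("INSERT INTO replaces VALUES (3, 1)")   # Tom also replaces Bob

def try_second_marriage():
    """Return True if a second marriage row for Bob is accepted."""
    try:
        conn.execute("INSERT INTO marriage VALUES (1, 3)")
        return True
    except sqlite3.IntegrityError:
        return False
```

Each table can carry its own constraints, which is the main thing a single shared table cannot easily do.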


Re comments:

Actually the violation of 4NF would be something like this:

Person1 Person2 Is_Employer Is_Teacher Is_Father
Tom     John    No          No         Yes

If you have a three-column table that lists two people and a relationship type, it's better, but you still have a problem with reciprocal relationships.

Person1 Person2 Rel_type
John    Ann     married

Some people get confused about whether to store two rows, or else store the two people in some kind of consistent order (e.g. lower ID value first). But then there are relationships that are directed, like "employer" where the order means something. And there are relationships with multiple people, like "siblings."
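To make the "consistent order" option concrete, here is a small sketch in Python with sqlite3. It assumes integer person IDs and a hypothetical `SYMMETRIC` set naming the relationship types whose order carries no meaning; directed types are stored exactly as given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE relationship (
    person1  INTEGER NOT NULL,
    person2  INTEGER NOT NULL,
    rel_type TEXT    NOT NULL,
    PRIMARY KEY (person1, person2, rel_type)
)""")

SYMMETRIC = {"married"}   # hypothetical: types where order carries no meaning

def add_relationship(a, b, rel_type):
    """Store one row; for symmetric types put the lower ID first."""
    if rel_type in SYMMETRIC and a > b:
        a, b = b, a
    conn.execute("INSERT OR IGNORE INTO relationship VALUES (?, ?, ?)",
                 (a, b, rel_type))

add_relationship(7, 3, "married")    # stored as (3, 7)
add_relationship(3, 7, "married")    # same couple, so the insert is ignored
add_relationship(7, 3, "employer")   # directed: stored as given, (7, 3)

rows = conn.execute(
    "SELECT person1, person2, rel_type FROM relationship ORDER BY rel_type"
).fetchall()
```

This handles pairs only; it does nothing for relationships with more than two members.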

So another way to organize these relationships would be to create a table listing groups, one group per row, and then another table listing people in that group.

Group Rel_type    Group Person
123   siblings    123   Bobby
                  123   Peter
                  123   Greg
                  123   Cindy
                  123   Jan
                  123   Marsha

This works best for relationships that have variable numbers of members, and are reciprocal relationships. Members of a sports team is another example. It's essentially a many-to-many table between the group and the people.
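A minimal sketch of the two group tables in Python with sqlite3, loaded with the sibling data shown above. The table names `rel_group` and `group_member` and the helper `related_to` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rel_group (
    group_id INTEGER PRIMARY KEY,
    rel_type TEXT NOT NULL
);
CREATE TABLE group_member (
    group_id INTEGER NOT NULL REFERENCES rel_group(group_id),
    person   TEXT    NOT NULL,
    PRIMARY KEY (group_id, person)
);
""")
conn.execute("INSERT INTO rel_group VALUES (123, 'siblings')")
conn.executemany(
    "INSERT INTO group_member VALUES (123, ?)",
    [(n,) for n in ("Bobby", "Peter", "Greg", "Cindy", "Jan", "Marsha")])

def related_to(person, rel_type):
    """Everyone who shares a group of the given type with `person`."""
    sql = """
        SELECT other.person
        FROM group_member AS me
        JOIN rel_group    AS g     ON g.group_id = me.group_id
        JOIN group_member AS other ON other.group_id = me.group_id
        WHERE me.person = ? AND g.rel_type = ? AND other.person <> me.person
        ORDER BY other.person
    """
    return [row[0] for row in conn.execute(sql, (person, rel_type))]
```

Calling `related_to("Jan", "siblings")` returns the other five names, with no need to store a row per pair of siblings.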

You may need multiple ways to store relationships, to account for all the different types.

Bill Karwin
@Bill - So you're suggesting I do something like http://drop.io/lqtuc46/asset/relationships-png?
Emilio
Right, that's what I'm suggesting.
Bill Karwin
Thanks. Curious on your thoughts to the 2 comments I put in response to @smartali89's answer below if you've got another min.
Emilio
Doing this will allow you maximal control over your relationships. For instance, you can use indexes to ensure that spousal relationships are singular, while allowing a person to have multiple replacements. Furthermore, it will allow you to attach more data to individual relationships if you want. However, this comes at a cost of increased development time over the simpler three-column table.
tster
@tster Wouldn't a unique key index on the person1+person2+relationshiptype triplet also ensure that spousal relationships are singular?
Emilio
@Emilio, That would block you from having n<->n relationships for other types (like "eats lunch with" or "is friends with").
tster
A: 

You may design a table with the following structure,

person1, relation, person2

Now, when inserting values into it, for example, if john is the husband of kelly, then

john, is husband of, kelly

and to record the same for kelly

kelly, is wife of, john

You will need to define the relationship for both persons, but it gives good results when fetching.
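A minimal sketch of this two-row approach in Python with sqlite3. The helper names `add_pair` and `relationships_of` are made up for the example, and wrapping both inserts in one transaction is an added assumption so that the two directions cannot drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE relationship (
    person1  TEXT NOT NULL,
    relation TEXT NOT NULL,
    person2  TEXT NOT NULL,
    PRIMARY KEY (person1, relation, person2)
)""")

def add_pair(a, relation, b, inverse):
    """Insert a->b and the inverse b->a together, or neither."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute("INSERT INTO relationship VALUES (?, ?, ?)", (a, relation, b))
        conn.execute("INSERT INTO relationship VALUES (?, ?, ?)", (b, inverse, a))

def relationships_of(person):
    """A single lookup on person1 finds every relationship for that person."""
    sql = """
        SELECT person1, relation, person2
        FROM relationship
        WHERE person1 = ?
        ORDER BY relation, person2
    """
    return conn.execute(sql, (person,)).fetchall()

add_pair("john", "is husband of", "kelly", inverse="is wife of")
```

Here `relationships_of("kelly")` finds the "is wife of" row with one lookup on `person1`.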

smartali89
@smartali89 - You said, "You will need to define relationship for both persons..." and as in your example you're inserting rows for the relationship in both directions. Does this create any referential integrity concerns? I suppose the alternative would be to just insert the row for a single direction and then in order to get the relationships for any given person, I'd have to run 2 queries - one selecting person1 = "bob" and another selecting person2 = "bob", correct? What are the downsides to this approach?
Emilio
@smartali89 - Continuing on, it seems that even though I'd run two queries versus one (I could index to make it fast), if I inserted records for just one direction instead of two I'd avoid potential RI issues later.
Emilio
The downside of inserting one row can be illustrated with this example: if "john is father of tom", then with your one-row approach, a search for any relationship for "tom" would return "john is father of tom", but the result should read "tom is son of john".
smartali89
Considering that having a single record avoids some potentially serious RI issues, and the added query time would be for a single index lookup, I would go with a single relationship.
tster
But if I insert just one row and do *two* queries (first on the person1 field and then on the person2 field, as I was indicating in my first comment), wouldn't that cover me?
Emilio
Yes, but it adds query time (very little) and typing time when you are making the queries (potentially substantial).
tster
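For reference, the single-row variant discussed in these comments could be sketched as follows in Python with sqlite3. The `relation_type` lookup table and its `inverse` column are assumptions added for the example; they let one query read a row from either side, so "tom" comes back as "is son of" rather than "is father of".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relation_type (
    relation TEXT PRIMARY KEY,
    inverse  TEXT NOT NULL      -- wording to use when read from person2's side
);
CREATE TABLE relationship (
    person1  TEXT NOT NULL,
    relation TEXT NOT NULL REFERENCES relation_type(relation),
    person2  TEXT NOT NULL,
    PRIMARY KEY (person1, relation, person2)
);
INSERT INTO relation_type VALUES ('is father of', 'is son of');
INSERT INTO relationship VALUES ('john', 'is father of', 'tom');
""")

def relationships_of(person):
    """One statement covering both directions of the single stored row."""
    sql = """
        SELECT person1, relation, person2
        FROM relationship
        WHERE person1 = ?
        UNION ALL
        SELECT r.person2, t.inverse, r.person1
        FROM relationship AS r
        JOIN relation_type AS t ON t.relation = r.relation
        WHERE r.person2 = ?
    """
    return conn.execute(sql, (person, person)).fetchall()
```

Only one row per relationship is stored, so there is no second row to fall out of sync; the cost is the extra join and the longer query.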
A: 

I ran into this situation recently and after trying out a couple different options ended up with something like this (pardon the pseudo-code model):

class Person {
    int Id;
    List<RelationshipMember> Relationships;
}

class RelationshipMember {
    int Id;
    Person RelatedPerson;
}

class Relationship {
    int Id;
    List<RelationshipMember> RelationshipMembers;
}

You can put properties on Relationship to model its type and properties on the RelationshipMember to model the role within the relationship if that's required.

And of course, this allows for threesomes, too. :)

On this particular project, I'm using an ORM tool (NHibernate with Fluent Automapping); here's how the database tables are expressed:

TABLE Person (
   Id int NOT NULL
)

TABLE Relationship (
   Id int NOT NULL
)

TABLE RelationshipMember (
   Id int NOT NULL,
   Relationship_id int NOT NULL,
   Person_id int NOT NULL
)
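A runnable sketch of this layout in Python with sqlite3, for anyone who wants to try it. The `Type` and `Role` columns are assumptions standing in for the optional properties mentioned above, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Id INTEGER PRIMARY KEY);
CREATE TABLE Relationship (
    Id   INTEGER PRIMARY KEY,
    Type TEXT NOT NULL              -- assumed column, not in the layout above
);
CREATE TABLE RelationshipMember (
    Id              INTEGER PRIMARY KEY,
    Relationship_id INTEGER NOT NULL REFERENCES Relationship(Id),
    Person_id       INTEGER NOT NULL REFERENCES Person(Id),
    Role            TEXT            -- assumed column, not in the layout above
);
""")
conn.executemany("INSERT INTO Person (Id) VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO Relationship VALUES (10, 'employment')")
conn.executemany(
    "INSERT INTO RelationshipMember (Relationship_id, Person_id, Role) "
    "VALUES (?, ?, ?)",
    [(10, 1, "employee"), (10, 2, "employer")])

def relationships_of(person_id):
    """Return (type, my role, other person, other role) for each co-member."""
    sql = """
        SELECT r.Type, me.Role, other.Person_id, other.Role
        FROM RelationshipMember AS me
        JOIN Relationship       AS r     ON r.Id = me.Relationship_id
        JOIN RelationshipMember AS other ON other.Relationship_id = r.Id
                                        AND other.Id <> me.Id
        WHERE me.Person_id = ?
    """
    return conn.execute(sql, (person_id,)).fetchall()
```

Because membership is one row per person, a relationship with three or more members needs no schema change.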
j campbell
@j - How would you represent this in the database? P.S. I like the threesome reference. :)
Emilio
Updated my answer to include the table layout.
j campbell
A: 

Make sure you include dates in the link table, since a relationship does not last forever.

**person**
person_id
name

**person_person**
person_id_1
person_id_2
relationship_type_id
begin_date
end_date

**relationship_type**
relationship_type_id
name
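
A minimal sketch of these three tables in Python with sqlite3, with a query for the relationships in effect on a given day. The sample rows and dates are invented, and treating a NULL `end_date` as "still in effect" is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE relationship_type (
    relationship_type_id INTEGER PRIMARY KEY,
    name                 TEXT NOT NULL
);
CREATE TABLE person_person (
    person_id_1          INTEGER NOT NULL REFERENCES person(person_id),
    person_id_2          INTEGER NOT NULL REFERENCES person(person_id),
    relationship_type_id INTEGER NOT NULL
        REFERENCES relationship_type(relationship_type_id),
    begin_date           TEXT NOT NULL,  -- ISO 8601 text compares in date order
    end_date             TEXT            -- NULL = still in effect (assumption)
);
INSERT INTO person VALUES (1, 'Jack'), (2, 'Jane'), (3, 'Rob');
INSERT INTO relationship_type VALUES (1, 'works for');
INSERT INTO person_person VALUES (1, 2, 1, '2008-01-01', '2009-06-30');
INSERT INTO person_person VALUES (1, 3, 1, '2009-07-01', NULL);
""")

def active_on(person_id, day):
    """Relationships of person_id_1 that were in effect on `day` (YYYY-MM-DD)."""
    sql = """
        SELECT t.name, p2.name
        FROM person_person AS pp
        JOIN relationship_type AS t
          ON t.relationship_type_id = pp.relationship_type_id
        JOIN person AS p2 ON p2.person_id = pp.person_id_2
        WHERE pp.person_id_1 = ?
          AND pp.begin_date <= ?
          AND (pp.end_date IS NULL OR pp.end_date >= ?)
    """
    return conn.execute(sql, (person_id, day, day)).fetchall()
```

With this data, the same person resolves to a different employer depending on the day passed in, which is what the date columns buy you.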
Randy
A: 

@bill K :

"If you have a three-column table that lists two people and a relationship type, it's better, but you still have a problem with reciprocal relationships."

Does the solution you first suggested (one table per relationship type) NOT suffer from that very same problem?

BTW your term ("reciprocal") is incorrect, imo. You are talking of relations (mathematical sense) that have the property of being symmetric. An area that theory leaves answered only very unsatisfactorily, as far as I know.

The three-column option is how it was done in my very first project, almost 30 years ago, and I believe it is still the best approach possible. Especially since "the possible/relevant set of inter-person relationship types" is, eurhm, a rather volatile kind of thing in any business I can imagine.

Erwin Smout