views:

52

answers:

4

Imagine you live in very simplified example land - and imagine that you've got a table of people in your MySQL database:

create table person (
    person_id int,
    name text
)

select * from person;

+-------------------------------+
|   person_id |            name |
+-------------------------------+
|           1 |           Alice |
|           2 |             Bob |
|           3 |           Carol |
+-------------------------------+

and these people need to collaborate/work together, so you've got a link table which links one person record to another:

create table person__person (
    person__person_id int,
    person_id int,
    other_person_id int
)

This setup means that links between people are uni-directional - i.e. Alice can link to Bob, without Bob linking to Alice and, even worse, Alice can link to Bob and Bob can link to Alice at the same time, in two separate link records. As these links represent working relationships, in the real world they're all two-way mutual relationships. The following are all possible in this setup:

select * from person__person;

+---------------------+-----------+--------------------+
|   person__person_id | person_id |    other_person_id |
+---------------------+-----------+--------------------+
|                   1 |         1 |                  2 |
|                   2 |         2 |                  1 |
|                   3 |         2 |                  2 |
|                   4 |         3 |                  1 |
+---------------------+-----------+--------------------+

For example, with person__person_id = 4 above, when you view Carol's (person_id = 3) profile, you should see a relationship with Alice (person_id = 1) and when you view Alice's profile, you should see a relationship with Carol, even though the link goes the other way.

I realize that I can do union and distinct queries and whatnot to present the relationships as mutual in the UI, but is there a better way? I've got a feeling that there is a better way, one where this issue would neatly melt away by setting up the database properly, but I can't see it. Anyone got a better idea?

+2  A: 

There is no better idea. Relational databases cannot enforce what you ask so you have to write a special query to retrieve data and a trigger to enforce the constraint.

To get the related persons for @person I would go for:

SELECT CASE person_id WHEN @person 
           THEN other_person_id 
          ELSE person_id
       END as related_person_id  
FROM person_person 
WHERE (  person_id=@person 
      OR other_person_id=@person)
Ovidiu Pacurar
Always put () around an 'or' in the where clause. Bad things will happen when you forget before adding more restraints! (+1 for the answer)
lexu
+3  A: 

I'm not sure if there is a better way to configure your tables. I think the way you have them is proper and would be the way I would implement it.

Since your relationship table can indicate unidirectional relationships, I would suggest treating them as such. In other words, for every relationship, I would add two rows. If Alice is collaborating with Bob, the table ought to be as follows:

select * from person__person;
+---------------------+-----------+--------------------+
|   person__person_id | person_id |    other_person_id |
+---------------------+-----------+--------------------+
|                   1 |         1 |                  2 |
|                   2 |         2 |                  1 |
+---------------------+-----------+--------------------+

The reason is because in a lot of ActiveRecord (Rails) like systems, the many-to-many table object would not be smart enough to query both person_id and other_person_id. By keeping two rows, ActiveRecord like objects will work correctly.

What you should do is then enforce the integrity of your data at the code level. Everytime a relationship is established between two users, two records should be inserted. When a relationship is destroyed, both records should be deleted. Users should not be allowed to establish relationships with themselves.

shadanan
This also has the merit of the business logic deciding that not all associations are bi-directional with no change needed to the schema.
msw
Done and done! Thanks very much - I'm going with this method, for all the reasons both user239098 and msw stated.
dflock
+2  A: 

There is no way I can see using simple relational concepts. You will have to add "business" code to enforce your person-person relationships.

  • One way might be to enforce a second relationship record using an insert trigger
  • then declare one of the records "primary" (e.g. the one where the person_id is smaller than the other _person_id)
  • then build views and glue code for your applications (select, update, delete) that access the data with this knowledge.
lexu
+2  A: 

You should find this post useful:

http://discuss.joelonsoftware.com/default.asp?design.4.361252.31

As @user posted you are generally better off creating two records per bidirectional relationship (Alice to Bob and Bob to Alice in your example). It makes querying much easier and it accurately reflects the relationships. If you do have true unidirectional relationships, it's the only way to fly.

Rob
Thanks, I read that thread; it basically confirmed the 'just store it twice' solution, followed by a load of shouting. I'm glad stackoverflow isn't like that :)
dflock