views:

107

answers:

2

I am just learning normalization, so please forgive me if this is a dumb question.

I have a table TBL_Users with the Primary Key being ID. To track who is friends with who my most recent thought was to do a table with two Foreign Keys, both of which are the other person's ID. However the more I think about this I can't help but think there has got to be a better way.

+----+------+
| ID | Name |
+----+------+
| 1  | Al   |
| 2  | Bob  |
+----+------+

That model means either I have to duplicate all the information or call the TBL_Friends twice.

IE if the table is

+----+--------+
| ID | Friend |
+----+--------+
| 1  | 2      |
| 2  | 1      |
+----+--------+

Then I have duplicated information and have to make two calls to add/delete friends.

On the other hand if I just do

+----+-----+
| ID | ID2 |
+----+-----+
| 1  | 2   |
| 3  | 1   |
| 4  | 1   |
+----+-----+

The situation seems to be even worse because I have to query the database twice any time I want to do anything, be it gather information or add/delete friends.

Surely there is a simpler solution I am overlooking?

+1  A: 

You don't need to use two queries, just use one query with an OR clause.

SELECT
    (CASE WHEN
       WHEN id1 = XXX THEN id2
       ELSE id1
    END) AS friend_id
WHERE
   id1 = XXX OR id2 = XXX

Where XXX is the ID of the user you're looking up. That fits the simple case you have provided.

If your model gets much more complex we can look at other solutions of tables and/or de-normalisation like your first solution.

catchdave
And I agree with @APC with his answer below. My solution is based on the assumption that [1,2] is equivalent to [2,1]. This may or may not be the case.
catchdave
A UNION would be better; that way you can index on ID1 and index on ID2 and the database can run both queries quickly. An OR makes it more difficult for it to use indexes.
vincebowdren
+1  A: 

The question you need to answer is this: are the following two statements equivalent?

  1. Bob is a friend of Al
  2. Al is a friend of Bob

It depends on context. In social networking sites Al and Bob are just nodes on a graph, and as long as there is a link between them that suffices.

But if Al is stalking Bob then Al might assert statement #1 as much as he likes, Bob is never going to agree with statement #2. Or consider an analogous statemen:

  1. Bob is the manager of Al
  2. Al is the manager of Bob

It is uncommon that both those statements can be true simultaneously but there are some complicated managerial structures out there.

In both these situations your first table does not contain duplicate data, because (1,2) is not the same as (2,1). If you do go for the second solution you ought to enforce a rule that if (1,2) exists, (2,1) cannot exist.

There are situations in which your first solution is the appropriate one and some in which the second is the right one. In other words, data modelling is hard :)

The key thing is, first get your logical model correct. Forget about the SQL until it comes to writing the queries. If your tables are designed correctly the SQL will flow. Or to put it another way, if you are finding it hard to write the query the chances are your data model is wrong.

APC
I am planning on doing an "add friend" step (which will be a different, but similar table). That should be simpler because as requests are accepted the info will be deleted from the requests table and added to the friends table. Thanks.
aslum