views:

32

answers:

4

I've read over a number of posts regarding DB table design for a common one-to-many / users-to-friends scenario. One post included the following:

USERS

* user_id (primary key)
* username

FRIENDS

* user_id (primary key, foreign key to USERS(user_id))
* friend_id (primary key, foreign key to USERS(user_id))

> This will stop duplicates (IE: 1, 2) from happening, but won't stop reversals because (2, 1) is valid. You'd need a trigger to enforce that there's only one instance of the relationship...

The bold portion motivated me to post my question: is there a difference between how SQL Server and MySQL handle these types of composite keys? Do both require this trigger that the poster mentions, in order to ensure uniqueness?

I ask, because up until this point I've been using a similar table structure in SQL Server, without any such triggers. Have I just luckily not run into this data duplication snake that's lurking in the grass?

+2  A: 

You could just use a check constraint that friend_id > user_id to prevent "reversals". This would enforce that it was not possible to enter a pair such as (2, 1) such a relationship would have to be entered as (1, 2).

Martin Smith
So 1,2 and 2,1 *are* considered two unique records? Pardon my ignorance.
PolishedTurd
And are they handled the same in both SQL Server and MySQL?
PolishedTurd
Solves the problem but would require some minor hoop jumping in the client code to always put the lower valued id in a specific column.
Thomas
OMG Ponies
@OMG Ponies - The check constraint simply enforces the rule. Barring database-side CRUD code (i.e. stored procs), *all* client-side code would have to have logic that posts the value into the right order which is obviously awkward for the app developers.
Thomas
@Thomas: if you used stored procedures to insert the data into the table, you could just rewrite the stored proc.
Quassnoi
@Quassnoi - Granted, which is why I excluded sp. If you are using sp for this type of thing, it obviously makes it significantly simpler to implement this sort of logic. That probably won't make the ORM crowd happy but it is still a viable solution and simpler than a trigger.
Thomas
+3  A: 

Yes, all DBMS will treat this the same. The reason is that the DBMS assumes that the column has meaning. I.e., the tuple is not comprised of meaningless numbers. Each attribute has meaning. user_id is assumed to have different meaning than friend_id. Thus, it is incumbent upon the designer to build a rule that claims that 1,2 is equivalent to 2,1.

Thomas
Wow, I can't tell you how many examples/tutorials I've seen that don't make any mention at all of this potential data duplicaton.
PolishedTurd
@PolishedTurd - There is no means by which the db could know that it is a duplication. It is the designer that states that they are equivalent. Imagine the two columns were `manager_id` and `subordinate_id`. The order makes a difference there. You might have 1,2 for one project and 2,1 for another project.
Thomas
This is making a lot more sense now. Man there's a lot of misleading learning material out there.
PolishedTurd
+2  A: 

If you friendship relationship is symmetrical, you need to add a CHECK(user_id < friend_id) into the table definition and insert the data like this:

INSERT
INTO    friends
VALUES  (
        (CASE user_id < friend_id THEN user_id ELSE friend_id END),
        (CASE user_id > friend_id THEN user_id ELSE friend_id END)
        )

In SQL Server, you can build a UNIQUE index on a pair of computed columns:

CREATE TABLE friends (orestes INT, pylades INT, me AS CASE WHEN orestes < pylades THEN orestes ELSE pylades END, friend AS CASE WHEN orestes > pylades THEN orestes ELSE pylades END)

CREATE UNIQUE INDEX ux_friends_me_friend ON friends (me, friend)

INSERT
INTO    friends
VALUES  (1, 2)

INSERT
INTO    friends
VALUES  (2, 1)
-- Fails

To fetch all friends for a given user, you need to run this query:

SELECT  friend_id
FROM    friends
WHERE   user_id = @myuser
UNION ALL
SELECT  user_id
FROM    friends
WHERE   friend_id = @myuser

However, in MySQL, it may be more efficient to always keep each both copies of each pair.

You may find these article interesting:

Quassnoi
True, but MySQL has CHECK constraints which it doesn't enforce on any engine.
OMG Ponies
@OMGPonies: of course. But if you always run the query correctly, you won't need the constraint.
Quassnoi
That article is great. Thanks.
PolishedTurd
+1  A: 

If relationship is symmetrical, then one alternative is to "define" the relationship as asymetrical in the database, but just always add both tuples every time you add either one.

You are basically saying "Nature of friendship is in DB assymetrical, A can be friend to B while B is not friend to A, but application will always add (or remove) BOTH records (a,B) and (B, A) anytime I add (remove) either. That simplifies the query logic as well since you don't have to look in both columns anymore. One extra insert / delete each time you modify data, but fewer reads when querying...

Charles Bretana