Is there a good way to implement a many-to-many relation between rows of a single table?

Example: table to store word synonyms:

-- list of words
CREATE TABLE word (
    id    integer      PRIMARY KEY,
    word  varchar(32)  NOT NULL UNIQUE
);
INSERT INTO word (id, word) VALUES (1, 'revolve');
INSERT INTO word (id, word) VALUES (2, 'rotate');

-- M:M link between words
CREATE TABLE word_link (
    word1  integer      REFERENCES word(id) NOT NULL,
    word2  integer      REFERENCES word(id) NOT NULL,
    PRIMARY KEY (word1, word2)
);

The obvious solution results in a table that is probably not in 1NF, because it contains duplicate data:

INSERT INTO word_link(word1, word2) VALUES (1, 2);
INSERT INTO word_link(word1, word2) VALUES (2, 1);

While the duplication can be dealt with by adding a (word1 < word2) check, that makes SELECTs much more complex (a union instead of a trivial join) and is pretty arbitrary. This specific case could benefit from an auxiliary table (such as 'meaning', so that words are M:N linked to a common meaning and not to each other, giving a cleaner schema), but I'm interested in a general solution.
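
For illustration, the auxiliary-table variant mentioned above could look roughly like this. It is only a sketch: the `meaning` and `word_meaning` tables and their columns are hypothetical names, not part of the schema shown earlier.

```sql
-- Hypothetical tables: each meaning groups the words that share it
CREATE TABLE meaning (
    id  integer  PRIMARY KEY
);

CREATE TABLE word_meaning (
    word_id     integer  NOT NULL REFERENCES word(id),
    meaning_id  integer  NOT NULL REFERENCES meaning(id),
    PRIMARY KEY (word_id, meaning_id)
);

INSERT INTO meaning (id) VALUES (1);
INSERT INTO word_meaning (word_id, meaning_id) VALUES (1, 1);  -- 'revolve'
INSERT INTO word_meaning (word_id, meaning_id) VALUES (2, 1);  -- 'rotate'

-- Synonyms of word 2: other words attached to any of its meanings
SELECT DISTINCT w.*
FROM word_meaning mine
JOIN word_meaning other ON other.meaning_id = mine.meaning_id
                       AND other.word_id <> mine.word_id
JOIN word w ON w.id = other.word_id
WHERE mine.word_id = 2;
```

Note that this makes every word attached to the same meaning a synonym of every other, which may or may not match what direct word-to-word links are meant to express.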

So is there a better (and hopefully common) way to implement such an M:M relation?

A: 

I'd create a view defined as follows:

select distinct
    case when word1 < word2 then word1 else word2 end as word1,
    case when word1 < word2 then word2 else word1 end as word2
from
    word_link

That way, you always have a clean, duplicate-free list that's easy to select from. I've found that's about as clean a way as there is to handle a many-to-many relationship.
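
As a rough sketch of how the select above might be wrapped and queried (the view name `word_link_canonical` is an assumption made for this example):

```sql
-- word_link_canonical is a hypothetical name for the view
CREATE VIEW word_link_canonical AS
SELECT DISTINCT
    CASE WHEN word1 < word2 THEN word1 ELSE word2 END AS word1,
    CASE WHEN word1 < word2 THEN word2 ELSE word1 END AS word2
FROM word_link;

-- All words linked to word 2, whichever column it ended up in
SELECT w.*
FROM word_link_canonical wl
JOIN word w
  ON (wl.word1 = 2 AND w.id = wl.word2)
  OR (wl.word2 = 2 AND w.id = wl.word1);
```

The view removes the mirrored rows, but a lookup for one word still has to check both columns, since the word may sit on either side of a link.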

Eric
Ummm... your CASE expressions use the same condition in both columns. I presume you meant word2 < word1 for the second one?
Nathan Koop
@Nathan: Same condition, different results. The first column always gets the smaller of the two ids, while the second column always gets the larger.
Eric
Ah, I understand now.
Nathan Koop
The problem isn't getting rid of duplicate word_link rows (a CHECK constraint can do all that work), but that the duplicates are required for queries: "SELECT w.* FROM word w JOIN word_link wl ON w.id=wl.word2 AND wl.word1=2" won't return (1, 'revolve'). So the choice is between bloated queries "... ON (w.id=wl.word2 AND wl.word1=2) OR (w.id=wl.word1 AND wl.word2=2)" and a bloated table with a set of consistency-enforcing triggers.
ymv
A: 

In this case I'd add a CHECK constraint, which is enforced on both INSERT and UPDATE, so that word1 is always less than word2. Each pair can then be stored in only one order.
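
A minimal sketch of that constraint, paired with a view that exposes each link in both directions so lookups can stay a plain join. The constraint name and the `word_link_symmetric` view are hypothetical, and the view is an addition for illustration rather than part of the suggestion itself:

```sql
-- Constraint name is hypothetical; existing rows must already satisfy it
ALTER TABLE word_link
    ADD CONSTRAINT word_link_ordered CHECK (word1 < word2);

-- Hypothetical view exposing each stored link in both directions.
-- UNION ALL is safe here: stored rows have word1 < word2 and the
-- swapped rows have word1 > word2, so the two halves never overlap.
CREATE VIEW word_link_symmetric AS
SELECT word1, word2 FROM word_link
UNION ALL
SELECT word2, word1 FROM word_link;

-- Words linked to word 2, written as a trivial join
SELECT w.*
FROM word_link_symmetric wl
JOIN word w ON w.id = wl.word2
WHERE wl.word1 = 2;
```

With only the row (1, 2) stored, the last query finds 'revolve' for word 2 without a mirrored (2, 1) row in the table. The constraint also rules out linking a word to itself.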

Felipe Lima