It's not clear whether you want to prevent duplicates from being inserted into the database. You might just want to fetch unique pairs, while leaving the duplicates in place.
So here's an alternative solution for the latter case, querying unique pairs even if duplicates exist:
SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
OR r2.person_1 IS NULL;
So if there is a matching row with the IDs reversed, the query needs a rule for which of the two rows to keep: it prefers the one whose IDs are in ascending numerical order (person_1 < person_2).
If there is no matching row, then all columns of r2 will be NULL (this is how an outer join works), so the query just returns whatever is found in r1 in that case.
No need to use GROUP BY or DISTINCT, because there can only be zero or one matching rows.
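A quick way to check this behaviour is the sketch below. It is not part of the original test in MySQL: it runs the same query through Python's sqlite3 module, with SQLite standing in for MySQL. The table definition (a composite primary key on person_1, person_2), the sample rows, and the helper name unique_pairs are all assumptions made for illustration.

```python
import sqlite3

# The query from above, unchanged apart from indentation.
UNIQUE_PAIRS_SQL = """
SELECT r1.*
FROM Relationships r1
LEFT OUTER JOIN Relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2
   OR r2.person_1 IS NULL
"""


def unique_pairs(rows):
    """Hypothetical helper: load rows into an in-memory table, run the query,
    and return the result sorted so it is easy to compare."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: the composite primary key is what guarantees that the
    # join finds at most one reversed row for each r1 row.
    conn.execute(
        "CREATE TABLE Relationships ("
        " person_1 INTEGER NOT NULL,"
        " person_2 INTEGER NOT NULL,"
        " PRIMARY KEY (person_1, person_2))"
    )
    conn.executemany("INSERT INTO Relationships VALUES (?, ?)", rows)
    result = sorted(conn.execute(UNIQUE_PAIRS_SQL).fetchall())
    conn.close()
    return result


# (1, 2) and (2, 1) are the same pair stored twice; (3, 4) and (6, 5) have
# no reversed counterpart.
sample = [(1, 2), (2, 1), (3, 4), (6, 5)]
print(unique_pairs(sample))  # [(1, 2), (3, 4), (6, 5)]
```

Of the duplicated pair, only (1, 2) survives. The pair (6, 5) is returned as stored even though its IDs are not in ascending order, because it has no reversed row and so r2 is NULL for it.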
Trying this in MySQL, I get the following EXPLAIN output:
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
|  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          |
|  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index |
+----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
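This seems to be a reasonably good use of indexes: the lookup into r2 is an eq_ref on the primary key and is satisfied from the index alone ("Using index"). The full scan of r1 (type ALL) is expected, since the query has to visit every row.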