I'm dealing with a database where items are "tagged" a certain number of times.
item (100k rows)
- id
- name
- other stuff
tag (10k rows)
- id
- name
item2tag (1,000,000 rows)
- item_id
- tag_id
- count
I'm looking for the fastest solution to:
Select the items that have been tagged with all of X, Y, and Z (where X, Y, and Z may be given as tag names).
Here's what I have so far... I'd just like to make sure I'm doing it in the best way possible:
First get the tag_ids from the names:
SELECT id FROM tag WHERE name IN ('X', 'Y', 'Z');
Then I filter item2tag by those tag_ids, group by item_id, and use HAVING to keep only the items that matched all three tags:
SELECT item_id, COUNT(tag_id)
FROM item2tag
WHERE tag_id = 1 OR tag_id = 2 OR tag_id = 3
GROUP BY item_id
HAVING COUNT(tag_id) = 3;
Then I can just select from item with those ids.
SELECT * FROM item WHERE id IN ([results from prior query])
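For comparison, the three steps above could also be collapsed into a single statement that joins item, item2tag, and tag. The following is only a minimal sketch: Python's sqlite3 module and an in-memory database serve purely as a harness to make it runnable, and the helper names (build_demo_db, items_with_all_tags), the column types, and the sample rows are all assumptions for illustration, not part of my real schema. It also assumes tag names are unique. Whether this form is any faster than the three separate queries depends on the engine and the available indexes, so it would need to be checked with the database's own query plan output.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory copy of the item / tag / item2tag layout (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE item2tag (
            item_id INTEGER,
            tag_id  INTEGER,
            "count" INTEGER,
            PRIMARY KEY (item_id, tag_id)
        );
        INSERT INTO item VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO tag VALUES (1, 'X'), (2, 'Y'), (3, 'Z');
        -- item 1 has X, Y, Z; item 2 has X, Y; item 3 has only Z
        INSERT INTO item2tag VALUES
            (1, 1, 5), (1, 2, 1), (1, 3, 2),
            (2, 1, 1), (2, 2, 1),
            (3, 3, 4);
    """)
    return conn


def items_with_all_tags(conn, tag_names):
    """Return the ids of items that carry every tag named in tag_names, using one query."""
    names = sorted(set(tag_names))  # duplicates would break the count comparison
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT i.id
        FROM item AS i
        JOIN item2tag AS it ON it.item_id = i.id
        JOIN tag AS t ON t.id = it.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY i.id
        HAVING COUNT(DISTINCT t.name) = ?
        ORDER BY i.id
    """
    # The final parameter is the number of distinct tags an item must match.
    return [row[0] for row in conn.execute(sql, (*names, len(names)))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(items_with_all_tags(conn, ["X", "Y", "Z"]))  # [1]
```

The HAVING clause does the same job as count(tag_id)=3 in my second query: an item survives only if the number of distinct matching tags equals the number of tags requested.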
I have millions of rows in item2tag, with an index on (item_id, tag_id). Is this going to be the fastest solution?