I have two tables defined like this:
Page(id), Index(page_id, word)
page_id in Index is a foreign key to Page so that each Page is connected to a group of Index entries. The Index table is a index for the Page table so that you can do fast text searching. E.g:
SELECT page_id FROM Index where word = 'hello'
Would select all page_id's for all pages containing the word 'hello'. But now I want to select all page_id's for pages that contain all of the words 'word1', 'word2' and 'word3'. The best query I can come up with for this is:
SELECT page_id
FROM Index
WHERE word IN ('word1', 'word2', 'word3')
GROUP BY page_id
HAVING COUNT(1) = 3;
It works, but I wonder if someone can think of an alternative more efficient query?
[edit] The above example is slightly simplified. In the real Index table word is replaced with a word_id column that references a Word table. But the basic approach is the same. The RDBMS is PostgreSQL and there is about 2m rows in the Index table and 20k rows in Page.