I have three tables: videos, videos_categories, and categories.

The tables look like this:

videos: video_id, title, etc...
videos_categories: video_id, category_id
categories: category_id, name, etc...

In my app, I allow a user to multiselect categories. When they do so, I need to return all videos that are in every selected category.

I ended up with this:

SELECT * FROM videos WHERE video_id IN (
    SELECT c1.video_id FROM videos_categories AS c1
    JOIN videos_categories AS c2
    ON c1.video_id = c2.video_id
    WHERE c1.category_id = 1 AND c2.category_id = 2
)

But for every category I add to the multiselect, I have to add a join to my inner select:

SELECT * FROM videos WHERE video_id IN (
    SELECT c1.video_id FROM videos_categories AS c1
    JOIN videos_categories AS c2
    ON c1.video_id = c2.video_id
    JOIN videos_categories AS c3
    ON c2.video_id = c3.video_id
    WHERE c1.category_id = 1 AND c2.category_id = 2 AND c3.category_id = 3
)

I can't help but feel this is really the wrong way to do this, but I'm stuck trying to see the proper way to go about it.

A: 

Here's a FOR XML PATH solution:

--Sample data
CREATE TABLE Video
    (
    VideoID int,
    VideoName varchar(50)
    )

CREATE TABLE Videos_Categories
    (
    VideoID int,
    CategoryID int
    )

INSERT  Video(VideoID, VideoName)
SELECT  1, 'Indiana Jones'
UNION ALL
SELECT  2, 'Star Trek'

INSERT Videos_Categories(VideoID, CategoryID)
SELECT 1, 1
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 2, 1
GO

--The query
;WITH   GroupedVideos
AS
(
SELECT  v.*,
     SUBSTRING(
        (SELECT  (', ') + CAST(vc.CategoryID AS varchar(20))
        FROM  Videos_Categories AS vc
        WHERE  vc.VideoID = v.VideoID
        AND   vc.CategoryID IN (1,2)
        ORDER BY vc.CategoryID
        FOR XML PATH('')), 3, 2000) AS CatList
FROM    Video AS v
)

SELECT  *
FROM    GroupedVideos
WHERE   CatList = '1, 2'

(Ignore everything below - I misread the question)

Try

WHERE c1.category_id IN (1,2,3)

or

...
FROM videos v
JOIN videos_categories vc ON v.video_id = vc.video_id
WHERE vc.category_id IN (1,2,3)

Multiple joins aren't at all necessary.

Edit: to put the solutions in context (I realize it's not obvious):

SELECT * 
FROM videos 
WHERE video_id IN 
(    SELECT c1.video_id 
FROM videos_categories AS c1
WHERE c1.category_id IN (1,2,3))

or

SELECT *
FROM videos v
JOIN videos_categories vc ON v.video_id = vc.video_id
WHERE vc.category_id IN (1,2,3)
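
To see what these two IN-based queries actually return, here is a small check using Python's built-in sqlite3 module. The data is made up for illustration (video 1 is in categories 1, 2 and 3; video 2 is only in category 1), and SQLite is an assumption here, not necessarily the database from the question.

```python
import sqlite3

# Hypothetical sample data, using the table names from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (video_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE videos_categories (video_id INTEGER, category_id INTEGER);
    INSERT INTO videos VALUES (1, 'Indiana Jones'), (2, 'Star Trek');
    INSERT INTO videos_categories VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

# Subquery form: one row per video that is in at least one listed category.
any_match = conn.execute("""
    SELECT video_id FROM videos
    WHERE video_id IN (
        SELECT video_id FROM videos_categories
        WHERE category_id IN (1, 2, 3)
    )
    ORDER BY video_id
""").fetchall()

# Join form: one row per matching (video, category) pair, so videos repeat.
joined = conn.execute("""
    SELECT v.video_id FROM videos v
    JOIN videos_categories vc ON v.video_id = vc.video_id
    WHERE vc.category_id IN (1, 2, 3)
    ORDER BY v.video_id
""").fetchall()

print(any_match)  # [(1,), (2,)]
print(joined)     # [(1,), (1,), (1,), (2,)]
```

Both forms match a video that is in any one of the listed categories, and the join form also repeats a video once per matching category.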
Aaron Alton
There's one small problem with that approach: it includes all videos that are in ANY of those categories. I want only the videos which are in ALL of those categories.
ironkeith
Hahaha...I was driving home and it hit me that I didn't read your question properly. I've amended the solution to include a FOR XML PATH variant. KM's solution is excellent as well.
Aaron Alton
+3  A: 

If this is the primary key:

 videos_categories: video_id, category_id

then a GROUP BY and HAVING should work, try this:

SELECT
    * 
    FROM videos 
    WHERE video_id IN (SELECT 
                           video_id
                           FROM videos_categories
                           WHERE category_id IN (1,2,3)
                           GROUP BY video_id
                           HAVING COUNT(video_id)=3
                      )
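
If the category list comes from a multiselect, the same GROUP BY / HAVING shape can be built for any number of categories. This is only a sketch using Python's sqlite3 module: the function name videos_in_all_categories and the sample rows are hypothetical, and it uses COUNT(DISTINCT category_id) as a guard for the case where (video_id, category_id) is not a primary key.

```python
import sqlite3

def videos_in_all_categories(conn, category_ids):
    """Return the video_ids that belong to every category in category_ids."""
    wanted = sorted(set(category_ids))  # drop duplicate selections
    if not wanted:
        return []
    # One "?" placeholder per category keeps the values parameterized.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT video_id
        FROM videos_categories
        WHERE category_id IN ({placeholders})
        GROUP BY video_id
        HAVING COUNT(DISTINCT category_id) = ?
        ORDER BY video_id
    """
    rows = conn.execute(sql, (*wanted, len(wanted))).fetchall()
    return [row[0] for row in rows]

# Hypothetical data: video 1 is in categories 1, 2, 3; video 2 only in 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos_categories (
        video_id INTEGER,
        category_id INTEGER,
        PRIMARY KEY (video_id, category_id)
    );
    INSERT INTO videos_categories VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

print(videos_in_all_categories(conn, [1, 2, 3]))  # [1]
print(videos_in_all_categories(conn, [1]))        # [1, 2]
```

The query text never changes shape as categories are added; only the number of placeholders and the count in the HAVING clause change.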
KM
I think you meant: HAVING COUNT(video_id)>2 but otherwise that will work perfectly. Thank you.
ironkeith
@ironkeith, good catch, "=3" is best
KM
A: 

Sounds similar to http://stackoverflow.com/questions/863053/sql-searching-for-rows-that-contain-multiple-criteria/863377#863377

To avoid having to add another join for each category (and hence changing the structure of the query), you can put the categories into a temp table and then join against that.

CREATE TEMPORARY TABLE query_categories(category_id int);
INSERT INTO query_categories(category_id) VALUES(1);
INSERT INTO query_categories(category_id) VALUES(2);
INSERT INTO query_categories(category_id) VALUES(3);

SELECT * FROM videos v WHERE video_id IN (
  SELECT vc.video_id FROM videos_categories vc
  JOIN query_categories qc ON vc.category_id = qc.category_id
  GROUP BY video_id
  HAVING COUNT(*) = 3
)

Although this is ugly in its own way, of course. You may want to skip the temp table and just say 'category_id IN (...)' in the subquery.
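
For what it's worth, the temp table also lets the query count the selected categories itself, so the literal 3 does not have to be kept in sync. A rough sketch with Python's sqlite3 module follows; the sample rows are invented, SQLite is an assumption, and comparing against a COUNT(*) subquery is a variation on the query above rather than the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (video_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE videos_categories (
        video_id INTEGER,
        category_id INTEGER,
        PRIMARY KEY (video_id, category_id)
    );
    INSERT INTO videos VALUES (1, 'Indiana Jones'), (2, 'Star Trek');
    INSERT INTO videos_categories VALUES (1, 1), (1, 2), (1, 3), (2, 1);
    CREATE TEMPORARY TABLE query_categories (category_id INTEGER PRIMARY KEY);
""")

# Load the user's selection; the primary key rejects duplicate categories.
conn.executemany(
    "INSERT INTO query_categories (category_id) VALUES (?)",
    [(1,), (2,), (3,)],
)

# A video qualifies when it matches as many rows as there are selections.
titles = [row[0] for row in conn.execute("""
    SELECT v.title
    FROM videos v
    WHERE v.video_id IN (
        SELECT vc.video_id
        FROM videos_categories vc
        JOIN query_categories qc ON vc.category_id = qc.category_id
        GROUP BY vc.video_id
        HAVING COUNT(*) = (SELECT COUNT(*) FROM query_categories)
    )
    ORDER BY v.title
""")]

print(titles)  # ['Indiana Jones']
```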

araqnid