views:

89

answers:

5

I have a two tables one for documents one for mapping to categories.

Documents  
id | document_name  
1  | somename.doc  
2  | anothername.doc  

Documents_to_categories  
cat_id | doc_id  
10     | 1       
10     | 2  
11     | 3  
12     | 1

Some documents can map to multiple categories. What I want is to be able to select documents that belong in multiple categories, like in a filtering scheme.

Basically in the script I have an array of document id's that were a result from a search. I need to filter those documents down based on categories.

This is what something like what I'm aiming for (I know it doesn't work but for example).

SELECT * 
 FROM Documents_to_categories A 
 JOIN Documents B ON A.doc_id = B.id 
WHERE B.id IN (6703,6614,2286) 
  AND A.cat_id = :ID0 
  AND A.cat_id = :ID1

Edit: Sorry for all those who answered, my first time posting, was unclear with question. Hopefully this is more clear on what I want.

A: 

To get the documents that ONLY appear in BOTH categories, you could use the ANSI SQL INTERSECT operator as follows:

SELECT Documents.document_name  
FROM Documents 
WHERE Documents.id IN
(
    SELECT Documents_to_categories.docid 
    FROM Documents_to_categories  
    WHERE Documents_to_categories.catid = 10
  INTERSECT
    SELECT Documents_to_categories.docid 
    FROM Documents_to_categories  
    WHERE Documents_to_categories.catid = 12
/*.. add additional filters */
)
doza
A: 

I want to filter by category 10 and 12, so only documents belonging to 10 and 12 return (in this case doc 1)

You need to generate SQL query from your app:

SELECT
  d.id,
  d.document_name
FROM
  Documents as d,
  Documents_to_categories dc1,
  Documents_to_categories dc2,
WHERE
  d.id = dc1.doc_id
  AND d.id = dc2.doc_id
  AND dc2.cat_id = 10
  AND dc1.cat_id = 12

This should work on all major database servers.

instead of selecting from a documents table, I want to select it from an array of document ID's.

Not sure what you mean. There is no array an entity in relational world (eg SQL). There is one which is called table or maybe set.

Dmytrii Nagirniak
isn't this 10 or 12, he wants 10 and 12.
RedDeckWins
The `IN` clause is a synonym for `OR` - your query currently will return documents with cat_id 10, 12, and both.
OMG Ponies
Yes. I used OR. Updated the question to use AND. Missed that small thing :)
Dmytrii Nagirniak
+1  A: 

Assuming you're passing in the list of document id's as a comma separated list, the simplest solution would be to use dynamic SQL:

L_CURSOR SYS_REFCURSOR;
L_QUERY  VARCHAR2(5000) DEFAULT 'SELECT d.document_name
                                   FROM DOCUMENTS d
                                   JOIN DOCUMENTS_TO_CATEGORIES c10 ON c10.doc_id = d.id
                                                                   AND c10.cat_id = 10
                                   JOIN DOCUMENTS_TO_CATEGORIES c12 ON c12.doc_id = d.id
                                                                   AND c12.cat_id = 12
                                  WHERE d.id IN (:document_list)'

BEGIN

  FOR I IN 0 .. (TRUNC(LENGTH(L_QUERY) / 255)) LOOP
    DBMS_OUTPUT.PUT_LINE(SUBSTR(L_QUERY, I * 255 + 1, 255));
  END LOOP;

  OPEN L_CURSOR FOR L_QUERY USING IN_DOCUMENT_LIST;
  RETURN L_CURSOR;

END;

What I want is to be able to select documents that belong in multiple categories, like in a filtering scheme. Ex: I want to filter by category 10 and 12, so only documents belonging to 10 and 12 return (in this case doc 1).

Using HAVING:

  SELECT d.document_name
    FROM DOCUMENTS d
    JOIN DOCUMENTS_TO_CATEGORIES dtc ON dtc.doc_id = d.id
   WHERE dtc.cat_id IN (10, 12)
GROUP BY d.document_name
  HAVING COUNT(DISTINCT dtc.cat_id) = 2

Using JOINs

SELECT d.document_name
  FROM DOCUMENTS d
  JOIN DOCUMENTS_TO_CATEGORIES c10 ON c10.doc_id = d.id
                                  AND c10.cat_id = 10
  JOIN DOCUMENTS_TO_CATEGORIES c12 ON c12.doc_id = d.id
                                  AND c12.cat_id = 12
OMG Ponies
Sorry I was unclear the first time, I've provided a (nonworking) example which hopefull will give you an idea of what I'm trying to achieve.
rochan
A: 

This gives you all documents that belong in at least one category, of which at least one category is either 10 or 12.

SELECT
  doc_id
FROM Documents_to_categories
WHERE doc_id IN (<your list of doc_ids>)
GROUP BY doc_id
HAVING COUNT(*) > 1
AND MAX(CASE WHEN cat_id IN (10, 12) THEN 1 ELSE 0 END) = 1;
lins314159
A: 
select a.doc_id, count(*)
from Documents_to_categories A 
where a.doc_id in (<doc_id list>)
and a.cat_id in (<cat_id list>)
group by doc_id
having count(*) = <cat_id list length>

will return a list of doc_ids that have an entry for every category in the cat_id list. Note that you also need to supply the length of the list for the having clause.

You can use this to retrieve all of the details required in an outer select, using the select above to populate an inlist for the doc_ids.

This ends up looking like :

select b.id, b.document_name, a,cat_id
from Documents_to_categories A,
Documents B
where a.doc_id = b.id
and b.id in (select mylist.doc_id from (
    select a.doc_id, count(*)
    from Documents_to_categories A 
    where a.doc_id in (<doc_id list>)
    and a.cat_id in (<cat_id list>)
    group by doc_id
    having count(*) = <cat_id list length>

) as mylist )
Steve De Caux
Hey, this is the solution I ended up using, simple and database agnostic. Thanks!
rochan
Feel free to upvote it as well !
Steve De Caux