I'm implementing a tagging system for a website. Each object can have multiple tags, and each tag can apply to multiple objects. This is accomplished by maintaining a table with two values per record: one for the id of the object and one for the tag.

I'm looking to write a query to find the objects that match a given set of tags. Suppose I had the following data (in [object] -> [tags]* format):

apple -> fruit red food
banana -> fruit yellow food
cheese -> yellow food
firetruck -> vehicle red

If I want to match (red), I should get apple and firetruck. If I want to match (fruit, food) I should get (apple, banana).

How do I write a SQL query to do what I want?

@Jeremy Ruten,

Thanks for your answer. The notation above was only meant to present some sample data; my database does have a table with 1 object id and 1 tag per record.

Second, my problem is that I need to get all objects that match all of the tags. Substituting an AND for your OR, like so:

SELECT object WHERE tag = 'fruit' AND tag = 'food';

Yields no results when run.

A: 

I'd suggest making your table have 1 tag per record, like this:

 apple -> fruit
 apple -> red
 apple -> food
 banana -> fruit
 banana -> yellow
 banana -> food

Then you could just

 SELECT object WHERE tag = 'fruit' OR tag = 'food';

If you really want to do it your way though, you could do it like this:

 SELECT object WHERE tag LIKE 'red' OR tag LIKE '% red' OR tag LIKE 'red %' OR tag LIKE '% red %';
yjerem
A: 

@Kyle: Your query should be more like:

SELECT object WHERE tag IN ('fruit', 'food');

Your query was looking for rows where the tag was both fruit AND food, which is impossible seeing as the field can only have one value, not both at the same time.
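
To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module. The table name object_tags and its two columns are assumptions for illustration (the question never names the table); the rows are the sample data from the question, one tag per record.

```python
import sqlite3

# Hypothetical single-table layout: one (object, tag) pair per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object_tags (object TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO object_tags VALUES (?, ?)",
    [
        ("apple", "fruit"), ("apple", "red"), ("apple", "food"),
        ("banana", "fruit"), ("banana", "yellow"), ("banana", "food"),
        ("cheese", "yellow"), ("cheese", "food"),
        ("firetruck", "vehicle"), ("firetruck", "red"),
    ],
)

# AND is evaluated per row, and no single row holds two tags at once.
and_rows = conn.execute(
    "SELECT object FROM object_tags WHERE tag = 'fruit' AND tag = 'food'"
).fetchall()

# IN behaves like OR: a row carrying either tag matches.
in_rows = conn.execute(
    "SELECT DISTINCT object FROM object_tags "
    "WHERE tag IN ('fruit', 'food') ORDER BY object"
).fetchall()

print(and_rows)  # []
print(in_rows)   # [('apple',), ('banana',), ('cheese',)]
```

Note that the IN version matches any of the listed tags, so cheese shows up because of its food tag.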

Steve M
A: 

@Steve M

Doesn't IN act as an OR in that instance? If so, it's not what I'm looking for. Since cheese has the tag food it would be returned as well, but since it's not also tagged fruit it's not supposed to be.

Kyle Cronin
+2  A: 

Given:

  • object table (primary key id)
  • objecttags table (foreign keys objectId, tagid)
  • tags table (primary key id)

    SELECT DISTINCT o.*
    FROM object o
    JOIN objecttags ot ON o.id = ot.objectid
    JOIN tags t ON ot.tagid = t.id
    WHERE t.name = 'fruit' OR t.name = 'food';

This seems backwards, since you want AND, but the two tags are never on the same row, so an AND yields nothing: a single row cannot be both fruit and food. Without the DISTINCT, this query would usually yield duplicates, because you get one row per object per matching tag.

If you really want an AND in this case, you need a GROUP BY and a HAVING clause whose count equals the number of tags ORed together in your query. For example:

SELECT o.name, count(*) AS tag_count
FROM object o
JOIN objecttags ot ON o.id = ot.objectid
JOIN tags t ON ot.tagid = t.id
WHERE t.name = 'fruit' OR t.name = 'food'
GROUP BY o.name
HAVING count(*) = 2;
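
As a rough check of the GROUP BY / HAVING idea, here is a sketch that builds the three tables in an in-memory SQLite database through Python's sqlite3 module. The column types, the numeric ids, and the composite primary key on objecttags are assumptions; the counting trick relies on each (objectid, tagid) pair appearing at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    -- Assumed: the composite key keeps (objectid, tagid) pairs unique.
    CREATE TABLE objecttags (
        objectid INTEGER REFERENCES object(id),
        tagid INTEGER REFERENCES tags(id),
        PRIMARY KEY (objectid, tagid)
    );
    INSERT INTO object VALUES
        (1, 'apple'), (2, 'banana'), (3, 'cheese'), (4, 'firetruck');
    INSERT INTO tags VALUES
        (1, 'fruit'), (2, 'red'), (3, 'food'), (4, 'yellow'), (5, 'vehicle');
    INSERT INTO objecttags VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 4), (2, 3),
        (3, 4), (3, 3),
        (4, 5), (4, 2);
""")

# Keep only rows for the wanted tags, then keep only the objects
# that matched both of them.
matches = conn.execute("""
    SELECT o.name
    FROM object o
    JOIN objecttags ot ON o.id = ot.objectid
    JOIN tags t ON ot.tagid = t.id
    WHERE t.name IN ('fruit', 'food')
    GROUP BY o.id, o.name
    HAVING COUNT(*) = 2
    ORDER BY o.name
""").fetchall()

print(matches)  # [('apple',), ('banana',)]
```

Cheese drops out because only one of its rows survives the WHERE clause, so its count is 1.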
DevelopingChris
+2  A: 

Oh gosh, I may have misinterpreted your original comment.

The easiest way to do this in SQL would be to have three tables:

1) Tags ( tag_id, name )
2) Objects (whatever that is)
3) Object_Tag( tag_id, object_id )

Then you can ask virtually any question you want of the data quickly, easily, and efficiently (provided you index appropriately). If you want to get fancy, you can allow multi-word tags, too (there's an elegant way, and a less elegant way, I can think of).

I assume that's what you've got, so this SQL below will work:

The literal way:

    SELECT obj 
      FROM object
     WHERE EXISTS( SELECT * 
                     FROM tags 
                    WHERE tag = 'fruit' 
                      AND oid = object_id ) 
       AND EXISTS( SELECT * 
                     FROM tags 
                    WHERE tag = 'food'
                      AND oid = object_id )

There are also other ways you can do it, such as:

SELECT oid
  FROM tags
 WHERE tag = 'food'
INTERSECT
SELECT oid
  FROM tags
 WHERE tag = 'fruit'
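
Here is a sketch that runs both approaches against the question's sample data for the tags fruit and food, using Python's sqlite3 module. The two-table layout (object with object_id and obj, tags with oid and tag) is inferred from the column names in the queries above and is an assumption. The columns are qualified with table aliases because SQLite otherwise treats a bare oid as a rowid alias on tables that do not declare such a column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout, inferred from the column names used in the queries.
    CREATE TABLE object (object_id INTEGER PRIMARY KEY, obj TEXT);
    CREATE TABLE tags (oid INTEGER, tag TEXT);
    INSERT INTO object VALUES
        (1, 'apple'), (2, 'banana'), (3, 'cheese'), (4, 'firetruck');
    INSERT INTO tags VALUES
        (1, 'fruit'), (1, 'red'), (1, 'food'),
        (2, 'fruit'), (2, 'yellow'), (2, 'food'),
        (3, 'yellow'), (3, 'food'),
        (4, 'vehicle'), (4, 'red');
""")

# One EXISTS per required tag: an object qualifies only if every
# subquery finds a matching row.
exists_rows = conn.execute("""
    SELECT o.obj
    FROM object o
    WHERE EXISTS (SELECT 1 FROM tags t
                  WHERE t.tag = 'fruit' AND t.oid = o.object_id)
      AND EXISTS (SELECT 1 FROM tags t
                  WHERE t.tag = 'food' AND t.oid = o.object_id)
    ORDER BY o.obj
""").fetchall()

# INTERSECT keeps only the object ids present in both result sets.
intersect_ids = sorted(conn.execute("""
    SELECT oid FROM tags WHERE tag = 'fruit'
    INTERSECT
    SELECT oid FROM tags WHERE tag = 'food'
""").fetchall())

print(exists_rows)    # [('apple',), ('banana',)]
print(intersect_ids)  # [(1,), (2,)]
```

Both forms need one extra subquery or SELECT per additional tag. Note also that INTERSECT is not available in every database engine, so check yours before relying on it.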
Matt Rogish
A: 

Combine Steve M.'s suggestion with Jeremy's and you'll get a single record with what you are looking for:

select object
from tblTags
where tag = @firstMatch
and (
       @secondMatch is null
       or
       object in (select object from tblTags where tag = @secondMatch)
    )

Now, that doesn't scale very well, but it will get what you are looking for. I think there is a better way to do this, so that you can match N tags without much impact on the code, but it currently escapes me.
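
One possible way to handle N tags is to build the IN list from the tag set and compare a distinct-tag count against its size. The sketch below uses Python's sqlite3 module; the helper name objects_with_all_tags is hypothetical, and it assumes tblTags holds one (object, tag) pair per row.

```python
import sqlite3

def objects_with_all_tags(conn, wanted):
    """Return the objects that carry every tag in `wanted` (hypothetical helper)."""
    wanted = sorted(set(wanted))  # duplicates would inflate the target count
    if not wanted:
        return []
    # One "?" placeholder per tag, so values are bound rather than concatenated.
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT object FROM tblTags "
        f"WHERE tag IN ({placeholders}) "
        "GROUP BY object "
        "HAVING COUNT(DISTINCT tag) = ? "
        "ORDER BY object"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

# Sample data from the question, one tag per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTags (object TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO tblTags VALUES (?, ?)",
    [
        ("apple", "fruit"), ("apple", "red"), ("apple", "food"),
        ("banana", "fruit"), ("banana", "yellow"), ("banana", "food"),
        ("cheese", "yellow"), ("cheese", "food"),
        ("firetruck", "vehicle"), ("firetruck", "red"),
    ],
)

print(objects_with_all_tags(conn, ["red"]))            # ['apple', 'firetruck']
print(objects_with_all_tags(conn, ["fruit", "food"]))  # ['apple', 'banana']
```

COUNT(DISTINCT tag) is used instead of COUNT(*) so that an accidentally duplicated row in tblTags does not produce a false match.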

Rob Allen
A: 

I recommend the following schema.

Objects: objectID, objectName
Tags: tagID, tagName
ObjectTag: objectID,tagID

With the following query.

select distinct
    objectName
from
    ObjectTag ot
    join Objects o
        on o.objectID = ot.objectID
    join Tags t
        on t.tagID = ot.tagID
where
    tagName in ('red','fruit')
jms