views:

188

answers:

3

Hi all,

I've got a web application that matches images to tags, and I need to create a way of dynamically refine results for tag search. However, I cannot find a clean way to make that SQL queries, and that's where I need your help.

The idea is that if I search for tags "clean" and "dog", I will have image results that have both the tags "clean" and "dog". If I also include the tag "little", my results would have to narrow down to images that have the three tags associated.

So, having an N-to-N relation, which is the correct way to do this?

My natural approach was generating code something like this, but I certainly don't like where it is going:

SELECT images.*
FROM images
INNER JOIN image_tags ON ...
INNER JOIN tags ON ...
WHERE tags.tag = @tag1
AND EXISTS
(
  SELECT 1
  FROM images 
  INNER JOIN image_tags ON ...
  INNER JOIN tags ON ...
  WHERE tag = @tag2
  AND EXISTS
  (
    SELECT 1
    FROM images 
    INNER JOIN image_tags ON ...
    INNER JOIN tags ON ...
    WHERE tag = @tag3
    AND EXISTS (...)
    ...
  )
)

Certainly, that's not really good. Any idea?

Thanks!

+4  A: 

Something like that could work (I use id for SELECT and GROUP BY, use the columns you need.

SELECT images.id
FROM images
INNER JOIN image_tags ON ...
INNER JOIN tags ON ...
WHERE tags.tag IN ( @tag1, @tag2, @tag3 )
GROUP BY images.id
HAVING COUNT(*) = @number_of_tags

If you have 3 tags like in your example then number_of_tags would have to be 3, and the join would result in 3 rows per id that matches.

You can either create that query dynamically, or define it with, say, 10 tags and initialize them with a value that won't occur in tags.

Peter Lang
This is pretty rigid in the number of tags allowed/required, as well as returning a row for each tag specified, rather than for each image.
md5sum
The `GROUP BY` should avoid returning a row for each tag. I edited the question to show how it would work with a dynamic number of tags.
Peter Lang
Thanks a lot! I haven't thought of re-checking the results with HAVING COUNT().
Alpha
A: 

I would not use an N-N relation but a text field to store the tags.

This may sound dirty as we are loosing normality but tags are generally used for text search only anyway and disk space is cheap.

You can then run

SELECT * FROM images WHERE tags LIKE '%clean%' AND tags LIKE '%dog%'...
Benoit Vidis
@Peter - Nice alphabetizing... d comes AFTER c... ;-)
md5sum
Note: Your solution would make counting images per tag and renaming or deleting tags more complicated.
Peter Lang
@md5sum: OMG! Sorry, I had to delete that post ;-)
Peter Lang
You could order you tags alphabetically when inserting. This would allow to do the SELECT with one LIKE: `SELECT * FROM images WHERE tags LIKE '%clean%dog%'`
Peter Lang
hah, that was just LoL funny... I had to say something.
md5sum
@md5sum: Really appreciated, made me laugh too when I read your post ;-)
Peter Lang
A: 

Using intersect you could do this:

SELECT images.* 
FROM images 
WHERE image_id IN 
  (
    SELECT image_id FROM image_tags WHERE tag_id =
      (SELECT tag_id FROM tags WHERE tag = @tag1)
    INTERSECT
    SELECT image_id FROM image_tags WHERE tag_id =
      (SELECT tag_id FROM tags WHERE tag = @tag2)
    INTERSECT
      ....
   )

This will select all images based on the intersection (matching all) tags in image_tags.

rosscj2533