Hi,

Situation:

In my database I have two tables called 'artists' and 'tags'.

Each artist has a set of tags, saved in a linking table 'artisttags'.

Each unique tag is saved in a table called 'tags'.

Problem

I would like to show all artists that have one (or more) tags in common with a given artist.

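function getSimilarArtists($artist_id)
{
   $sql = "SELECT ...";
   $query = mysql_query($sql);
   $html = ""; // initialise before appending to it
   while($artist = mysql_fetch_assoc($query))
   {
       // escape the name so it is safe to print as HTML
       $html .= "<li>".htmlspecialchars($artist['name'])."</li>";
   }
   print($html);
}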

Tables

artists

id | name

artisttags

id | artist_id | tag_id    

tags

id | tag_name

Any help is welcome.

Thanks

+2  A: 
SELECT DISTINCT a.name FROM artisttags at
LEFT JOIN artisttags at2
ON at2.tag_id = at.tag_id
LEFT JOIN artists a
ON at2.artist_id = a.id
WHERE at.id = '$artist_id'
milosz
thanks! little mistake though: at.id = '$artist_id' should be at.artist_id = '$artist_id'
Bundy
As symcbean said, inner joins would be more efficient, so you'd better use his answer ;)
milosz
Add the line GROUP BY similar.name before ORDER BY count(*) DESC
milosz
OK, and where should I provide the 'given artist id'?
Bundy
Add to WHERE clause: AND current.artist_id = '$artist_id'
milosz
+3  A: 

Those outer joins in milosz's reply are really going to hurt, and they will return every artist, not just those with "one (or more) tags in common". Use inner joins instead:

SELECT similar.name, count(*) as commontags
FROM artists current, 
  artisttags curtags,
  artisttags simtags,
  artists similar
WHERE current.id=curtags.artist_id
  AND curtags.tag_id=simtags.tag_id
  AND simtags.artist_id=similar.id
ORDER BY count(*) DESC;

Of course, for a smarter indexing system you could apply scoring to each tag in the tags table (e.g. based on user votes or cardinality) and sort your results by SUM(tag.score).
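To make the scoring idea concrete, here is a rough sketch rather than a definitive implementation. It assumes a hypothetical `score` column on the tags table (not part of the schema in the question) and uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run as-is. The aliases `cur` and `sim`, the function name and the sample rows are made up for illustration.

```python
import sqlite3

def similar_by_score(conn, artist_id):
    """Rank artists sharing tags with artist_id by the summed score of the shared tags."""
    sql = """
        SELECT sim.name, SUM(t.score) AS total_score
        FROM artisttags AS curtags
        JOIN artisttags AS simtags ON simtags.tag_id = curtags.tag_id
        JOIN artists AS sim ON sim.id = simtags.artist_id
        JOIN tags AS t ON t.id = curtags.tag_id
        WHERE curtags.artist_id = ?
          AND simtags.artist_id <> ?   -- assumption: leave out the artist itself
        GROUP BY sim.id, sim.name
        ORDER BY total_score DESC, sim.name
    """
    # Bound parameters instead of string interpolation keep the id safe to pass in.
    return conn.execute(sql, (artist_id, artist_id)).fetchall()

# Throwaway in-memory database; `score` is the hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag_name TEXT, score INTEGER);
    CREATE TABLE artisttags (id INTEGER PRIMARY KEY, artist_id INTEGER, tag_id INTEGER);
    INSERT INTO artists VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO tags VALUES (1, 'rock', 1), (2, 'shoegaze', 5);
    INSERT INTO artisttags (artist_id, tag_id) VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")

scored = similar_by_score(conn, 1)
print(scored)
```

With these sample rows Beta and Gamma each share exactly one tag with Alpha, so a plain count(*) would tie them; the weighted sum puts Gamma first because the tag it shares carries the higher score.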

HTH

C.

symcbean
I get this error: Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause
Bundy
and where should I provide the 'given artist id'?
Bundy
Add to the query GROUP BY similar.id, similar.name. Also add AND current.id=$currentArtist (with appropriate security checks).
SorcyCat
Thanks SorcyCat - that's right (slight brain fade on my part)
symcbean
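For anyone landing here later, this is roughly what the query looks like once the GROUP BY and the artist filter from the comments are folded in. It is only a sketch: it runs against an in-memory SQLite database through Python's sqlite3 module instead of MySQL, the aliases are renamed to `cur` and `sim`, and the extra condition that leaves the given artist out of its own results is my assumption, not something from the thread. The function name and sample rows are made up.

```python
import sqlite3

def get_similar_artists(conn, artist_id):
    """Return (name, commontags) for artists sharing at least one tag with artist_id."""
    sql = """
        SELECT sim.name, COUNT(*) AS commontags
        FROM artists AS cur
        JOIN artisttags AS curtags ON curtags.artist_id = cur.id
        JOIN artisttags AS simtags ON simtags.tag_id = curtags.tag_id
        JOIN artists AS sim ON sim.id = simtags.artist_id
        WHERE cur.id = ?
          AND sim.id <> cur.id         -- assumption: skip the artist itself
        GROUP BY sim.id, sim.name
        ORDER BY commontags DESC, sim.name
    """
    # The placeholder covers the "appropriate security checks" for the id.
    return conn.execute(sql, (artist_id,)).fetchall()

# Small sample data set matching the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, tag_name TEXT);
    CREATE TABLE artisttags (id INTEGER PRIMARY KEY, artist_id INTEGER, tag_id INTEGER);
    INSERT INTO artists VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma'), (4, 'Delta');
    INSERT INTO tags VALUES (1, 'rock'), (2, 'indie'), (3, 'jazz');
    INSERT INTO artisttags (artist_id, tag_id) VALUES
        (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (4, 3);
""")

similar = get_similar_artists(conn, 1)
print(similar)
```

Delta has no tag in common with Alpha, so the inner joins drop it from the result. In PHP the same idea applies: bind the artist id as a parameter rather than pasting it into the SQL string.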