ansaurus

Question

Answer 1

+1 A:

select 
   tagid 
from 
   taggings 
where 
   trackid in ([list of, or subquery for your selected tracks])

Disable all the tags, unless the result contains their id. Or disable all tags, then re-enable those that are returned by this query. You could also do some restructuring and convert it to a 'not in' query, but that would normally be slower.

krdluzni 2009-09-03 16:19:18

I think the point is to have the controls enabled and allowing the user to click them.

scottm 2009-09-03 16:24:20

Well, yes, the user should be able to click the tag-controls that "make sense". Whether the more efficient thing is to disable all and then enable some, or to enable all and then disable some, I don't know. Depends on what the most efficient query is; find tags are used together with the curent ones, or find those that are _not_ used with the current ones.

Flambino 2009-09-03 16:58:43

'Not In' queries are known to be inefficient since they usually require a full scan of the table, whereas 'In' only needs to search until the first hit. 'Not in' is easier to read and make sense of, but less efficient. How you handle the processing of that data after retrieval, I can't say without knowing more about your interface design.

krdluzni 2009-09-03 17:40:05

Thanks for the tip - I'll avoid the NOT IN condition. As for the processing of the data, that's pretty straight forward so long as I can get some tag-ids from the database in a good way.

Flambino 2009-09-03 18:19:13

Answer 2

A:

This probably isn't the most efficient:

SELECT 
    TagId --to disable
FROM Tags 
WHERE TagId NOT IN(
    SELECT Distinct TagId FROM Taggings WHERE TrackId IN (
     SELECT TrackId FROM Taggings WHERE TagId in (1, 2)
    ) 
)

Get all tracks matching currently selected tags.

Take all tags assigned to those tracks

Disable tags not in this list

Edit
What about this:

SELECT 
    DISTINCT TagId 
FROM 
    Taggings
GROUP BY 
    TagId,
    TrackId HAVING TrackId IN (
     SELECT 
      TrackId
     FROM 
      Taggings
     WHERE
      TagId in ( 1, 2)
    )

This should return all tags that should be enabled.

scottm 2009-09-03 16:32:27

Hmmm.. that crashed my MySQL-client :-)Trouble is, that I have thousands of tracks, and I need the interface to be really responsive. While your answer probably would work, it seems too brute force, so to speak.I'm thinking, that it should be possible to find the tag_ids that are used together, simply from the -- short -- list of selected tags, rather than a list of thousands of track_ids.

Flambino 2009-09-03 16:56:04

Yeah, I figured it was going to be a little rough. I don't think you'll be able to do it with just the tags. I do think you have to have the tracks to know what the remaining tag possibilities are.

scottm 2009-09-03 17:05:17

Yeah, you're probably right. But now I'm thinking of adding another table with the different tag combinations for a quicker lookup. I'll let this question sit here for a while though and see what happens. But thanks for the input so far!

Flambino 2009-09-03 18:14:54

@Flambino: That sparked off an idea... coming soon.

krdluzni 2009-09-03 19:08:46

@krdluzni: Looking forward to it@scottm: Thanks, that runs OK, but it's still a little slow. 1.63 seconds for 1000 tracks with 16 possible tags.To the both of you, thanks for your help, but there's a friend pointed out a nasty complication - see my comment up above by the original question.

Flambino 2009-09-04 00:31:53

Answer 3

A:

Do you have control over how tags are added to the tracks? If you can hook into that, you can create some extra metadata.

Essentially a many-to-many join between tags. Every time you add a tag, you add a record to this join table containing an old-tag/new-tag pair (with the lower id as the first value to reduce duplication), for every one of the old tags on that track. IIRC, there's a way to set up the table to ignore duplicate inserts, rather than throwing errors. You'll also have to manage this table during deletion of tags, which will be more time-consuming, but that's probably a rare event.

With the above in place you have a simple query to identify tags that are still relevant:

select 
   tag_a
from
   related_tags
where
   tag_b in ([tags_already_in_search])
union
select
   tag_b
from
   related_tags
where
   tag_a in([tags_already_in_search])

This solution essentially shifts some of the processing time to the point in time where tags are added and removed.

krdluzni 2009-09-03 19:19:27

Interesting! I'll try this out. And yes, I was thinking about creating a 2-column lookup table exactly like you describe, precisely because it'd move the server load to a different point in time, and because I have total control over when tags are added/removed/changed.

Flambino 2009-09-03 19:40:58

ansaurus

tags:

views:

answers:

How to find complementary tags?

related questions