I have a table with the columns id, bookmarkID, tagID. I want to fetch the top N bookmarkIDs for a given list of tags. Does anyone know a very fast solution for this? The table is quite large (12 million records). I am using MySQL.

A: 

It really depends on how the tag-to-bookmark relation is structured. Ideally, each tag maps to one or more bookmarks, which is essentially a huge reverse index from tags to bookmarks. If that is the case, you can fetch all rows that map the given tags to bookmarks and then apply a basic scoring function across the results.

You could probably base it on the Lucene scoring algorithm, combining the use/spread of the tag across the entire corpus, the density of tags for a given bookmark, and some sort of normalizing factor based on when it was bookmarked.
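
As a rough illustration of that idea, here is a minimal Python sketch of an IDF-style score, where rare tags count for more than common ones. It is not Lucene's actual formula. The function name `score_bookmarks`, the `(bookmark_id, tag_id)` row layout, and the smoothing are assumptions made for the example, and the tag-density and recency factors are left out.

```python
import math
from collections import defaultdict

def score_bookmarks(rows, query_tags, top_n):
    """Rank bookmarks by the summed rarity weight of the query tags they carry.

    rows: iterable of (bookmark_id, tag_id) pairs (hypothetical layout).
    """
    rows = list(rows)
    query_tags = set(query_tags)

    # Which distinct bookmarks carry each tag (the tag's spread in the corpus).
    bookmarks_per_tag = defaultdict(set)
    all_bookmarks = set()
    for bookmark_id, tag_id in rows:
        bookmarks_per_tag[tag_id].add(bookmark_id)
        all_bookmarks.add(bookmark_id)

    total = len(all_bookmarks)
    scores = defaultdict(float)
    for tag_id in query_tags:
        carriers = bookmarks_per_tag.get(tag_id, set())
        if not carriers:
            continue
        # Rare tags weigh more than common ones (smoothed inverse frequency).
        weight = math.log(1 + total / len(carriers))
        for bookmark_id in carriers:
            scores[bookmark_id] += weight

    # Highest score first; ties broken by bookmark id for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [bookmark_id for bookmark_id, _ in ranked[:top_n]]
```

For 12 million rows you would only pull the rows for the queried tags and keep the per-tag counts precomputed, rather than loading the whole table as this sketch does.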

Nick Gerakines
A: 

I mainly operate in MSSQL but I think something along the lines of this should work out for you:

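SELECT bookmarkID, COUNT(*) AS matchedTags
FROM myTable
WHERE tagID IN ('tag1', 'tag2', 'tag3')   -- one quoted value per tag, not a single string
GROUP BY bookmarkID                       -- one row per bookmark
ORDER BY matchedTags DESC, bookmarkID ASC -- assumes "top" means most matching tags
LIMIT 0, n                                -- replace n with the number of rows wanted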

I could be wrong though, please let me know :)
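
One way to read "top N" is "the bookmarks that match the most of the given tags". Under that assumption, here is a minimal sketch you can run without a MySQL server, using Python's standard-library `sqlite3` module on an in-memory table. The helper name `top_bookmarks`, the index name, and the sample rows are made up for the example, and SQLite's planner differs from MySQL's, so this shows the shape of the query rather than how it performs on 12 million rows.

```python
import sqlite3

def top_bookmarks(conn, tag_ids, n):
    """Return up to n bookmarkIDs ordered by how many of tag_ids they match."""
    if not tag_ids:
        return []
    # One "?" placeholder per tag, so every tag is bound as its own value.
    placeholders = ", ".join("?" for _ in tag_ids)
    sql = (
        "SELECT bookmarkID, COUNT(DISTINCT tagID) AS matched "
        "FROM myTable "
        f"WHERE tagID IN ({placeholders}) "
        "GROUP BY bookmarkID "
        "ORDER BY matched DESC, bookmarkID ASC "
        "LIMIT ?"
    )
    cursor = conn.execute(sql, [*tag_ids, n])
    return [row[0] for row in cursor]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, bookmarkID INTEGER, tagID TEXT)"
)
# Hypothetical index: lets the lookup by tag avoid scanning the whole table.
conn.execute("CREATE INDEX idx_tag_bookmark ON myTable (tagID, bookmarkID)")
conn.executemany(
    "INSERT INTO myTable (bookmarkID, tagID) VALUES (?, ?)",
    [(1, "tag1"), (1, "tag2"), (2, "tag1"), (3, "tag3"), (3, "tag1"), (3, "tag2")],
)
print(top_bookmarks(conn, ["tag1", "tag2", "tag3"], 2))  # prints [3, 1]
```

Bookmark 3 matches all three tags and bookmark 1 matches two, so they come out first. A composite index on (tagID, bookmarkID) is the usual starting point for this kind of lookup on a large table.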

Yoda