I'm working on a problem that requires caching paginated "search" results: http://stackoverflow.com/questions/347277/paginating-very-large-datasets
The search works as follows: given an item_id, I find the matching item_ids and their rank.
I'm willing to concede not showing my users any results past, say, 500. After 500, I'm going to assume they're not going to find what they're looking for... the results are sorted by match quality anyway. So I want to cache those 500 results, do the heavy lifting of the query only once, and still let users page through the results (up to 500).
Now, suppose I use an intermediate MySQL table as my cache... that is, I store the top 500 results for each item in a "matches" table with columns item_id (INTEGER), matched_item_id (INTEGER), and match_rank (REAL). The search then becomes extremely fast:
SELECT item.*
FROM matches
JOIN item ON item.id = matches.matched_item_id
WHERE matches.item_id = <item in question>
ORDER BY matches.match_rank DESC
LIMIT x, y
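For what it's worth, here is a sketch of how I'd declare that cache table. The table and column names come from above; the storage engine and the composite index are my assumptions. The idea of the index is that MySQL can resolve the equality on item_id, the ORDER BY on match_rank, and the join lookup on matched_item_id from the index alone, without touching the table rows:

```sql
-- Sketch only; engine choice and index layout are assumptions.
CREATE TABLE matches (
    item_id         INTEGER NOT NULL,
    matched_item_id INTEGER NOT NULL,
    match_rank      REAL    NOT NULL,
    PRIMARY KEY (item_id, matched_item_id),
    -- Covering index: equality prefix on item_id, then rank order
    -- for ORDER BY ... LIMIT, with matched_item_id for the join.
    KEY idx_item_rank (item_id, match_rank, matched_item_id)
) ENGINE=InnoDB;
```

Note the index stores match_rank ascending; scanning it backwards for ORDER BY ... DESC is still cheap, and newer MySQL versions support declaring a descending index explicitly.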
I'd have no problem reindexing items and their matches into this table as they are requested by clients, whenever the cached results are older than, say, 24 hours. Problem is, storing 500 results for each of N items (where N is ~100,000 to 1,000,000) makes this table rather large: 50,000,000 to 500,000,000 rows.
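To make the expiry concrete, here is one way I imagine the refresh could work. The per-item timestamp table (match_index_log here, a name I'm making up) and the transactional delete-and-repopulate are my additions, not something I've settled on:

```sql
-- Assumed freshness check: a small side table holding the time each
-- item was last reindexed, so "matches" itself stays narrow.
SELECT indexed_at < NOW() - INTERVAL 24 HOUR AS stale
FROM match_index_log
WHERE item_id = ?;

-- If stale (or missing), replace that item's rows atomically so
-- concurrent readers never see a half-built result set.
START TRANSACTION;
DELETE FROM matches WHERE item_id = ?;
INSERT INTO matches (item_id, matched_item_id, match_rank)
    /* ... the expensive search query here, LIMIT 500 ... */;
REPLACE INTO match_index_log (item_id, indexed_at) VALUES (?, NOW());
COMMIT;
```

The delete only touches one item's ~500 rows, so the transaction stays short even if the table overall is huge.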
Can MySQL handle this? What should I look out for?