views:

336

answers:

3

I have a dataset in MySQL where using limit is already an expensive query, and finding the number of results is expensive as well. Thus, I'd like to avoid doing another query to find the number of results. I cannot use MYSQL_CALC_FOUND_ROWS because the limit is inside a subquery:

SELECT * FROM items,
(
  SELECT
    item_id
  FROM
    (etc)
  WHERE
    some.field=<parameter>
  AND (etc)
  GROUP BY (something)
  ORDER BY (something_else) DESC
  LIMIT 15
) subset
WHERE item.id=subset.item_id

I could left join items and do away with the subquery, then be able to use MYSQL_CALC_FOUND_ROWS, however this is very, very, slow. I've tried all index optimizations and let's just assume it is out of the question.

This now becomes more a design question... how do I allow the user to page through this data when I don't know the last page? I only know if they've gone too far (eg: query returns no results).

+2  A: 

Here's a summary of an article by MySQL guru Baron Schwartz:

http://www.mysqlperformanceblog.com/2008/09/24/four-ways-to-optimize-paginated-displays/

Four Ways to Optimize Paginated Displays

  1. On the first query, fetch and cache all the results.

  2. Don't show all results. Not even Google lets you see the millionth result.

  3. Don't show the total count or the intermediate links to other pages. Show only the "next" link.

  4. Estimate how many results there are. Again, Google does this and nobody complains.

Bill Karwin
A: 

To reinforce that, the concept of "how many" is extremely transitory anyway; by the time the answer gets to the user, it could easily be wrong.

le dorfier
Very good point.. although in this case results are not likely to change. I've decided to compute "how many" and cache it (since it's a nice trivial piece of data it seems), but to not show results passed 500.
Sam
A: 

In reply to Bill Karwin:

I've spawned a new question on whether storing the cached results in a MySQL table, which would be a table with 500,000,000 rows (500 results per item, 1,000,000 items), is feasible. It can be found here.

Sam
this should be a comment on bill karwins answer because these two posts will most probably be spacially separated.
tharkun