views:

49

answers:

1

I have a table of data that is searchable and sortable, but likely to produce hundreds or thousands of results for broad searches. Assuming the user searches for "foo" and sorts the foos in descending price order I'd like to show a quick-jump select menu like so:

<option value="1">Page 1 ($25,000,000 - $1,625,000)</option>
<option value="2">Page 2 ($1,600,000 - $1,095,000)</option>
<option value="3">Page 3 ($1,095,000 - $815,000)</option>
<option value="4">Page 4 ($799,900 - $699,000)</option>
...

Is there an efficient way of querying for this information directly from the DB? I've been grabbing all of the matching records and using PHP to calculate the min and max value for each page which seems inefficient and likely to cause scaling problems.

The only possible technique I've been able to come up with is some way of having a calculated variable that increments every X records (X records to a page), grouping by that, and selecting MIN/MAX for each page grouping... unfortunately I haven't been able to come up with a way to generate that variable.

+2  A: 

While technically possible, I'd advise you not to waste time with this feature, and look for another approach.

First, some background information on pagination. Pagination is typically accomplished using ORDER BY ... LIMIT [offset], [rows]. In order for MySQL to provide you the rows at a given offset, it must read all the prior rows then throw them away. This is expensive and the cost increases with the offset increasing. (e.g. LIMIT 1000, 20 must read 1020 rows and then throw 1000 away). This is even worse when the ORDER BY cannot be accomplished using an index. This can result in the entire unlimited resultset being read, then filesorted, and then the first 1000 rows are thrown away to satisfy a LIMIT 1000, 20.

Now, let's address your specific problem. As shown, pagination isn't anything "magical"; it's a rather brute force way of paginating resultsets. Specifically, you don't know what page something falls on ahead of time without doing a sort on the entire resultset. To figure this the min and max per page you'd need to use temporary tables, and then calculate the per page min/max for your entire resultset from this temp table. If your data is at all volatile, then you'd need to recalculate this as often as the underlying data changes.

If the storage engine somehow stored what "page" each row was on for a given ordering, then you could accomplish this relatively easily. Unfortunately that is not the case.

Additionally, I'd question the usefulness of your approach to your users. Your example shows a nice distribution of prices. Can you be sure that you wouldn't end up with ranges like the following?

<option value="1">Page 1 ($1,005,000 - $1,004,000)</option>
<option value="2">Page 2 ($1,004,000 - $1,003,450)</option>
<option value="3">Page 3 ($1,003,450 - $1,003,387)</option>
<option value="4">Page 4 ($1,003,387 - $1,003,342)</option>

I'd question whether this would be any benefit to the user.

Suggestion

  • Keep your pagination simple
  • If you want users to be able to select a range of prices, then build that in as a filter - don't combine it with pagination.

Example

Search for something on Amazon and notice their results page. You'll get your search results simply paginated. On the left you will have a variety of filters to apply to your search to further refine it.

hobodave
Yeah that's essentially the conclusion I was coming to as well... just didn't know if I was missing something clever that would make it work nicely. FWIW I'm familiar with LIMIT/OFFSET in queries and the only reason it isn't currently being used is because we had to look at every row in the result in php to build the min/max per page anyway. Brings up another question though... if MySQL has to look at the first X matches anyway to offset them, is it that much worse to return everything every time and step through the set in PHP?
Ty W
@Ty: Yes. Anything PHP does, MySQL (C) does 10-100x faster.
hobodave
@Ty: Pagination optimization resource: http://www.percona.com/ppc2009/PPC2009_mysql_pagination.pdf
hobodave
thanks for the link, good info there
Ty W