I've got a dataset with over 100k rows, so it's not tiny, but not huge either. When paging through the results, it gets progressively slower as you go to higher pages. In other words, this query:

SELECT * FROM items WHERE public = 1 ORDER BY name LIMIT 0,10

executes much faster than

SELECT * FROM items WHERE public = 1 ORDER BY name LIMIT 10000,10

I have an index on name, and I used to have an index on public, but I removed it since it seemed to degrade performance even more.

Any ideas here? Is there an easy way to speed this up? I'm considering removing the ability to view the higher pages since nobody really browses past page 2 or 3, except robots, and there are easier ways for them to find that content.

+2  A: 

The large LIMIT problem:

Beware of large LIMITs. Using an index to sort is efficient if you only need the first few rows, even if some extra filtering takes place and you have to scan more rows via the index than the LIMIT requests. However, if you're dealing with a LIMIT query with a large offset, efficiency will suffer: LIMIT 1000,10 is likely to be much slower than LIMIT 0,10, because MySQL has to read and discard all of the skipped rows before returning the ten you asked for.

It's true that most users won't go past page 10 of the results, but search engine bots may very well do so; I've seen bots looking at page 200+ in my projects. Also, for many web sites, failing to take care of this makes a DoS attack very easy: request pages with very large offsets from a few connections and that is enough. If you do nothing else, make sure you block requests with overly large page numbers.
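If you want to confirm that the offset scan is what's hurting you, one way (just a diagnostic sketch, using the items table from the question) is to compare MySQL's handler counters before and after each query:

-- Reset the session counters, run the slow query, then inspect how
-- many index entries MySQL actually traversed.
FLUSH STATUS;
SELECT * FROM items WHERE public = 1 ORDER BY name LIMIT 10000,10;
SHOW SESSION STATUS LIKE 'Handler_read%';
-- Expect Handler_read_next to be at least ~10,010 here: MySQL walks
-- the name index through all 10,000 skipped rows (plus any rows
-- filtered out by public = 1) before returning your 10.

Running the same check with LIMIT 0,10 should show a count close to 10, which is the whole difference between the two queries.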

In some cases, for example when the results are static, it may make sense to precompute them so you can query by position. Instead of a query with LIMIT 1000,10 you would use WHERE position BETWEEN 1000 AND 1009, which is equally efficient for any position (as long as position is indexed).
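A minimal sketch of that approach, assuming the result set rarely changes; the position column, the user-variable backfill, and the index name are illustrative, not from the original:

ALTER TABLE items ADD COLUMN position INT;

-- Number the public rows in name order, starting at 0 so that
-- LIMIT 1000,10 maps onto positions 1000-1009.
SET @pos := -1;
UPDATE items
SET position = (@pos := @pos + 1)
WHERE public = 1
ORDER BY name;

CREATE INDEX idx_items_position ON items (position);

-- Equivalent of LIMIT 1000,10, but the index takes MySQL straight
-- to the first matching row instead of scanning past 1,000 entries:
SELECT * FROM items
WHERE position BETWEEN 1000 AND 1009
ORDER BY position;

The trade-off is that position has to be rebuilt whenever rows are added, removed, or renamed, which is why this only pays off for fairly static result sets.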



– Colin Hebert

Excellent answer. Thanks so much. – Micah