I have a table with over 1 million entries.

The problem is with the speed of the SELECT queries. This one is very fast:

SELECT * 
  FROM tmp_pages_data 
 WHERE site_id = 14294

Showing rows 0 - 29 (1,273,042 total, Query took 0.0009 sec)

And this one is very slow:

SELECT * 
  FROM tmp_pages_data 
 WHERE page_status = 0

Showing rows 0 - 29 (15,394 total, Query took 0.3018 sec)

The only index is on the id column, which none of these SELECTs use. So there is no index on site_id or page_status.

The 0.30-second query is very worrying, especially when there are thousands of requests.

So how can this be possible? What can I do to see what's slowing it down?

+3  A: 

What can I do to see what's slowing it down?

It's quite obvious what is slowing it down: as you've already pointed out, you don't have an index on the page_status column, and you should have one.

The only surprise is that your first query is so fast without the index. Looking at it more closely, it seems that whatever client you are running these queries on is adding an implicit LIMIT 30 that you aren't showing in your question. Because so many rows match, it doesn't take long to find the first 30 of them, at which point the scan can stop and return the result. Your second query, however, has far fewer matching rows, so the scan has to read much more of the table before it finds 30 of them. Adding the index would solve this problem and make your query almost instant.
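
To make the early-exit effect concrete, here is a minimal Python sketch of a full scan that stops once it has collected enough matches. It only models the idea and is not how MySQL is implemented; the `scan_with_limit` helper, the table size, and the 1-in-100 share of rows with `page_status = 0` are all made up for illustration.

```python
def scan_with_limit(rows, predicate, limit):
    """Scan rows in storage order and stop once `limit` matches are found.

    Returns the matching rows and the number of rows examined.
    """
    matches = []
    examined = 0
    for row in rows:
        examined += 1
        if predicate(row):
            matches.append(row)
            if len(matches) >= limit:
                break  # enough rows for the LIMIT, no need to read further
    return matches, examined


# Hypothetical table: every row belongs to one site, 1 row in 100 has page_status = 0.
table = [
    {"id": i, "site_id": 14294, "page_status": 0 if i % 100 == 0 else 1}
    for i in range(100_000)
]

_, common = scan_with_limit(table, lambda r: r["site_id"] == 14294, 30)
_, rare = scan_with_limit(table, lambda r: r["page_status"] == 0, 30)
_, absent = scan_with_limit(table, lambda r: r["page_status"] == 7, 30)

print(common, rare, absent)  # prints: 30 2901 100000
```

The common value is satisfied after 30 rows, the rare value needs about a hundred times as many reads, and a value that never occurs forces a scan of the whole table.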

Short answer: add an index on the column page_status.
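
As a rough illustration of what the index buys you on reads and what it costs on writes, here is a toy version in Python. It uses a plain dictionary from column value to row positions as a stand-in; MySQL's ordinary indexes are B-trees, and the names `index`, `lookup` and `update_status` are hypothetical.

```python
from collections import defaultdict
from itertools import islice

# Hypothetical table: 1 row in 100 has page_status = 0.
rows = [{"id": i, "page_status": 0 if i % 100 == 0 else 1} for i in range(100_000)]

# Toy "index": page_status value -> set of row positions holding that value.
index = defaultdict(set)
for pos, row in enumerate(rows):
    index[row["page_status"]].add(pos)


def lookup(value, limit):
    """Return up to `limit` matching rows without scanning the table."""
    positions = index.get(value, ())
    return [rows[pos] for pos in islice(positions, limit)]


def update_status(pos, new_status):
    """Change a row's page_status and keep the index in sync."""
    old_status = rows[pos]["page_status"]
    if old_status == new_status:
        return
    index[old_status].discard(pos)  # extra work the index adds to every update
    index[new_status].add(pos)
    rows[pos]["page_status"] = new_status


print(len(lookup(0, 30)))  # 30 rows, found without reading the other rows
print(lookup(7, 30))       # [] immediately, no full scan needed
```

Every update of the indexed column now has to touch the index as well as the row, which is the trade-off to measure before deciding the index is too expensive.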

Mark Byers
No, actually I can't add an index, since I run a lot of UPDATEs (over 100 per second), and an index would only slow down the whole script.
okaybmd
And I have tried other queries and they are fast as well, e.g. `SELECT * FROM tmp_pages_data WHERE page_type = 'subcat_list_level_1'` - Showing rows 0 - 29 (565 total, Query took 0.0009 sec) - again, there is no index on page_type.
okaybmd
@okaybmd: Then you might want to consider having an offline system for querying which you synchronize once per day so that you don't load your main system. Either that, or live with the slow query times.
Mark Byers
@okaybmd: Now try: `SELECT * FROM tmp_pages_data WHERE page_type = 'subcat_list_level_10000'` I bet that is not so fast...
Mark Byers
Ok, you're right. Your query is very slow: MySQL returned an empty result set (i.e. zero rows). (Query took 0.8105 sec.) But how come? There are no indexes on page_type!
okaybmd
@okaybmd: An analogy: You are in a room filled with millions of documents, 10 of which are about FooBar Co. They are not sorted in any particular order, so the FooBar Co documents may or may not be near each other. If I told you to find me any 10 documents which are **not** about FooBar Co, how long do you think it would take you? Now how long would it take you if I asked for any 10 documents that **are** about FooBar Co? The second request might take you a little longer, right? If the documents were sorted in alphabetical order by company, would that help you at all?
Mark Byers
Ok, but adding an index on 'page_status' is not an option. I'm using a script that does lots of UPDATEs (hundreds per second), and adding an index would slow it down at least 100 times.
okaybmd
Also, the tests were run from phpMyAdmin, which is why it says "Showing rows 0 - 29"; my actual script uses LIMIT 1000.
okaybmd
+1  A: 

Ok, from our discussion in the comments we now know that the db somehow knows that the first query will return all rows. That's why it's so fast.

The second query is slow because it doesn't have an index. OMG Ponies already stated that a normal index won't work because the column has too few distinct values. I'd just like to point you to 'bitmap indexes'. I haven't used them myself yet, but they are known to be designed for exactly this case.

Wolfgang
Ok, thanks very much. I have looked a bit into this and it looks like a great idea. However, MySQL doesn't support bitmap indexes. So now I have to consider switching to PostgreSQL...
okaybmd
After more research, PostgreSQL is actually slower than MySQL, so I don't think that's an option either
okaybmd