views:

98

answers:

4
SELECT count(*) c FROM full_view WHERE verified > ( DATE (NOW()) - INTERVAL 30 DAY)

If I run that query it takes a split second but if I switch the comparison operator around it takes eons. Now the first way the count = 0 and the second way the count = 120000, but if I just count the whole table that also takes microseconds.

But there is something something funky going on because if the query ever does finish it runs super quick thereafter. MySQL is caching the query or something right? Well, I don't want to depend on caches to make sure the website doesn't hang.

This seems nonsensical: if it can count everything greater than a certain date quickly, why should it take any longer to count the opposite? Either way it has to look through the whole table right? And all it needs to return is a number so bandwidth shouldn't be an issue.

Explain on the query:

1, 'SIMPLE', 'b', 'range', 'updated,verified_index', 'updated', '3', '', 28, 'Using where'`    
1, 'SIMPLE', 'l', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'xyz_main.b.loc_id', 1, 'Using index'
1, 'SIMPLE', 'f', 'ALL', '', '', '', '', 2214, ''

EDIT:

This may be of some interest, I found this info when I run the query:

Handler_read_rnd_next:

  • 254436689 (when doing less than)
  • 2 (for greater than)

Key_read_requests: 314393 vs 33 (33 is the biggest number for all of the stats when using greater than)

Handler_read_key: 104303 vs 1

Bypassing the view and running the query directly on the main table eliminates the slowness. So what do I need to do to speed it up? The view is essentially like this:

SELECT x, y, z, verified FROM table1 LEFT JOIN table2 on tab2_ID = table2.ID LEFT JOIN table3 on tab3_ID = table3.ID

SOLVED: Frankie led my in the right direction. The second joined table (the company table) was joined via the full text name of the companies. I only recently decided to add a integer key to that table. The name column was supposed to be indexed but I may have botched that. Anyway I re-organized everything. I converted the foreign key in the main table to match the integer ID of the company table rather than the full company name. I re-indexed those columns in each table, then I updated the view to reflect the new join point. Now it runs instantly in both directions. :) So I guess integer keys were the key. The problem is gone but still, I don't feel like my original question was really solved.

Thanks for your help guys.

+2  A: 

My guess is that the subtraction from the Date(Now()) is what's taking a long time to process. For values of verified that are already less than Date(Now()), the evaluation can be short-circuited because at that point it MUST be false (when comparing "greater than").

In the situation where you're comparing with "less than," the date time must be subtracted in every case, regardless of current value, since it has no way of logically concluding the expression to be true or false prior to evaluating the datetime subtraction

That's just a guess, though - take it with a grain of salt.

Dave McClelland
Using an explicit date also takes forever.
Moss
@Moss Interesting, maybe my idea is off-base then. How about just reversing the order of the comparison? Instead of `A < B`, do `B > A` and see if it takes as long - it should execute exactly the same (I would guess)
Dave McClelland
Interesting thought, but too late to try it now. I solved the problem with betters keys/indexing.
Moss
A: 

If you've an index on verified in the table then the more restrictive COUNT (the > one) will be quicker. The COUNT(*) without a WHERE clause can return quickly as the count can be gleaned from table/index statistics.

Will A
Putting the comparison inside the count is nicer syntax but it doesn't speed it up. And I do have an index on `verified`.
Moss
In fact this method makes it slow in both directions.
Moss
+1  A: 

It may be the case that there are statistics telling the database engine that there are no records for verified > 30 days ago. In that case, it doesn't even have to read the table at all, but rather get info from the stats histogram.

Strommy
+2  A: 

Please run the bellow query and post the results.

EXPLAIN SELECT count(*) c 
FROM full_view 
WHERE verified > ( DATE (NOW()) - INTERVAL 30 DAY)

The long forgotten EXPLAIN almost always brings something in! ;)


Edit 1:
This is probably the offensive line:

1, 'SIMPLE', 'f', 'ALL', '', '', '', '', 2214, ''

The ALL there states that there is a FULL table scan.

You can dig further into the Explain syntax on this diagram.

Do try to see where the differences go...


Edit 2:
This doc will sure make things much clearer on the Explain output. Please check it out.


Edit 3:
Step-by-step analys of the explain command.

1, 'SIMPLE', 'b', 'range', 'updated,verified_index', 'updated', '3', '', 28, 'Using where'`    
1 - id
SIMPLE - simple select, not using sub-queries
b - table name
range - only rows that are in a given range are retrieved, using an index
updated,verified_index - are both possible keys
updated - was the key eventually used
3 - key lenght
'' - this is the ref column and would show which columns or constants are compared to the index name in the key column to select rows from the table.
28 - number of rows mysql believes it must examine to execute the query
Using where - self explanatory
Frankie