views:

388

answers:

3

Hello,

Running the below select on a MySql table containing 1500000 rows will take approximately 5 mins 30 seconds.

SELECT * FROM my_table WHERE timestamp BETWEEN UNIX_TIMESTAMP('2008-04-23 01:37:02') AND  UNIX_TIMESTAMP('2008-04-23 01:37:03')

[Executed: 25/01/10 5:32:47 EST PM ] [Execution: 231094/ms] 

Converting and replacing the values returned by UNIX_TIMESTAMP function in the above query will dramatically reduce the duration :

SELECT UNIX_TIMESTAMP('2008-04-23 01:37:02'),  UNIX_TIMESTAMP('2008-04-23 01:37:03')

UNIX_TIMESTAMP('2008-04-23 01:37:02')     UNIX_TIMESTAMP('2008-04-23 01:37:03')    
----------------------------------------  ---------------------------------------- 
1208911022                                1208911023                               


SELECT * FROM my_table WHERE timestamp BETWEEN 1208911022 AND 1208911023

[Executed: 25/01/10 5:58:27 EST PM ] [Execution: 11875/ms] 

The type of the timestamp column is INT(11).

We are not discussing indexing here - I am not the owner of the database but I will ask for an index on that column.

I want to ask you why the huge duration diff between the two queries ?

It seems that every INT(11) value from timestamp column is converted to the type of the value returned by UNIX_TIMESTAMP !


UPDATE 1


MySql version :

SELECT VERSION()

5.1.23-rc-log

Explain results :

EXPLAIN SELECT * FROM my_table WHERE timestamp BETWEEN UNIX_TIMESTAMP('2008-04-23 01:37:02') AND  UNIX_TIMESTAMP('2008-04-23 01:37:03')

 id     select_type     table          type     possible_keys     key     key_len     ref     rows      Extra       
 -----  --------------  -------------  -------  ----------------  ------  ----------  ------  --------  ----------- 
 1      SIMPLE          my_table       ALL      (null)            (null)  (null)      (null)  15046061  Using where 

EXPLAIN SELECT * FROM my_table WHERE timestamp BETWEEN 1208911022 AND 1208911023

 id     select_type     table          type     possible_keys     key     key_len     ref     rows      Extra       
 -----  --------------  -------------  -------  ----------------  ------  ----------  ------  --------  ----------- 
 1      SIMPLE          my_table       ALL      (null)            (null)  (null)      (null)  15046061  Using where 



UPDATE 2


SELECT * FROM my_table WHERE timestamp >= UNIX_TIMESTAMP('2008-04-23 01:37:02') AND timestamp <= UNIX_TIMESTAMP('2008-04-23 01:37:03')

 [Executed: 26/01/10 10:29:52 EST AM ] [Execution: 264172/ms] 

EXPLAIN SELECT * FROM my_table WHERE timestamp >= UNIX_TIMESTAMP('2008-04-23 01:37:02') AND timestamp <= UNIX_TIMESTAMP('2008-04-23 01:37:03')

 id     select_type     table          type     possible_keys     key     key_len     ref     rows      Extra       
 -----  --------------  -------------  -------  ----------------  ------  ----------  ------  --------  ----------- 
 1      SIMPLE          my_table       ALL      (null)            (null)  (null)      (null)  15046061  Using where 

Seems that >= and <= is not making any difference - runtime is over 5 mins!

A: 

I'm no mySQL guru, but it looks like mySQL is not optimizing the BETWEEN part of the statement, but executing it anew for every row, or not using an index set for the column. (Which I find really odd, seeing as the result of the UNIX_TIMESTAMP operation is fixed, but I don't have another explanation.)

Can you try using >= and <= instead of BETWEEN and see whether that changes the times any?

Pekka
I will try your suggestion tomorrow. But how do you explain that when I put the fixed values it works faster ?
Adrian S.
@Adrian S.: Please run the `EXPLAIN` statements I posted in a comment to your question. The output of those will tell the full story.
Asaph
A: 

as it doesn't seem to be an index or "between" issue, perhaps the UNIX_TIMESTAMP function is being evaluated for comparison with each row. that is, its not considering the result to be a constant. if that's the case you can calculate the overhead of running the UNIX_TIMESTAMP function 1.5 million times:)

Don Dickinson
+2  A: 

I ran these two queries using MySQL's BENCHMARK() function:

mysql> SELECT BENCHMARK(15000000, 1208911022 BETWEEN 
UNIX_TIMESTAMP('2008-04-23 01:37:02') AND  UNIX_TIMESTAMP('2008-04-23 01:37:03'));
1 row in set (33.28 sec)

mysql> SELECT BENCHMARK(15000000, 1208911022 BETWEEN 1208911022 AND 1208911023);
1 row in set (0.52 sec)

It appears that MySQL isn't smart enough to factor out UNIX_TIMESTAMP() expressions, even though they should be constant. MySQL evaluates the functions during each iteration of the expression. So using this function was about 64 times slower in this test.

I'm running MySQL 5.1.41 on a Macbook 2.4GHz Intel Core 2 Duo.

I suggest that you convert the timestamps to their integer values before preparing the query.

Bill Karwin