views:

335

answers:

3

I have a table in a MySQL database from which I want to select the row with the closest timestamp to another given timestamp.

time is the timestamp column (an integer UNIX timestamp). I chose 1250710000 arbitrarily.

This is the query that I've come up with, and I'm wondering if there's a more efficient way to do it:

SELECT *, ABS(time - 1250710000) AS time_dist FROM table
 ORDER BY time_dist ASC LIMIT 1

Is this the best way to do it?

+1  A: 

As Evan said, the way you have it is fine. I would recommend an index on that timestamp field, so that MySQL can scan the smaller index rather than the whole table. Also, I would try some 'boxing' to see if the index can speed things up:

SELECT *, ABS(time - 1250710000) AS time_dist FROM table 
WHERE time between(1250610000,1250810000)
ORDER BY time_dist ASC LIMIT 1

The above limits to query to approximately +/- 1 day. You will have to do some benchmarks to see if the additional index scan (the where clause) is faster than computing ABS() on all entries in the table.

MadCoder
I dislike having arbitrary limits like that.
too much php
+1  A: 

Would it be more efficient to select the minimum time that is larger and the maximum time that is smaller then just abs those two. That should avoid having to operate on the entire table.

SELECT MAX(time) AS prev WHERE time < 1250710000;

SELECT MIN(time) AS next WHERE time > 1250710000;

SELECT MIN(ABS(prev), ABS(next));

My SQL isn't strong enough to combine those into one, and the overhead of three queries might kill any savings, but it might be possible.

Colin Coghill
+6  A: 

Assuming time is indexed, you can get the next record nearly for free:

SELECT * FROM table WHERE time > 1250710000 ORDER BY time LIMIT 1

And if I'm not wrong, the same should apply to the previous record, MySQL will just read the index in reverse order. Use an UNION of the two, order them by date diff and voila! The result will look like this

SELECT *
FROM
(
    (SELECT *, ABS(time - 1250710000) AS time_diff FROM table WHERE time > 1250710000 ORDER BY time ASC LIMIT 1)
    UNION ALL
    (SELECT *, ABS(time - 1250710000) AS time_diff FROM table WHERE time < 1250710000 ORDER BY time DESC LIMIT 1)
) AS tmp
ORDER BY time_diff
LIMIT 1

Ideally, instead of > and < you should use >= and <= and exclude the reference record using its primary id, to account for records sharing the same timestamp.

Josh Davis
Damn! I was just typing almost exactly that!
Sharkey
Great idea, but the reference timestamp (`1250710000` in this case) isn't in the same table.Having said that, I assume that this query is about the same in terms of efficiency?
heyitsme
**@cyouung:** This query is not the same in terms of efficiency. Your query does an `ABS(time - 125071000)` on *every single row*. As long as you have an index on `time`, this query will never read more than two rows.
too much php
Excellent. Thanks!
heyitsme