views:

4872

answers:

4

I have a table with index (autoincrement) and integer value. The table is millions of rows long.

How can I search if a certain number appear in the last n rows of the table most efficiently?

+2  A: 

Take advantage of SORT and LIMIT as you would with pagination. If you want the ith block of rows, use OFFSET.

SELECT val FROM big_table
where val = someval
ORDER BY id DESC
LIMIT n;

In response to Nir: The sort operation is not necessarily penalized, this depends on what the query planner does. Since this use case is crucial for pagination performance, there are some optimizations (see link above). This is true in postgres as well "ORDER BY ... LIMIT can be done without sorting " E.7.1. Last bullet

explain extended select id from items where val = 48 order by id desc limit 10;
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | items | const | PRIMARY       | PRIMARY | 4       | const |    1 | Using index | 
+----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+
Dana the Sane
You can't use MAX() or COUNT() in a WHERE clause.
Bill Karwin
Replaced incorrect code with a CASE statement.
Dana the Sane
Case doesn't strip out NULL's so I removed that example altogether.
Dana the Sane
In the current example the db engine will have to look through all n rows even if it finds the value at the first one. Not too omptimized.
Nir
In my non-stringent test, mysql 5.0.x uses an index, as above.
Dana the Sane
+6  A: 

Starting from the answer given by @chaos, but with a few modifications:

  • You should always use ORDER BY if you use LIMIT. There is no implicit order guaranteed for an RDBMS table. You may usually get rows in the order of the primary key, but you can't rely on this, nor is it portable.

  • If you order by in the descending order, you don't need to know the number of rows in the table beforehand.

  • You must give a correlation name (aka table alias) to a derived table.

Here's my version of the query:

SELECT `id`
FROM (
    SELECT `id`, `val`
    FROM `big_table`
    ORDER BY `id` DESC
    LIMIT $n
) AS t
WHERE t.`val` = $certain_number;
Bill Karwin
I would upvote this several times if I could, all the other answers are buggy. This is what I would've posted too.
Ray Hidayat
Note that the answer from @chaos to which I refer has been deleted.
Bill Karwin
Thanks! How many records will the engine have to scan to get the answer? Is it n? or m= the location of val from the top?
Nir
I assume it would have to scan only n rows. The ORDER BY...LIMIT can be done using an index, without scanning rows.
Bill Karwin
+1  A: 

because it is autoincrement, here's my take:

Select * from tbl 
where certainconditionshere 
and autoincfield >= (select max(autoincfield) from tbl) - $n
Michael Buen
I would just change the order of conditions in the WHERE clause.
Itay Moav
This answer assumes the last $n rows have contiguous id values.
Bill Karwin
A: 

I had the same problem. Follow my link to see a solution done completely in MySQL. PHP is only used to retrieve and display the final result.