views:

617

answers:

6

Will limiting a query to one result record, improve performance in a large(ish) MySQL table if the table only has one matching result?

for example

 select * from people where name = "Re0sless" limit 1

if there is only one record with that name? and what about if name was the primary key/ set to unique? and is it worth updating the query or will the gain be minimal?

A: 

I believe the LIMIT is something done after the data set is found and the result set is being built up so I wouldn't expect it to make any difference at all. Making name the primary key will have a significant positive effect though as it will result in an index being made for the column.

Owen Fraser-Green
+12  A: 

If the column has

a unique index: no, it's no faster

a non-unique index: maybe, because it will prevent sending any additional rows beyond the first matched, if any exist

no index: sometimes

  • if 1 or more rows match the query, yes, because the full table scan will be halted after the first row is matched.
  • if no rows match the query, no, because it will need to complete a full table scan
John Douthat
Will it prevent a full table scan, or will it cause the full table scan to stop early? If the record is not there, it'll still have to do a full, full table scan.
WW
Yes, you're right. It should have said it "can" prevent a full table scan. I'll adjust the post
John Douthat
A: 

If "name" is unique in the table, then there may still be a (very very minimal) gain in performance by putting the limit constraint on your query. If name is the primary key, there will likely be none.

DannySmurf
+1  A: 

To answer your questions in order: 1) yes, if there is no index on name. The query will end as soon as it finds the first record. take off the limit and it has to do a full table scan every time. 2) no. primary/unique keys are guaranteed to be unique. The query should stop running as soon as it finds the row.

jcoby
A: 

Yes, you will notice a performance difference when dealing with the data. One record takes up less space than multiple records. Unless you are dealing with many rows, this would not be much of a difference, but once you run the query, the data has to be displayed back to you, which is costly, or dealt with programmatically. Either way, one record is easier than multiple.

jle
+2  A: 

If you have a slightly more complicated query, with one or more joins, the LIMIT clause gives the optimizer extra information. If it expects to match two tables and return all rows, a hash join is typically optimal. A hash join is a type of join optimized for large amounts of matching.

Now if the optimizer knows you've passed LIMIT 1, it knows that it won't be processing large amounts of data. It can revert to a loop join.

Based on the database (and even database version) this can have a huge impact on performance.

Andomar