I have the following SQL query:

SELECT * FROM table WHERE field_1 <> field_2

Which index structure is best for keeping this query efficient: two separate indexes on field_1 and field_2, or a single index that includes both fields?

EDIT: The database is MySQL

+1  A: 

I imagine this may depend on which platform you are using, but on MS SQL Server, definitely one index.

Tao
Forgot to specify the database engine: MySQL
Enrico Detoma
A: 

It depends on your database engine, but generally it's best to assume that a query will only use one index per table. This would imply that a single index across both columns is likely to be best.

However, the only way to find out is to populate a table with dummy data and try it out. Make sure the dummy data is representative in how it is distributed: if, for example, 99% of the field_2 values are identical to each other, an index on that column may be worth much less.
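A rough sketch of that experiment in MySQL might look like the following. The table name `t`, the column types, and the index names are assumptions made for illustration (the question's `table` is a reserved word, so it is renamed here). The point is only to compare what EXPLAIN reports for each candidate once representative data has been loaded.

```sql
-- Hypothetical test table; names and types are assumptions.
CREATE TABLE t (
  id      INT AUTO_INCREMENT PRIMARY KEY,
  field_1 INT NOT NULL,
  field_2 INT NOT NULL
);

-- Load representative dummy data into t here before comparing plans.

-- Candidate A: a single composite index over both columns.
CREATE INDEX idx_f1_f2 ON t (field_1, field_2);
EXPLAIN SELECT * FROM t WHERE field_1 <> field_2;

-- Candidate B: two separate single-column indexes.
DROP INDEX idx_f1_f2 ON t;
CREATE INDEX idx_f1 ON t (field_1);
CREATE INDEX idx_f2 ON t (field_2);
EXPLAIN SELECT * FROM t WHERE field_1 <> field_2;
```

In the EXPLAIN output, the `key` column shows which index (if any) was chosen, and the `type` column shows the access method: `ALL` is a full table scan and `index` is a full scan of an index. Timing the real query against realistic data volumes is still the deciding test.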

Karl B
A: 

To be sure, I'd try all three options, but remember that you are writing to each index with every insert/update, so indexing both fields has to be more beneficial by a margin to compensate for the negative effect on write performance. Remember, it doesn't have to be perfect; it just has to be good enough to handle the system throughput without creating unacceptable UI latency.

What I'd try first is a single index on the field that has the most distinct values; i.e., if field_1 has 1000 different values and field_2 has only 20, put the index on field_1.

Charles Bretana
A: 

Here's a nice article about indexes and inequality matches:

http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

Alternatively, if your data is vast, you might consider using a trigger to set a bit in another column indicating whether the two columns match, and then search on that column. It all depends on your situation, of course.
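As a minimal sketch of the trigger idea in MySQL: this assumes a table named `t` with columns `field_1` and `field_2`, and the flag column, index, and trigger names are all made up for illustration. COALESCE keeps the flag at 0 when the comparison yields NULL, which matches the original query (it does not return such rows either).

```sql
-- Hypothetical flag column plus an index on it.
ALTER TABLE t
  ADD COLUMN fields_differ TINYINT NOT NULL DEFAULT 0,
  ADD INDEX idx_fields_differ (fields_differ);

-- Backfill the flag for rows that already exist.
UPDATE t SET fields_differ = COALESCE(field_1 <> field_2, 0);

-- Keep the flag current on every insert and update.
CREATE TRIGGER t_before_insert BEFORE INSERT ON t
FOR EACH ROW
  SET NEW.fields_differ = COALESCE(NEW.field_1 <> NEW.field_2, 0);

CREATE TRIGGER t_before_update BEFORE UPDATE ON t
FOR EACH ROW
  SET NEW.fields_differ = COALESCE(NEW.field_1 <> NEW.field_2, 0);

-- The original query then becomes a lookup on a single column.
SELECT * FROM t WHERE fields_differ = 1;
```

An index on a two-valued column is mostly useful when the value being searched for is rare, so this is worth checking against the real data distribution.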

davek
+1  A: 

If you have an enormous table, it is better to denormalize it: store the result of field_1 <> field_2 in a separate column, and update it on every insert/update of the corresponding row.
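One way this could look without any trigger, assuming a table named `t` with an `id` primary key and columns `field_1` and `field_2`: the application computes the flag in the same statement that writes the data. The column and index names and the literal values are hypothetical.

```sql
-- One-time schema change: the denormalized flag column and its index.
ALTER TABLE t
  ADD COLUMN fields_differ TINYINT NOT NULL DEFAULT 0,
  ADD INDEX idx_fields_differ (fields_differ);

-- On insert, the flag is supplied alongside the two values.
INSERT INTO t (field_1, field_2, fields_differ)
VALUES (5, 7, 5 <> 7);

-- On update, the flag is recomputed in the same statement,
-- here comparing the new field_1 value with the stored field_2.
UPDATE t
SET field_1 = 7,
    fields_differ = COALESCE(7 <> field_2, 0)
WHERE id = 1;

-- Reads filter on the stored flag instead of comparing the columns.
SELECT * FROM t WHERE fields_differ = 1;
```

Because the flag is written in the same statement as the fields, the two cannot drift apart for that row, but every code path that writes to the table has to remember to do this.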

Ilian Iliev
That is an option I thought of, but I would need a trigger to perform the comparison atomically and store it in the separate field (I use a PHP framework, so I don't have control over the update query), and I would like to avoid triggers in my database if possible.
Enrico Detoma
If you use an MVC framework, I assume you have some kind of insert/update method defined for your model, so you can do the calculation there and add the result to the data.
Ilian Iliev
+1  A: 

Indexes are not going to help you.

The database must do a table scan, as it is comparing two fields in the same row.

Phil Wallach