views: 275
answers: 3
The table has about 8 million rows, and there is a non-unique index on X.

SHOW INDEXES confirms a non-unique index on key name X: seq_in_index 1, collation A, cardinality 7850780, sub_part NULL, packed NULL, index_type BTREE.

Still, this query can take 5 seconds to run. The list of ints comes from another system, and I am not allowed to store them in a table, because they represent friendships on a social network.

Is there a faster way than a massive IN statement?

+12  A: 

You can convert your list of IDs into a temp table (or a table variable, if MySQL supports them) and join against it.

The temporary table only lives as long as your connection, so you're not permanently storing anything in a table.
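As a sketch (the real table and column names aren't shown in the question, so my_table, X, and friend_ids here are assumed):

```sql
-- Temporary table to hold the incoming list of IDs; it disappears
-- when the connection is closed, so nothing is stored permanently.
CREATE TEMPORARY TABLE friend_ids (id INT NOT NULL);

-- Load the list from the other system (batch these in practice).
INSERT INTO friend_ids (id) VALUES (3), (7), (42) /* , ... */;

-- Join against it instead of using a huge IN (...) list.
SELECT t.*
FROM my_table t
JOIN friend_ids f ON f.id = t.X;
```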

Michael Haren
+5  A: 

You could try storing them in a temporary table. The table wouldn't be stored in the database permanently, and I think the resulting join (assuming you index the temporary table as well) would be faster, since the server could process the indices together instead of doing a separate index lookup for each int in the IN clause. Of course, MySQL may already optimize the IN clause the same way when it knows it will be using an index, so it may not actually gain you anything. I would give it a try, though, and see if it is faster.
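A sketch of the indexed variant (my_table, X, and friend_ids are assumed names; the PRIMARY KEY gives the temporary table its own index for the join):

```sql
CREATE TEMPORARY TABLE friend_ids (
    id INT NOT NULL,
    PRIMARY KEY (id)  -- index the temp table too, as suggested above
);

INSERT INTO friend_ids (id) VALUES (3), (7), (42) /* , ... */;

SELECT t.*
FROM my_table t
JOIN friend_ids f ON f.id = t.X;
```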

tvanfosson
+4  A: 

As suggested by others, a temporary table is the most appropriate solution.

Be aware, though, that depending on the cardinality and the number of rows in your temporary table or IN() condition, the optimizer may still resort to a sequential scan, because sequential reads can be a lot faster than lots of random seeks in the index.
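Whether the optimizer actually uses the index or falls back to a full scan can be checked with EXPLAIN (my_table, X, and friend_ids are assumed names, as the question doesn't show the real query):

```sql
EXPLAIN
SELECT t.*
FROM my_table t
JOIN friend_ids f ON f.id = t.X;
-- In the output, type = ALL on my_table means a sequential scan;
-- type = ref with key = X means the non-unique index is being used.
```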

At this point it may be appropriate to consider redesigning the relations.

Jan Jungnickel
+1: Good points about the optimizer and design
Michael Haren
Yes, I am thinking of denormalizing the database such that this query would not be necessary.
Bemmu