views:

86

answers:

2

Hello, I have a very specific query. I tried lots of ways but i couldn't reach the performance i want.

SELECT *
FROM
    items
WHERE
    user_id=1
AND
    (item_start < 20000 AND item_end > 30000)

i created and index on user_id, item_start, item_end

this didn't work and i dropped all indexes and create new indexes

user_id, (item_start, item_end)

also this didn't work.

(user_id, item_start and item_end are int)

edit: database is MySQL 5.1.44, engine is InnoDB

+1  A: 

Do you need to SELECT *? If not, you can create a index on user_id, item_start, item_end with the fields you need in the SELECT-part as included columns. This all assuming you're using Microsoft SQL Server 2005+

Vidar Nordnes
I need all fields, maybe i can split the table in two tables (second table covers text and other big fields) but i don't want to do this for some specific reason.
jsonx
+2  A: 

UPDATE: per your comment below, you need all the columns in the query (hence your SELECT *). If that's the case, you have a few options to maximize query performance:

  1. create (or change) your clustered index to be on item_user_id, item_start, item_end. This will ensure that as few rows as possible are examined for each query. Per my original answer below, this approach may speed up this particular query but may slow down others, so you'll need to be careful.
  2. if it's not practical to change your clustered index, you can create a non-clustered index on item_user_id, item_start, item_end and any other columns your query needs. This will slow down inserts somewhat, and will double the storage required for your table, but will speed up this particular query.

There are always other ways to increase performance (e.g. by reducing the size of each row) but the primary way is to decrease the number of rows which must be accessed and to increase the % of rows which are accessed sequentially rather than randomly. The indexing suggestions above do both.

ORIGINAL ANSWER BELOW:

Without knowing the exact schema or query plan, the main performance problem with this query is that SELECT * forces a lookup back to your clustered index for every row. If there are large numbers of matching rows for a particular user ID and if your clustered index's first column is not item_user_id, then this will likley be a very inefficient operation because your disk will be trying to fetch lots of randomly distributed rows from teh clustered inedx.

In other words, even thouggh filtering the rows you want is fast (because of your index), actually fetching the data is slower. .

If, however, your clustered index is ordered by item_user_id, item_start, item_end then that should speed things up. Note that this is not a panacea, since if you have other queries which depend on different ordering, or if you're inserting rows in a differnet order, you could end up slowing down other queries.

A less impactful solution would be to create a covering index which contains only the columns you want (also ordered by item_user_id, item_start, item_end, and then add the other cols you need). THen change your query to only pull back the cols you need, instead of using SELECT *.

If you could post more info about the DBMS brand and version, and the schema of your table, and we can help with more details.

Justin Grant
i used EXPLAIN with "SELECT *" and "SELECT id" it examined about 16500 rows for each query. Indexes work good with between but this is different.
jsonx