tags:

views:

118

answers:

1

I have a SQL query that I'm trying to debug. It works fine for small sets of data, but in large sets of data, this particular part of it causes it to take 45-50 seconds instead of being sub second in speed. This subquery is one of the select items in a larger query. I'm basically trying to figure out when the earliest work date is that fits in the same category as the current row we are looking at (from table dr)

ISNULL(CONVERT(varchar(25),(SELECT MIN(drsd.DateWorked) FROM [TableName] drsd
  WHERE drsd.UserID = dr.UserID
        AND drsd.Val1 = dr.Val1
        OR (((drsd.Val2 = dr.Val2 AND LEN(dr.Val2) > 0) AND (drsd.Val3 = dr.Val3 AND LEN(dr.Val3) > 0) AND (drsd.Val4 = dr.Val4 AND LEN(dr.Val4) > 0))
        OR (drsd.Val5 = dr.Val5 AND LEN(dr.Val5) > 0)
        OR ((drsd.Val6 = dr.Val6 AND LEN(dr.Val6) > 0) AND (drsd.Val7 = dr.Val7 AND LEN(dr.Val2) > 0))))), '') AS WorkStartDate,

This winds up executing a key lookup some 18 million times on a table that has 346,000 records. I've tried creating an index on it, but haven't had any success. Also, selecting a max value in this same query is sub second in time, as it doesn't have to execute very many times at all.

Any suggestions of a different approach to try? Thanks!

+4  A: 

Create a composite index on drsd (UserID, DateWorked).

It is also possible that the record distribution in drsd is skewed towards the greater dates, like this:

DateWorked   Condition

01.01.2001   FALSE
02.01.2001   FALSE
…
18.04.2010   FALSE
19.04.2010   TRUE

In this case, the MAX query will need to browse over only 1 record, while the MIN query will have to browse all records from 2001 and further on.

In this case, you'll need to create four separate indexes:

UserId, Val1, DateWorked
UserId, Val2, Val3, Val4, DateWorked
UserId, Val5, DateWorked
UserId, Val6, Val7, DateWorked

and rewrite the subquery:

SELECT  MIN(dateWorked)
FROM    (
        SELECT  MIN(DateWorked) AS DateWorked
        FROM    drsd
        WHERE   UserID = dr.UserID
                AND Val1 = dr.Val1
        UNION ALL
        SELECT  MIN(DateWorked)
        FROM    drsd
        WHERE   UserID = dr.UserID
                AND drsd.Val2 = dr.Val2 AND LEN(dr.Val2) > 0
                AND drsd.Val3 = dr.Val3 AND LEN(dr.Val3) > 0
                AND drsd.Val4 = dr.Val4 AND LEN(dr.Val4) > 0
        UNION ALL
        SELECT  MIN(DateWorked)
        FROM    drsd
        WHERE   UserID = dr.UserID
                AND drsd.Val5 = dr.Val5 AND LEN(dr.Val5) > 0
        UNION ALL
        SELECT  MIN(DateWorked)
        FROM    drsd
        WHERE   UserID = dr.UserID
                AND drsd.Val6 = dr.Val6 AND LEN(dr.Val6) > 0
                AND drsd.Val7 = dr.Val7 AND LEN(dr.Val7) > 0
        ) q

Each query will use its own index and the final query will just select the minimal of the four values (which is instant).

Quassnoi
Thanks! That did the trick.
Spencer