The following query takes about 1 minute to run, and has the following IO statistics:
SELECT T.RGN, T.CD, T.FUND_CD, T.TRDT, SUM(T2.UNITS) AS TotalUnits
FROM dbo.TRANS AS T
JOIN dbo.TRANS AS T2 ON T2.RGN=T.RGN AND T2.CD=T.CD AND T2.FUND_CD=T.FUND_CD AND T2.TRDT<=T.TRDT
JOIN TASK_REQUESTS AS T3 ON T3.CD=T.CD AND T3.RGN=T.RGN AND T3.TASK = 'UPDATE_MEM_BAL'
GROUP BY T.RGN, T.CD, T.FUND_CD, T.TRDT
(4447 row(s) affected) Table 'TRANSACTIONS'. Scan count 5977, logical reads 7527408, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'TASK_REQUESTS'. Scan count 1, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 58157 ms, elapsed time = 61437 ms.
If I instead introduce a temporary table then the query returns quickly and performs less logical reads:
CREATE TABLE #MyTable(RGN VARCHAR(20) NOT NULL, CD VARCHAR(20) NOT NULL, PRIMARY KEY([RGN],[CD]));
INSERT INTO #MyTable(RGN, CD) SELECT RGN, CD FROM TASK_REQUESTS WHERE TASK='UPDATE_MEM_BAL';
SELECT T.RGN, T.CD, T.FUND_CD, T.TRDT, SUM(T2.UNITS) AS TotalUnits
FROM dbo.TRANS AS T
JOIN dbo.TRANS AS T2 ON T2.RGN=T.RGN AND T2.CD=T.CD AND T2.FUND_CD=T.FUND_CD AND T2.TRDT<=T.TRDT
JOIN #MyTable AS T3 ON T3.CD=T.CD AND T3.RGN=T.RGN
GROUP BY T.RGN, T.CD, T.FUND_CD, T.TRDT
(4447 row(s) affected) Table 'Worktable'. Scan count 5974, logical reads 382339, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'TRANSACTIONS'. Scan count 4, logical reads 4547, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#MyTable________________________________________________________________000000000013'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 1420 ms, elapsed time = 1515 ms.
The interesting thing for me is that the TASK_REQUEST table is a small table (3 rows at present) and statistics are up to date on the table. Any idea why such different execution plans and execution times would be occuring? And ideally how to change things so that I don't need to use the temp table to get decent performance?
The only real difference in the execution plans is that the temp table version introduces an index spool (eager spool) operation.