views:

31

answers:

1

I have a stored procedure that uses a FAST_FORWARD cursor to chronologically loop over a set of ~300k records and assigns them to declaration sets based on the state of a lot of running variables and flags, partly implemented as table variables. I have given a lot of thought on how to do this set-based and I just can't do it. So I am stuck with the cursor approach, and need to optimize this code.

What I have noticed is that the first 10% of progress is loaded and processed very quickly (2000 rows/sec), near 20% progress it has slowed to about 300 rows/sec and in the end it has slowed down to about 60 rows/sec.

IMO this can be due to 4 reasons:

  • The cursor slows down, which I would think unlikely with a FAST_FORWARD cursor
  • The processing slows down. For my "groups" I am using table variables in which I am inserting, updating and deleting. At any given moment there are max. about 10 rows in those variables.
  • The inserting into target tables slows down. I don't see why this would be, I have no triggers defined on them and they are just ordinary tables.
  • Evil magic

Either that or my percentage counter is broken:

SET @curprogress = @curprogress + 1
IF (@curprogress - ((@totprogress / 100) * (FLOOR(@curprogress * 100 / @totprogress)))) BETWEEN 0 AND 1 BEGIN
    SET @msg = CAST(FLOOR(@curprogress * 100 / @totprogress) AS VARCHAR)
    RAISERROR('%s%s', 0, 1, @msg, '%...') WITH NOWAIT;
END

Has anybody any clue what to look for and how to go on speeding up this query?

Symbolic exerpt of my code:

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- Update enddate of parent
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID

    -- close and save child
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- delete child from running table
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
END
A: 

I could think of lots of reasons this is slowing down, but it is hard to narrow it down without knowing the cardinality of your data.

Random observations:

  1. Your format_message() function is surely a dog; all UDFs are. But how many rows are you inserting per pass?
  2. @runningtable_main never gets cleared out.
  3. updates are expensive
  4. deletes are expensive. If you use temp tables and rework your implementation, you can truncate instead of delete

To figure some of it out for yourself, add instrumentation:

DECLARE @now DATETIME, @duration INT, @rowcount INT

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    PRINT '---------------'

    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    SET @now = GETDATE()
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into @logtable, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- Update enddate of parent
    SET @now = GETDATE()
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) updated in @runningtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- close and save child
    SET @now = GETDATE()
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into outputtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- delete child from running table
    SET @now = GETDATE()
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) deleted from @runningtable_sub, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT
END

If you included your entire code (including the cursor declaration), rather than just a symbolic excerpt, someone here could probably rework it for you to be much more efficient and/or avoid cursors all together.

Peter
I can't post the whole code, will try to post more though. The main table gets cleared out elsewhere. I am inserting between 0-3 rows per pass. Are updates and deletes also expensive on table variables? The cursor is on a view which normally loads in 10-16 seconds. I changed the cursor to a table variable and a `WHILE` loop, but no change. So the problem must be somewhere in the body.
littlegreen
It turned out to be the `UPDATE FROM .. LEFT JOIN` statement, using your timing instrumentation. It also goes slower with every row. Strangely the problem has now disappeared by itself, even though I made no noticable changes to the SP. I am at a loss, but I rewrote the procedure with a grouped input so I don't have to use running tables anymore but work with running variables. This is a lot faster in itself and there are no `UPDATE` statements anymore. I'll accept your answer for the instrumentation bit.
littlegreen