views:

1076

answers:

2

We are trying to set up a cursor to run through records generated from a join between two 'instances' of the same huge table (more than 150 M records).

The following exception message comes out:

Could not allocate space for object 'dbo.SORT temporary run storage: 165282123350016' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

Do any of you know the reason for this? Or how to make the query below below more efficient?

I have found that it occurs somewhere between DECLARE CURSOR and the first FETCH NEXT, but I do not know yet if it is between...

  • DECLARE CURSOR and OPEN

or between

  • OPEN and the first FETCH NEXT.

More details: The sql statement looks like:

DECLARE cData CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
  SELECT ...
  FROM HugeTable HT1 JOIN HugeTable HT2 ON .. 
  JOIN Table3 ON .. JOIN Table4 ON .. JOIN Table5 ON ..
  WHERE ...
  ORDER BY HT1..., HT1...

INSERT INTO SysLog (Description) VALUES ('A')

OPEN cData
BEGIN TRANSACTION ProcessData
  -- Currently trying new logging here:
  -- INSERT INTO SysLog (Description) VALUES ('B') 
  FETCH NEXT FROM cData INTO ...
  INSERT INTO SysLog (Description) VALUES ('C')
  ... etc.

where the last log message I get is 'A' and then one hour later it fails with the message described above, never reaching 'C'. I am now trying with logging at point 'B'.


On request I post the exact sql expression:

DECLARE cSource CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
    SELECT MD.sFieldName, 
     MD.sFieldValue, 
     TR.sTargetDataType,
     MD2.sFieldValue AS sUniqueID,
     TR.sTargetTableName,
     TR.sTargetFieldName,
     I.iRefCustomerID, 
     I.iInterfaceID, 
     IL.iRefInterfaceSessionID
    FROM MasterData MD
    JOIN MasterData MD2
     ON MD.iRowIndex = MD2.iRowIndex
     AND MD.iBatchNumber = MD2.iBatchNumber
     AND MD.sTableName = MD2.sTableName 
     AND MD2.sFieldName = 'sUniqueID'
    JOIN SourceTargetRelation TR
     ON MD.sFieldName = TR.sSourceFieldName
     AND MD.sTableName = TR.sSourceTableName
    JOIN InterfaceLog IL
     ON IL.iInterfaceLogID = MD.iBatchNumber
    JOIN Interface I
     ON I.iInterfaceID = IL.iRefInterfaceID
     AND TR.iRefSystemID = I.iRefSystemID
    WHERE
     MD.iBatchNumber = @iBatchNumber
    ORDER BY MD.sTableName, MD.iRowIndex


After the updated answer from Quassnoi, I also post the original index on the table:

I have a nonclustered index on this table with the columns iBatchNumber, sFieldName, sTableName, iRowIndex. And that index has sFieldValue as an included column.


As Quassnoi suggested (and I think I understand why now) I have changed the index to have the columns in this order: iBatchNumber, sTableName, iRowIndex, sFieldName. And I use sFieldValue as an included column. The execution plan does not contain any SORT anymore, and the number of steps in the execution plan is less than half of original, which I hope is also faster...

+5  A: 

Do any of you know the reason for this? Or how to make the query below below more efficient?

Your query uses ORDER BY.

This needs sorting and sorting needs temporary space. You are out of this space.

To avoid this, create a composite index on your huge table: (col_filter_1, col_filter_2, col_order_1, col_order_2), where col_filter_n are the columns you filter on, and col_order_n are the columns you order by.

Such an index can be used both for filtering and ordering the filtered results.

If you post your actual query (that is expressions you filter on and order by), I'll probably can tell you more exactly how to create such an index.

Update:

From your query, I can see that you need an index on (iBatchNumber, sTableName, iRowIndex, sFieldName) (in that order).

It may also help if you make MD2 leading in the join:

WHERE
    MD2.iBatchNumber = @iBatchNumber
ORDER BY
    MD2.sTableName, MD2.iRowIndex

See the execution plan and make sure that no SORT operation is used.

Quassnoi
Thanks for the comments. I added the exact query to the question.
Ole Lynge
Presently I have a nonclustered index on the same columns but in a different order: iBatchNumber, sFieldName, sTableName, iRowIndex. In addition the present index includes the column sFieldValue. Should I add the index you suggest, and then keep the existing or delete the existing?
Ole Lynge
Brilliant answer, I'd say. Thanks, it seems to help...
Ole Lynge
Your existing index didn't help because you didn't filter MD on sFieldName. If you change the WHERE condition as shown in the post, your existing index will work.
Quassnoi
A: 

Why are you using a cursor? Especially on a large table? What are you doing that can't be done set-based? Cursprs are extremely bad for performance and should not be used if another alternative exists. If you are inserteding to another table based on records found by your select that can be done much better without a cursor.

HLGEM
It's not the cursor that matters here, it's a not index friendly query.
Quassnoi
Thanks for the comment. I have not (yet) found a way to bend the multiple-line logic for each row into a single statement. Might want to look into that if I want to speed it up...
Ole Lynge