views:

675

answers:

5

I've developed a couple of T-SQL stored procedures that iterate over a fair bit of data. The first one takes a couple of minutes to run over a year's worth of data which is fine for my purposes. The second one, which uses the same structure/algorithm, albeit over more data, takes two hours, which is unbearable.

I'm using SQL-Server and Query-Analyzer. Are there any profiling tools, and, if so, how do they work?

Alternatively, any thoughts on how improve the speed, based on the pseudo-code below? In short, I use a cursor to iterate over the data from a straight-forward SELECT (from a few joined tables). Then I build an INSERT statement based on the values and INSERT the result into another table. Some of the SELECTed variables require a bit of manipulation before INSERTion. The includes extracting some date parts from a date value, some basic float operations and some string concatenation.

--- Rough algorithm / pseudo-code

DECLARE <necessary variables>
DECLARE @cmd varchar(1000)
DECLARE @insert varchar(100) = 'INSERT INTO MyTable COL1, COL2, ... COLN, VALUES('

DECLARE MyCursor Cursor FOR
    SELECT <columns> FROM TABLE_1 t1
    INNER JOIN TABLE_2 t2 on t1.key = t2.foreignKey
    INNER JOIN TABLE_3 t3 on t2.key = t3.foreignKey

OPEN MyCursor
FETCH NEXT FROM MyCursor INTO @VAL1, @VAL2, ..., @VALn
WHILE @@FETCH_STATUS = 0
BEGIN
   @F = @VAL2 / 1.1  --- float op
   @S = @VAL3 + ' ' + @VAL1
   SET @cmd = @insert
   SET @cmd = @cmd + DATEPART(@VAL1) + ', '
   SET @cmd = @cmd + STR(@F) + ', '
   SET @cmd = @cmd + @S + ', '
   SET @cmd = @cmd + ')'
   EXEC (@cmd)
   FETCH NEXT FROM MyCursor @VAL1, @VAL2, ..., @VALn
END
CLOSE MyCursor
DEALLOCATE MyCursor
+11  A: 

The first thing to do - get rid of the cursor...

INSERT INTO MyTable COL1, COL2, ... , COLN
SELECT ...cols and manipulations...
FROM TABLE_1 t1
INNER JOIN TABLE_2 t2 on t1.key = t2.foreignKey
INNER JOIN TABLE_3 t3 on t2.key = t3.foreignKey

Most things should be possible direct in TSQL (it is hard to be definite without an example) - and you could consider a UDF for more complex operations.

Marc Gravell
+1 - Cursors are evil.
Otávio Décio
I'm getting the message loud and clear about cursors. Anyone care to explain what's so evil about them?
dave
In this case you don't need a cursor. The 'Cursors are evil' meme comes from the fact that cursor operations do things one row at a time. Often they can be re-cast as operations that work on whole sets of data in one go and are much more efficient, allowing the query optimiser to do work once for the whole set that it might have had to do for each row with a cursor. Most (although not all) of the time a cursor is unnecessary - t-sql is particularly good for this through its convenient temporary tables.
ConcernedOfTunbridgeWells
See here for why cursors are bad: http://stackoverflow.com/questions/58141/why-is-it-considered-bad-practice-to-use-cursors-in-sql-server. Note that they do have a place, you should just give serious consideration to whether they're right for a particular problem. If you find yourself answering yes more than 10% of the time you're either dealing in a slightly odd problem space or need to reevaluate your criteria for whether they're right.
Jon Hopkins
I agree with Tyranosaurus except I would say 1% of the time not 10%
HLGEM
There are a couple of good answers here, but this one was closest to the solution I implemented hence it gets the tick. Thanks to everyone who answered. My new code is much shorter, cleaner and above all faster. I procssed a whole month's data in < 30 seconds. Sweet.
dave
+5  A: 

Lose the cursor. Now. (See here for why: http://stackoverflow.com/questions/58141/why-is-it-considered-bad-practice-to-use-cursors-in-sql-server).

Without being rude you seem to be taking a procedural programmers approach to SQL which is pretty much always going to be sub-optimal.

If what you're doing is complex and you're not confident I'd do it in three steps:

1) Select of the core data into a temporary table using insert or select into.

2) Use update to do the manipulation - you may be able to do this just updating existing columns or you may need to have added a few extra ones in the right format when you create the temporary table. You can use multiple update statements to break it down further if you want.

3) Select it out into wherever you want it.

If you want to call it all as one step then you can then wrap the whole thing up into a stored procedure.

This makes it easy to debug and easy for someone else to work with if they need to. You can break your updates down into individual steps so you can quickly identify what's gone wrong where.

That said I don't believe that what you're doing can't be done in a single insert statement from the looks of it. It might not be attractive but I believe it could be done:

INSERT INTO NewTable
DATEPART(@VAL1) DateCol, 
@STR(@VAL2 / 1.1) FloatCol,
@VAL3 + ' ' + @VAL1 ConcatCol
FROM TABLE_1 t1
INNER JOIN TABLE_2 t2 on t1.key = t2.foreignKey
INNER JOIN TABLE_3 t3 on t2.key = t3.foreignKey

DateCol, FloatCol and ConcatCol are whatever names you want the columns to have. Although they're not needed it's best to assign them as (a) it makes it clearer what you're doing and (b) some languages struggle with unnamed columns (and handle it in a very unclear way).

Jon Hopkins
You're not being rude, it was a good call. I'm mostly a procedural programmer not an SQL expert. I'll see if I can follow your suggestion.
dave
+1 for proposing staging tables for complex work
Chris W
A: 

get rid of the cursor and dynamic sql:

INSERT INTO MyTable 
        (COL1, COL2, ... COLN)
    SELECT 
        <columns>
            ,DATEPART(@VAL1) AS DateCol
            ,@STR(@VAL2 / 1.1) AS FloatCol
            ,@VAL3 + ' ' + @VAL1 AS ConcatCol
        FROM TABLE_1        t1    
        INNER JOIN TABLE_2  t2 on t1.key = t2.foreignKey
        INNER JOIN TABLE_3  t3 on t2.key = t3.foreignKey
KM
A: 

Are there any profiling tools, and, if so, how do they work?

To answer your question regarding query tuning tools, you can use TOAD for SQL Server to assist in query tuning.

I really like this tool as it will run your SQL statement something like 20 different ways and compare execution plans for you to determine the best one. Sometimes I'm amazed at what it does to optimize my statements, and it works quite well.

More importantly, I've used it to become a better t-sql writer as I use the tips on future scripts that I write. I don't know how TOAD would work with this script because as others have mentioned it uses a cursor, and I don't use them so have never tried to optimize one.

TOAD is a huge toolbox of SQL Server functionality, and query optimization is only a small part. Incidentally, I am not affiliated with Quest Software in any way.

DavidStein
A: 

SQl Server also comes with a profiling tool called SQL Server Profiler. It's the first pick on the menu under Tools in SSMS.

HLGEM