views: 42

answers: 3

Hi, I'm often dealing with interfaces between two systems for data import or data export, so I'm writing T-SQL procedures for them. It's often necessary to use some variables inside the procedure to hold values or single records. Most recently I set up some temp tables, e.g. one named #tmpGlobals and another named #tmpOutput. The names don't matter, but the point is that I eliminated declarations like DECLARE @MainID int.
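
Something like this, just to illustrate the pattern (the table dbo.ImportHeader and the columns here are only example names):

    -- instead of: DECLARE @MainID int; ...
    CREATE TABLE #tmpGlobals (MainID int, ImportDate datetime);

    INSERT INTO #tmpGlobals (MainID, ImportDate)
    SELECT MAX(MainID), GETDATE()
    FROM dbo.ImportHeader;            -- dbo.ImportHeader is only an example table

    -- later in the procedure, read the value back when it is needed
    SELECT MainID FROM #tmpGlobals;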

Is this a good idea? Is it a performance issue?

A: 

It really depends on the amount of data. If you're working with under 100 records or so, then DECLARE @MainID or the like is probably better, since it's a small amount of data. For anything over 100 records, though, you should definitely use #tmpGlobals or similar, since it's better for memory management on the SQL Server.

EDIT: It's not bad to use #tmpGlobals for smaller sets; there's just not much of a performance loss or gain compared to DECLARE @MainID. You will see a performance gain from using #tmpGlobals instead of DECLARE @MainID on a high number of records.
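
As a rough sketch of what I mean (dbo.SourceTable and its columns are made-up names):

    -- a single value: a scalar variable is enough
    DECLARE @MainID int;
    SELECT @MainID = MAX(MainID) FROM dbo.SourceTable;

    -- a larger intermediate result set: a temp table scales better
    CREATE TABLE #tmpGlobals (MainID int, Amount decimal(18,2));
    INSERT INTO #tmpGlobals (MainID, Amount)
    SELECT MainID, Amount
    FROM dbo.SourceTable
    WHERE ExportFlag = 1;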

Alexander Kahoun
I noticed a big performance hit when using @variables in a cursor loop. Changing it to a solution without declared @variables made execution a factor of 10 faster... but this was on SQL 2000
Ice
In that scenario, cursors are the culprit. Cursors are much slower by nature.
meklarian
meklarian, that is quite a generalization. You should ask Itzik Ben-Gan about some of his cursor-based solutions to realistic problems which outperform set-based solutions by orders of magnitude.
Aaron Bertrand
A: 

In general, you should choose the reverse if possible (plain variables rather than temp tables). It depends on whether you need to store a set of items or just single result values.

Scoped variables, aside from table variables, are relatively cheap. Values that fit into typed, non-table variables operate faster than the same values stored as single rows in a table.

Table variables and temp tables tend to be quite expensive. They may require space in tempdb and also offer no optimizations by default. In addition, table variables should be avoided for large sets of data. When processing large sets, you can apply indexes and define primary keys on temp tables if you wish, but you cannot do this for table variables. Finally, temp tables need cleanup before exiting scope.
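
For example, something along these lines (the object names are only illustrative):

    -- temp table: can be indexed and altered after creation
    CREATE TABLE #tmpOutput (MainID int NOT NULL, Amount decimal(18,2));
    CREATE CLUSTERED INDEX IX_tmpOutput ON #tmpOutput (MainID);
    ALTER TABLE #tmpOutput ADD ExportFlag bit NULL;

    -- table variable: only constraints declared inline at creation time
    DECLARE @Output TABLE (MainID int NOT NULL PRIMARY KEY, Amount decimal(18,2));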

For parameters, table variables are useful for return sets from functions. Temp tables cannot be returned from functions. Depending on the task at hand, use of functions may make it easier to encapsulate specific portions of work. You may find that some stored procedures are doing work that is better suited to functions, especially if you are reusing but not modifying the results.
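
A sketch of what that can look like (dbo.fn_GetExportRows and dbo.SourceTable are made-up names):

    -- multi-statement table-valued function returning a table variable
    CREATE FUNCTION dbo.fn_GetExportRows (@OnlyFlagged bit)
    RETURNS @Result TABLE (MainID int NOT NULL, Amount decimal(18,2))
    AS
    BEGIN
        INSERT INTO @Result (MainID, Amount)
        SELECT MainID, Amount
        FROM dbo.SourceTable              -- made-up source table
        WHERE ExportFlag = @OnlyFlagged;
        RETURN;
    END;
    GO

    -- callers can treat the result like a table
    SELECT MainID, Amount FROM dbo.fn_GetExportRows(1);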

Finally, if you just need one-time storage of results in the middle of stored-procedure work, try CTEs. These usually beat out both table variables and temp tables, as SQL Server can make better decisions on how to store this data for you. Also, as a matter of syntax, they may make your declarations more legible.

Using Common-Table Expressions @ MSDN
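
For instance (the table and column names are placeholders):

    ;WITH Totals AS
    (
        SELECT MainID, SUM(Amount) AS TotalAmount
        FROM dbo.SourceTable              -- placeholder table
        GROUP BY MainID
    )
    UPDATE t
       SET t.ExportTotal = c.TotalAmount
      FROM dbo.TargetTable AS t           -- placeholder table
      JOIN Totals AS c ON c.MainID = t.MainID;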

edit: (regarding temp tables)

Local temp tables go away when the query session ends, which can be an indeterminate amount of time in the future. Global temp tables don't go away until the connection is closed and no other users are using the table, which can be even longer. In either case, it is best to drop temp tables once they are no longer needed, before exiting the procedure, to avoid tying up memory and other resources.
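
A typical cleanup pattern, reusing the #tmpOutput name from the question (just a sketch):

    -- at the end of the procedure, or before re-creating the table
    IF OBJECT_ID('tempdb..#tmpOutput') IS NOT NULL
        DROP TABLE #tmpOutput;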

CTEs can be used to avert this in many cases, because they are only local to the statement in which they are declared; they are cleaned up automatically as soon as that statement completes, without waiting for the stored procedure or function to exit.

meklarian
"Finally, temp tables need cleanup before exiting scope": They don't 'go away' by themselves ?
Ice
They expire once the session does, but if you're testing in Management Studio you'd better get used to placing DROP TABLE #tblTemp as soon as you're done with the temp table in the code. It's a good habit to clean them up anyway :)
CodeByMoonlight
I agree that it's a good habit to clean them up. Though neither DROP TABLE #foo nor existing scope destroys the #temp table immediately; there is a background process that comes around and removes any tables that are marked for destruction. There are actually perfmon counters that allow you to see your rate of #table creation / destruction.
Aaron Bertrand
Oops, Aaron's post implies that if a stored proc uses many #temptables and is executed very often, within a short time there are hundreds of #temptables waiting to be destroyed by this background process. Is this something like a performance killer? Should one define ##temptables instead, more globally, to avoid this side effect? You'd have to truncate it at the start of the proc if it exists...
Ice
Using a global temptable (##temptable) is worse. There are two ways of getting to hundreds of #temptables: 1. making hundreds of temp tables in 1 stored procedure (unlikely); 2. repeatedly calling a stored procedure that creates just 1 temp table (likely). If you use a global temp table, you have to guard against multiple connections competing for access and a myriad of other concurrency bugs. In any case, the idea is to optimize the best you can, and if you must use a temp table, #2 is fine.
meklarian
Ice, this *can* be a performance killer, though on current builds of SQL Server it is unlikely. [I actually uncovered a bug in SQL Server 2005 where essentially the engine gave up trying to dispose of these temp tables, eventually leading to having to restart our main production cluster weekly to clear out tempdb. I was issued a private hotfix but this later appeared in a pre-SP3 cumulative update (and as far as I can tell has also been fixed in SQL Server 2008).]
Aaron Bertrand
+1  A: 

As Alexander suggests, it really depends. I won't draw hard lines in the sand about number of rows, because it can also depend on the data types and hence the size of each row. Where one will make more sense than the other in your environment can depend on several factors aside from just the size of the data, including access patterns, sensitivity of performance to recompiles, your hardware, etc.

There is a common misconception that @table variables are only in memory, do not incur I/O, do not use tempdb, etc. While in certain isolated cases some of this is true, it is not something you can or should rely on.

Some other limitations of @table variables that may prevent your use of them, even for small data sets (a short sketch follows the list):

  • cannot index (other than primary key / unique constraint declarations on creation)
  • no statistics are maintained (unlike #temp tables)
  • cannot ALTER
  • cannot use as INSERT EXEC target in SQL Server 2005 (this restriction was lifted in 2008)
  • cannot use as SELECT INTO target
  • cannot truncate
  • can't use an alias type in definition
  • no parallelism
  • not visible to nested procs (unlike #temp tables)
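
A quick sketch of a couple of these differences (object names are made up):

    -- inline PRIMARY KEY at declaration is allowed on a table variable
    DECLARE @t TABLE (MainID int NOT NULL PRIMARY KEY);

    -- but these would all fail against @t:
    -- CREATE INDEX IX_t ON @t (MainID);   -- cannot index after creation
    -- ALTER TABLE @t ADD Amount int;      -- cannot ALTER
    -- TRUNCATE TABLE @t;                  -- cannot truncate

    -- the same operations are fine on a #temp table
    CREATE TABLE #t (MainID int NOT NULL);
    CREATE INDEX IX_t ON #t (MainID);
    ALTER TABLE #t ADD Amount int NULL;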
Aaron Bertrand
If I have millions of rows in a table and have to select some hundreds of thousands of them to update or delete, I use temp tables to store the intermediate result, just to avoid a huge lock on the whole table. Is this a good strategy?
Ice
To expand your list for @table variables: no transaction support. But this also applies to #temptables.
Ice
Ice, I was hoping to avoid the inevitable response, but: it depends. You should test the impact of doing both and see how badly a straight select affects concurrency compared with the I/O costs of dumping that data into a #temp table. I have seen situations where one is better than the other, and vice versa, depending on too many factors to list in a comment. As for transactions, right, I was trying to point out what @table variables lack that #temp tables have, since number of rows is not necessarily the only factor for deciding which to use.
Aaron Bertrand