views:

236

answers:

4

I'm looking at educating a team on writing better SQL Server queries and was wondering what people's best hints were for improving performance.

For instance I once had a DBA who insisted that count(*) would perform worse than count(1) (I have no idea whether she was right or whether it's still valid against the latest query optimisers).

What simple things should I be telling the team to try and always use or avoid? I'm ideally looking for things which (a) might make a reasonable difference and (b) are straight forward, 1 - 2 lines to state.

+2  A: 

Best hint: Use SQL Server 2008 and run the Activity Monitor while your tests are running. Note the queries that take longest / have the most I/O, etc. Right-click those queries to view the query and/or the execution plan.

Next: learn to understand execution plans.

Next: Use the database tuning wizard.

These steps will help you generate your own "best hints".

John Saunders
+5  A: 

Query tuning 101

There is no magic silver bullet to query tuning, although I can give you some hints and tips. The first thing to do is to understand what's actually going on behind the scenes. Get a good internals book like the third Guru's Guide book.

Poorly performing queries tend to come in two basic flavours: Transactional queries that take too long, and grinding batch jobs (or reports) that take too long. One good sign of a query with something wrong with it is a single item in the query plan taking 99% of the time.

Transactional queries

On most occasions a poorly performing transactional query is one of a few things:

  • A missing index. You can see this in the query plan - table scans of large tables on a join that should be very selective (i.e. return few rows).

  • Query unable to use an index. If you have OR conditions in the where clause, joins on a calculated value or some other item in the query that is not sarg-able then you may need to re-write the query. Briefly, sargs are query predicates that can use indexes to eliminate rows. And, equality and inequality are all sarg-able. OR is traditionally not sarg-able. However, you can often translate OR's into sarg-able predicates by inverting the sense from OR to NOT (foo and not bar) type constructs.

  • Inefficient predicates. For example, if you have a where in referencing a nested subquery see if it can be re-written as where exists. This may result in more efficient query plans. There are others. Again, the Guru's guide books and others on the subject are a good starting point.

Batch queries

Batch queries are more complicated and have different tuning issues. Some tips are:

  • Indexing. This can make a large difference for the same reason it does with transactional queries.

  • Temporary tables. You may find it better to break down a query into several queries populating temporary tables. Larger queries give the optimiser more room to screw up, although this is less of an issue that it used to be. Make the temp tables with select into and this operation becomes minimally logged, which reduces the I/O load.

    Note that temporary tables in tempdb are the same data structure that the optimiser uses to store intermediate join results, so there is no performance penalty for doing this. You can also create an index on a temp table, which may improve performance of the queries reading it.

    Don't overdo temp tables though, as they can make things harder to trace back through the query. For smaller tables within a stored procedure, test to see if table variables help. These are an in-memory data structure, so they can be a performance win.

  • Clustered indexes. This can improve the performance of a query as it forces locality based on some grouping column. A clustered index may make a large difference to the performance of a batch job.

  • Inefficient predicates. These can cause problems with sargs and other sub-optimisation isses in much the same way as they do with transactional queries.

  • Table scan is your friend. Contrary to popular belief, table scans are not inherently evil. Generally they are a sign of something wrong in a transactional query, but they are often the most efficient way to do a large batch operation. If you are doing something with more than a few percent of rows in a table, a table scan is often the most efficient way to cover the table.

  • Nested loops joins. Take a look at what the optimiser is doing on both sides of the join. These can be inefficient if you are (for example0 table scanning two large tables on both sides of a nested loops join. Consider using clustered indexes or order by and trying to change the operation to a merge join or hinting to promote a hash join if one side is small enough to do this with.

Locking

Locking can also cause performance issues. If your system is performing badly under load, look at the profiler and perfmon counters related to locks and check whether there is any significant contention. sp_who2 has a 'BlkBy' column in the result set that will show if a query is blocked and what is blocking it. Also, profiles with 'deadlock graph' events (if you have queries deadlocking) and lock related events can be useful to troubleshoot lock issues.

ConcernedOfTunbridgeWells
+1 as this is some great information on performance tuning (I have had the pleasure of being in Kalen's classes. She does know what she is on about!). You could just add some information on dynamic views.
Wayne
+1  A: 

Hi,

An excellent Freely available ebook from RedGate regarding how to work with and understand SQL Server Execution Plans

http://www.red-gate.com/specials/Grant.htm?utm_content=Grant080623

Shameless Plug, I reference performance tuning materials on my blog under SQL Server Performance.

Once you have had an opporunity to digest some of this material, please feel free to either post here or contact me directly with specific questions.

John Sansom
+1  A: 

First, indexing. Many people dont realize that foreign keys do not automatically get indexes. Since they are used in joins they almost always should have an index.

Closely examine all cursors to see if they can be replaced by set-based code instead. I have changed code that ran for hours to seconds by doing this.

Avoid subqueries. If you have them in code replace them with joins or joins to derived tables.

Make sure your where clause is sargeable.

Learn to read execution plans.

Make sure the office has a couple of good books on performance tuning.

Table variables are better than temp tables in some instances and temp tables perform better in others, If you need to use them, try both and see which works better in that particular case.

HLGEM