What techniques can be applied effectively to improve the performance of SQL queries? Are there any general rules that apply?
In Oracle you can look at the explain plan to compare variations on your query.
- Use primary keys
- Avoid select *
- Be as specific as you can when building your conditional statements
- De-normalisation can often be more efficient
- Table variables and temporary tables (where available) will often be better than using a large source table
- Partitioned views
- Employ indices and constraints
Make sure that you have the right indexes on the table. If you frequently use a column to order or limit your dataset, an index can make a big difference. I saw in a recent article that SELECT DISTINCT can really slow down a query, especially if you have no index.
The obvious optimization for SELECT queries is ensuring you have indexes on columns used for joins or in WHERE clauses.
Since adding indexes can slow down data writes, you do need to monitor performance to ensure you don't kill the DB's write performance, but that's where a good query analysis tool can help you balance things accordingly.
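To make the index advice concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever engine you actually run. The `orders` table, the `idx_orders_customer` index and the `plan_for` helper are made up for illustration. It asks the engine for its query plan before and after adding an index on the column used in the WHERE clause.

```python
import sqlite3

def plan_for(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each row.
    return " | ".join(row[3] for row in rows)

# Hypothetical table, populated with throwaway data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index the engine has to scan the whole table.
plan_before = plan_for(conn, query, (42,))

# Index the column used in the WHERE clause and ask again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_for(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should describe a scan of `orders` and the second a search using `idx_orders_customer`. Other engines expose the same idea through their own tools (the explain plan in Oracle, the execution plan in SQL Server).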
The biggest thing you can do is to look for table scans in SQL Server Query Analyzer (make sure you turn on "show execution plan"). Beyond that, there are a myriad of articles on MSDN and elsewhere that give good advice.
As an aside, when I started learning to optimize queries I ran SQL Server query profiler against a trace, looked at the generated SQL, and tried to figure out why that was an improvement. Query profiler is far from optimal, but it's a decent start.
Learn what's really going on under the hood - you should be able to understand the following concepts in detail:
- Indexes (not just what they are but actually how they work).
- Clustered indexes vs heap allocated tables.
- Text and binary lookups and when they can be in-lined.
- Fill factor.
- How records are ghosted for update/delete.
- When page splits happen and why.
- Statistics, and how they affect various query speeds.
- The query planner, and how it works for your specific database (for instance on some systems "select *" is slow, on modern MS-Sql DBs the planner can handle it).
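As a small illustration of the statistics point, here is a sketch that again uses SQLite through Python's sqlite3 module as a stand-in; SQL Server and Oracle store and refresh statistics differently. The `events` table and both index names are made up. After `ANALYZE`, SQLite records per-index numbers in its `sqlite_stat1` table, and the planner uses them to judge how selective each index is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: 'kind' has only two distinct values, 'user_id' is unique.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, user_id INTEGER)")
conn.execute("CREATE INDEX idx_events_kind ON events (kind)")
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
conn.executemany(
    "INSERT INTO events (kind, user_id) VALUES (?, ?)",
    [("click" if i % 10 else "purchase", i) for i in range(1000)],
)

# Gather statistics; SQLite writes them to the sqlite_stat1 table.
conn.execute("ANALYZE")

# Map each index name to its statistics string.
stats = {
    idx: stat
    for _tbl, idx, stat in conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1")
}

for name, stat in sorted(stats.items()):
    # First number: rows in the index. Second: average rows per distinct value.
    print(name, "->", stat)
```

A low second number means the index narrows a lookup to very few rows, so the planner will tend to prefer it. A high one, as for `idx_events_kind` here, means the index is much less selective.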
There are a couple of things you can look at to optimize your query performance.
- Ensure that you work with the minimum of data. Select only the columns you need, and keep field sizes as small as the data allows.
- Consider de-normalising your database to reduce joins.
- Avoid loops (i.e. fetch cursors); stick to set operations.
- Consider implementing the query as a stored procedure: its execution plan is cached and reused, which can avoid repeated compilation.
- Make sure that you have the correct indexes set up. If your database is used mostly for searching, then consider more indexes.
- Use the execution plan to see how the processing is done. What you want to avoid is a table scan, as this is costly.
- Make sure that Auto Statistics is set to on. SQL Server needs this to help decide the optimal execution plan. See Mike Gunderloy's great post "Basics of Statistics in SQL Server 2005" for more info.
- Make sure your indexes are not fragmented. See "Reducing SQL Server Index Fragmentation".
- Make sure your tables are not fragmented. See "How to Detect Table Fragmentation in SQL Server 2000 and 2005".
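To show what "avoid loops, stick to set operations" means in practice, here is a rough sketch in Python with sqlite3. The `products` table and both function names are invented for the example. Both functions apply the same 10% price increase to the same rows; the first issues one UPDATE per row the way a cursor loop would, the second lets the engine do it in a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with 500 'book' rows and 500 'toy' rows.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (category, price) VALUES (?, ?)",
    [("book" if i % 2 else "toy", 10.0) for i in range(1000)],
)

def raise_prices_row_by_row(conn):
    """Cursor-style approach: fetch the ids, then update one row at a time."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM products WHERE category = 'book'")]
    for product_id in ids:
        conn.execute(
            "UPDATE products SET price = price * 1.1 WHERE id = ?", (product_id,))
    return len(ids)

def raise_prices_set_based(conn):
    """Set-based approach: one statement describes the whole change."""
    cur = conn.execute(
        "UPDATE products SET price = price * 1.1 WHERE category = 'book'")
    return cur.rowcount

loop_count = raise_prices_row_by_row(conn)
set_count = raise_prices_set_based(conn)
print(loop_count, set_count)
```

Both calls touch the same rows, but the set-based version is a single statement that the planner can optimise as a whole, while the loop pays the per-statement overhead once per row (and, against a real server, usually a network round trip as well).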