views:

79

answers:

3

I'm reading up on indexes for optimizing database performance and would like to gather best practices for various database situations in terms of applying indexes.

Take a totally non-indexed database with non-trivial amount of tables and rows, what rules do you use in determining what index is applied to what column of what type to achieve maximal performance? What query analyzing tricks to do use?

As a start I got: (from http://asptutorials.net/SQL-Server/tutorial-on-indexes/)

  • For main or "header" tables such as a table of invoices, make a clustered index on the Primary Key of the table.

  • For secondary or "details" tables such as "invoice_row", make a clustered index on the foreign key that groups the child records together (which in this example is "invoice_id"). This is because the majority of queries on the invoice_row table will be made in the invoice_id order rather than the invoice_row_id order.

  • For all tables, make a non-clustered index on each of the foreign keys of the table. Don't concern yourself with covering indexes at this point. Think about the selects queries that you will be performing on the table. What sort of "where" and "order by" statements will you be using? Make a non-clustered index on these columns.

  • It is now time to start timing your typical queries and look for any slow ones. If you identify a particularly slow one, see if there is a way that you can add extra non-key columns to an index so that it becomes a covering index for that query.

What other "rule-of-thumbs" can we collect? What tools to use?

+1  A: 

Once you have defined you primary and foreign key indexes, you need to analyze the transactional activity of your database in order to determine an appropriate indexing strategy.

SQL Server actually suggests indexes that you may wish to incorporate into your database via a number of ways:

  1. The missing Indexes Dynamic Management Views. Suggests indexes that may be missing from your databases by reviewing the execution plans that have been generated on your platform that could have benefited from the existence of an index.
  2. The Database Engine Tuning Advisor. Create a SQL Server Trace to create a workload file and then run it through the DTA in order to generate recommendations.

I would suggest you perform testing to simulate the transactional workload you expect in order to accurately design and tune your database.

John Sansom
+1  A: 

Number 1 -- analyse your SQL.

You need to look at the SQL being thrown at the DB to work out which indexes could possibly be useful. You cannot work out what is required just by looking at the plan analyzer.

Number 2 -- Table space scans are very often good things! If the table is "small" or you are accessing 30% or more of the rows then a scan is often the most efficient access path.

Number 3 -- Indexing "small" tables is pointless.

Number 4 -- Indexes are useful for retrieval too. Tack commonly retreived values to the back of an index -- often the DB will just pick values from the index and not bother retreiving the actual row. e.g. if a common SQL is "select max(item_no) where invoice = ?" if you build an index with invoice *and item_no* then the query can be satisfied without any access to the actual table.

The current value of "small" is approx 2000 rows.

Always remember there is a severe performance penalty on INSERT for every index you build.

James Anderson
+2  A: 

I try to create the index I need and I use the following script in the test or production environment to tune the indexes further. I also have a couple of other procedures as well, but this one gives a good start. Remember that you have to select the best order of the columns by yourself.

This question is similar to this question where i have posted some other stored procedures that i use to tune the indexes.

CREATE PROCEDURE [ADMIN].[spMissingIndexes]
AS
SELECT
      mid.statement,
      mid.equality_columns,
      mid.inequality_columns,
      mid.included_columns,
      migs.user_seeks,
      migs.user_scans,
      migs.last_user_seek,
      migs.avg_user_impact,
      user_scans,
      avg_total_user_cost,
      avg_total_user_cost * avg_user_impact * (user_seeks + user_scans) AS [weight]
FROM
      sys.dm_db_missing_index_group_stats AS migs
      INNER JOIN sys.dm_db_missing_index_groups AS mig
         ON (migs.group_handle = mig.index_group_handle)
      INNER JOIN sys.dm_db_missing_index_details AS mid
         ON (mig.index_handle = mid.index_handle)   
ORDER BY
      avg_total_user_cost * avg_user_impact * (user_seeks + user_scans) DESC ;
GO
Hakan Winther