I have one job with around 100K records to process. This job truncates the destination tables and then inserts all the records "one at a time" (not as a batch insert) into these tables.

I need to know how the indexes will be affected as these records are inserted. Will the cost of maintaining the indexes during the job outweigh the benefit of having them?

Are there any best practices or optimization hints for such a situation?

+1  A: 

The answer to this question is very different depending on whether the indexes you're talking about are clustered or not. Clustered indexes force SQL Server to store data in sorted order, so if you try to insert a record that doesn't sort to the end of your clustered index, your insert can result in significant reshuffling of your data, as many of your records are moved to make room for the new record.

Nonclustered indexes don't have this problem; all the server has to do is keep track of where the new record is stored. So if your index is clustered (most clustered indexes are primary keys, but this is not required; run "sp_helpindex [TABLENAME]" to find out for sure), you would almost certainly be better off adding the index after all of your inserts are done.
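
For example, either of the following will show whether a table's indexes are clustered (the table name here is just a placeholder):

    -- Report all indexes on a table, including whether each is clustered.
    EXEC sp_helpindex 'dbo.DestinationTable';

    -- Equivalent catalog query; type_desc reads CLUSTERED or NONCLUSTERED.
    SELECT i.name, i.type_desc, i.is_primary_key
    FROM sys.indexes AS i
    WHERE i.object_id = OBJECT_ID('dbo.DestinationTable')
      AND i.type > 0;  -- skip the heap entry (type 0)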

As to the performance of inserts with nonclustered indexes in place, I can't give you hard numbers; in my experience, the slowdown hasn't been enough to worry about. The index overhead in this case would be vastly outweighed by the overhead of doing all your inserts one at a time.
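
To illustrate, here is a minimal sketch of the row-at-a-time pattern (table and column names are hypothetical); wrapping the loop in a single explicit transaction is one cheap way to reduce the per-row commit overhead:

    -- Hypothetical row-by-row load; one transaction around the loop
    -- avoids committing (and flushing the log) once per row.
    BEGIN TRANSACTION;

    DECLARE @i INT;
    SET @i = 1;
    WHILE @i <= 100000
    BEGIN
        INSERT INTO dbo.DestinationTable (Id, Payload)
        VALUES (@i, 'row ' + CAST(@i AS VARCHAR(12)));
        SET @i = @i + 1;
    END;

    COMMIT TRANSACTION;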

Edit: Since you have the luxury of truncating your entire table, performance-wise you're almost certainly better off dropping (or disabling) your indexes and NOCHECKing your constraints before doing all of your inserts, then adding them back in at the end.
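
A minimal sketch of that pattern, assuming a hypothetical nonclustered index IX_Dest_Foo and a foreign key FK_Dest_Bar defined on the table (note that disabling a clustered index makes the table inaccessible, so only disable nonclustered ones):

    -- Stop validating the foreign key during the load.
    ALTER TABLE dbo.DestinationTable NOCHECK CONSTRAINT FK_Dest_Bar;

    -- Disable the nonclustered index so it isn't maintained per insert.
    ALTER INDEX IX_Dest_Foo ON dbo.DestinationTable DISABLE;

    TRUNCATE TABLE dbo.DestinationTable;
    -- ... run the inserts here ...

    -- Rebuild the index, then re-enable and re-validate the constraint.
    ALTER INDEX IX_Dest_Foo ON dbo.DestinationTable REBUILD;
    ALTER TABLE dbo.DestinationTable WITH CHECK CHECK CONSTRAINT FK_Dest_Bar;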

kcrumley
+1  A: 

This kind of question can only be answered on a case-by-case basis. However, the following general considerations may help:

  • Unless some of the data for the inserts comes from additional lookups and such, no index is useful during the INSERT itself (the indexes may, of course, be useful to other queries running in other sessions).
  • On the other hand, the presence of indexes on a table slows down INSERT (and, more generally, UPDATE and DELETE) operations.
  • The order in which the new records are added may matter.
  • Special consideration is needed if the table has a clustered index.
  • Deciding whether to drop indexes (all of them or some of them) prior to the INSERT operation depends largely on the relative number of records (added vs. already in the table).
  • INSERT operations often introduce index fragmentation, which is in and of itself an additional incentive to drop the index(es) prior to the data load and then rebuild them afterwards (see the sketch below).
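
For example, one quick way to gauge fragmentation after the load and rebuild what needs it (the table name is a placeholder):

    -- Measure fragmentation of each index on the (hypothetical) table.
    SELECT i.name, ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID('dbo.DestinationTable'),
             NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
      ON i.object_id = ips.object_id
     AND i.index_id = ips.index_id;

    -- Rebuild everything that came out heavily fragmented.
    ALTER INDEX ALL ON dbo.DestinationTable REBUILD;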

In general, adding 100,000 records is "small potatoes" for MS SQL Server, and barring a particular situation such as unusually wide records or the presence of many (and possibly poorly defined) constraints of various kinds, SQL Server should handle this load in minutes rather than hours on almost any hardware configuration.

mjv