I'm not a DBA ("Good!", you'll be thinking in a moment.)
I have a table of logging data with these characteristics and usage patterns:
- A
datetime
column for storing log timestamps whose value is ever-increasing and mostly (but only mostly) unique - Frequent-ish inserts (say, a dozen a minute), only at the end of the timestamp range (new data being logged)
- Infrequent deletes, in bulk, from the beginning of the timestamp range (old data being cleared)
- No updates at all
- Frequent-ish selects using the timestamp column as the primary criterion, along with secondary criteria on other columns
- Infrequent selects using other columns as the criteria (and not including the timestamp column)
- A good amount of data, but nowhere near enough that I'm worried much about storage space
Additionally, there is currently a daily maintenance window during which I could do table optimization.
I frankly don't expect this table to challenge the server it's going to be on even if I mis-index it a bit, but nevertheless it seemed like a good opportunity to ask for some input on SQL Server clustered indexes.
I know that clustered indexes determine the storage of the actual table data (the data is stored in the leaf nodes of the index itself), and that non-clustered indexes are separate pointers into the data. So in query terms, a clustered index is going to be faster than a non-clustered index -- once we've found the index value, the data is right there. There are costs on insert and delete (and of course an update changing the clustered index column's value would be particularly costly).
But I read in this answer that deletes leave gaps that don't get cleaned up until/unless the index is rebuilt.
All of this suggests to me that I should:
- Put a clustered index on the timestamp column with a 100% fill-factor
- Put non-clustered indexes on any other column that may be used as a criterion in a query that doesn't also involve the clustered column (which may be any of them in my case)
- Schedule the bulk deletes to occur during the daily maintenance interval
- Schedule a rebuild of the clustered index to occur immediately after the bulk delete
- Relax and get out more
Am I wildly off base there? Do I need to frequently rebuild the index like that to avoid lots of wasted space? Are there other obvious (to a DBA) things I should be doing?
Thanks in advance.