I have a table that logs site visits, and it has over 100,000 records. There don't seem to be any performance issues, but should records in large log-type tables be regularly moved to an archive table and cleared out of the current table?

+6  A: 

Yes, they should. And the archive table should be placed on an archive filegroup that can be located on a slower, archival disk.

There is a fast, high-performance way to do this with no data copying, using partitioning. You switch the partition out of the 'current' table and into the 'archive' table. See How to Implement an Automatic Sliding Window in a Partitioned Table on SQL Server 2005. But for a mere 100k records it may be unnecessary overkill.

Remus Rusanu
But partitioning is only available on the Enterprise editions, correct?
OMG Ponies
@Rexem: True. Should be mentioned nonetheless imho. When the volume is so high that the partitioned approach is required, EE should be deployed anyway (for this and other important EE-only features).
Remus Rusanu
@Remus: Agreed, just making sure about support
OMG Ponies
Enterprise for deployment, and it is also available in Dev edition, so you can develop against it prior to deployment without incurring a huge dev license cost.
Andrew
@Dev: Dev edition is basically EE with a different license. Anything that is 'EE only' really means 'EE, Developer and Trial editions only'.
Remus Rusanu
+1  A: 

Yes! A log table should regularly be emptied of its older entries, and these should be moved to an "archive" table. Both tables should have the same structure but not the same list of indexes. An interesting schedule is one where at night, during low traffic, the events of the day are copied to the archive (but remain in the log), and once weekly, the events older than, say, 2 weeks are removed from the log. With fewer indexes, INSERTs into the log table are also faster, so the application(s) producing events are not penalized.

The advantage of this approach is that the log table can be kept small and with fewer indexes (if any). It is suitable for ad-hoc queries pertaining to events in the last week and up to the events happening in real time.

The archive table is suitable for any query, in particular deeper "mining", aggregation and such, for all events, except those in the last 0 to 24 hours.

mjv