How should I manage tables that record site 'events', i.e. activities a user has performed on a website, which I store for tracking? I want to be able to do all kinds of data mining and correlation between the different activities of users.
Today alone I added 107,000 rows to my SiteEvent table. I don't think this is sustainable!
The database is SQL Server. I'm mainly asking about best practices for managing large amounts of data.
For instance:
- Should I keep these tables in a database of their own? If I need to join with other tables this could be a problem. Currently I have just one database with everything in it.
- How should I purge old records? I want to ensure my database file doesn't keep growing.
- What are the best practices for backing up and truncating logs?
- Will adding additional indexes dramatically increase the size of the DB with so many records?
- Are there any other things I need to do in SQL Server that might come back to bite me later?
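To make the purge question concrete, this is roughly the kind of thing I have in mind. It is only a sketch: the 90-day retention window and the batch size of 5000 are numbers I made up for illustration, and I don't know whether batching like this is actually the recommended approach.

```sql
-- Sketch only: delete old SiteEvent rows in small batches so that each
-- DELETE is a short transaction rather than one huge one.
DECLARE @Cutoff datetime;
DECLARE @BatchSize int;
DECLARE @Rows int;

SET @Cutoff = DATEADD(DAY, -90, GETDATE());  -- assumed retention: 90 days
SET @BatchSize = 5000;                       -- assumed batch size
SET @Rows = 1;

WHILE @Rows > 0
BEGIN
    -- [Date] is nullable in SiteEvent; rows with a NULL date never match
    -- this predicate and so are never purged by this loop.
    DELETE TOP (@BatchSize)
    FROM dbo.SiteEvent
    WHERE [Date] < @Cutoff;

    -- Stop once a pass deletes nothing.
    SET @Rows = @@ROWCOUNT;
END
```

I assume something like this would need an index on [Date] to avoid scanning the whole table on every pass, which ties back to my question about index size.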
FYI, these are the tables:
CREATE TABLE [dbo].[SiteEvent](
[SiteEventId] [int] IDENTITY(1,1) NOT NULL,
[SiteEventTypeId] [int] NOT NULL,
[SiteVisitId] [int] NOT NULL,
[SiteId] [int] NOT NULL,
[Date] [datetime] NULL,
[Data] [varchar](255) NULL,
[Data2] [varchar](255) NULL,
[Duration] [int] NULL,
[StageSize] [varchar](10) NULL,
and
CREATE TABLE [dbo].[SiteVisit](
[SiteVisitId] [int] IDENTITY(1,1) NOT NULL,
[SiteUserId] [int] NULL,
[ClientGUID] [uniqueidentifier] ROWGUIDCOL NULL CONSTRAINT [DF_SiteVisit_ClientGUID] DEFAULT (newid()),
[ServerGUID] [uniqueidentifier] NULL,
[UserGUID] [uniqueidentifier] NULL,
[SiteId] [int] NOT NULL,
[EntryURL] [varchar](100) NULL,
[CampaignId] [varchar](50) NULL,
[Date] [datetime] NOT NULL,
[Cookie] [varchar](50) NULL,
[UserAgent] [varchar](255) NULL,
[Platform] [int] NULL,
[Referer] [varchar](255) NULL,
[RegisteredReferer] [int] NULL,
[FlashVersion] [varchar](20) NULL,
[SiteURL] [varchar](100) NULL,
[Email] [varchar](50) NULL,
[FlexSWZVersion] [varchar](20) NULL,
[HostAddress] [varchar](20) NULL,
[HostName] [varchar](100) NULL,
[InitialStageSize] [varchar](20) NULL,
[OrderId] [varchar](50) NULL,
[ScreenResolution] [varchar](50) NULL,
[TotalTimeOnSite] [int] NULL,
[CumulativeVisitCount] [int] NULL CONSTRAINT [DF_SiteVisit_CumulativeVisitCount] DEFAULT ((0)),
[ContentActivatedTime] [int] NULL CONSTRAINT [DF_SiteVisit_ContentActivatedTime] DEFAULT ((0)),
[ContentCompleteTime] [int] NULL,
[MasterVersion] [int] NULL CONSTRAINT [DF_SiteVisit_MasterVersion] DEFAULT ((0)),