views:

70

answers:

4

I have a "history" table where I log each request into a Web Handler on our web site. Here is the table definition:

/****** Object:  Table [dbo].[HistoryRequest]    Script Date: 10/09/2009 17:18:02 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[HistoryRequest](
    [HistoryRequestID] [uniqueidentifier] NOT NULL,
    [CampaignID] [int] NOT NULL,
    [UrlReferrer] [nvarchar](512) NOT NULL,
    [UserAgent] [nvarchar](512) NOT NULL,
    [UserHostAddress] [nvarchar](15) NOT NULL,
    [UserHostName] [nvarchar](512) NOT NULL,
    [HttpBrowserCapabilities] [xml] NOT NULL,
    [Created] [datetime] NOT NULL,
    [CreatedBy] [nvarchar](100) NOT NULL,
    [Updated] [datetime] NULL,
    [UpdatedBy] [nvarchar](100) NULL,
 CONSTRAINT [PK_HistoryRequest] PRIMARY KEY CLUSTERED 
(
    [HistoryRequestID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[HistoryRequest]  WITH CHECK ADD  CONSTRAINT [FK_HistoryRequest_Campaign] FOREIGN KEY([CampaignID])
REFERENCES [dbo].[Campaign] ([CampaignId])
GO

ALTER TABLE [dbo].[HistoryRequest] CHECK CONSTRAINT [FK_HistoryRequest_Campaign]
GO

37 seconds for 1050 rows on this statement:

SELECT * 
  FROM HistoryRequest AS hr 
 WHERE Created > '10/9/2009'
 ORDER BY Created DESC

Does anyone have anysuggestions for speeding this up? I have a Clustered Index on the PK and a regular Index on the CREATED column. I tried a Unique Index and it barfed complaining there is a duplicate entry somewhere - which can be expected.

Any insights are welcome!

+1  A: 

For a log table, you probably don't need a uniqueidentifier column. You're not likely to query on it either, so it's not a good candidate for a clustered index. Your sample query is on "Created", yet there's no index on it. If you query frequently on ranges of "Created" values then it would be a good candidate for clustering even though it's not necessarily unique.

OTOH, the foreign key suggests frequent querying by Campaign, in which case having the clustering done by that column could make sense, and would also probably do a better job of scattering the inserted keys in the indexes - both the surrogate key and the timestamp would add records in sequential order, which is net more work over time for insertions because the node sectors are filled less randomly.

If it's just a log table, why does it have update audit columns? It would normally be write-only.

le dorfier
+1  A: 

Hi Keith!

Hey, I've seen some odd behavior when pulling XML columns in large sets. Try putting your index on Created back, then specify the columns in your select statement; but omit the XML. See how that affects the return time for results.

Stephen M. Redd
I'll play around with this. Thanks!
Keith Barrows
I pulled the XML column out and dropped it in a newly created table (HistoryRequestExtended). This one action freed up my statement and it is backing to screaming fast again. (Even as "wrong" as it is to use SELECT *). Thanks! :)
Keith Barrows
Hey no problem... stuff like this is why we stopped using the XML data type at my shop in our SQL 2005 databases. Instead we tend to store the data in ntext and do all the XML stuff in code. We plan to find out in a few months if SQL 2008 improved this any.
Stephen M. Redd
+3  A: 

You are requesting all columns (*) over a non-covering index (created). On a large data set you are guaranteed to hit the Index Tipping Point where the clustered index scan is more efficient than an nonclustered index range seek and bookmark lookup.

Do you need * always? If yes, and if the typical access pattern is like this, then you must organize the table accordingly and make Created the leftmost clustered key.

If not, then consider changing your query to a coverable query, eg. select only HistoryRequestID and Created, which are covered by the non clustered index. If more fields are needed, add them as included columns to the non-clustered index, but take into account that this will add extra strorage space and IO log write time.

Remus Rusanu
Sorry - the SELECT * was just a toss off in SQL Mangler. I do not write my actual code this way. I was trying to track down an issue and was digging in the data with a "SELECT *". I also did a Query Plan on this exact statement and no table scan was shown though 100% went into an Index Scan.
Keith Barrows
Was it a clustered index Scan or a non-clustere dindex scan? A clustered index scan is the same as a "table scan", since the clustered index *is* the table.
Remus Rusanu
A: 

Rebuild indexes. Use WITH (NOLOCK) clause after the table names where appropriate, this probably applies if you want to run long(ish) running queries against table that are heavily used in a live environment (such as a log file). It basically means your query migth miss some of teh very latest records but you also aren't holding a lock open on the table - which creates additional overhead.

locster
Also consider placign the cluster on the date column if that is going to be used more often.
locster