views:

104

answers:

6
ALTER PROCEDURE ReadNews

 @CategoryID INT,
 @Culture TINYINT = NULL,
 @StartDate DATETIME = NULL,
 @EndDate DATETIME = NULL,
 @Start BIGINT, -- for paging
 @Count BIGINT -- for paging

AS
BEGIN
  SET NOCOUNT ON;  

  --ItemType for news is 0
  ;WITH Paging AS
  (
   SELECT news.ID,
     news.Title,
     news.Description,
     news.Date,
     news.Url,
     news.Vote,
     news.ResourceTitle,
     news.UserID,

     ROW_NUMBER() OVER(ORDER BY news.rank DESC) AS RowNumber, TotalCount = COUNT(*) OVER()

   FROM dbo.News news
   JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID
   WHERE itemCat.ItemType = 0 -- news item 
     AND itemCat.CategoryID = @CategoryID
     AND (
       (@StartDate IS NULL OR news.Date >= @StartDate) AND 
       (@EndDate IS NULL OR news.Date <= @EndDate)
      )
     AND news.Culture = @Culture
     and news.[status] = 1

  )  
  SELECT * FROM Paging WHERE RowNumber >= @Start AND RowNumber <= (@Start + @Count - 1)
  OPTION (OPTIMIZE FOR (@CategoryID  UNKNOWN, @Culture UNKNOWN))
END  

Here is the structure of News and ItemCategory tables:

CREATE TABLE [dbo].[News](
 [ID] [bigint] NOT NULL,
 [Url] [varchar](300) NULL,
 [Title] [nvarchar](300) NULL,
 [Description] [nvarchar](3000) NULL,
 [Date] [datetime] NULL,
 [Rank] [smallint] NULL,
 [Vote] [smallint] NULL,
 [Culture] [tinyint] NULL,
 [ResourceTitle] [nvarchar](200) NULL,
 [Status] [tinyint] NULL

 CONSTRAINT [PK_News] PRIMARY KEY CLUSTERED 
(
 [ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [ItemCategory](
 [ID] [bigint] IDENTITY(1,1) NOT NULL,
 [ItemID] [bigint] NOT NULL,
 [ItemType] [tinyint] NOT NULL,
 [CategoryID] [int] NOT NULL,
 CONSTRAINT [PK_ItemCategory] PRIMARY KEY CLUSTERED 
(
 [ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

This query reads news of a specific category (sport, politics, ...). @Culture parameter specifies the language of news, like 0 (english), 1 (french), etc. ItemCategory table relates a news record to one or more categories. ItemType column in ItemCategory table specifies which type of itemID is there. for now, we have only ItemType 0 indicating that ItemID refers to a record in News table.

Currently, I have the following index on ItemCategory table:

CREATE NONCLUSTERED INDEX [IX_ItemCategory_ItemType_CategoryID__ItemID] ON [ItemCategory] 
(
 [ItemType] ASC,
 [CategoryID] ASC
)
INCLUDE ( [ItemID])

and the following index for News table (suggested by query analyzer):

CREATE NONCLUSTERED INDEX [_dta_index_News_8_1734000549__K1_K7_K13_K15] ON [dbo].[News] 
(
 [ID] ASC,
 [Date] ASC,
 [Culture] ASC,
 [Status] ASC
)

With these indexes, when I execute the query, the query executes in less than a second for some parameters, and for another parameters (e.g. different @Culture or @CategoryID) may take up to 2 minutes! I have used OPTIMIZE FOR (@CategoryID UNKNOWN, @Culture UNKNOWN) to prevent parameter sniffing for @CategoryID and @Culture parameters but seems not working for some parameters.

There are currently around 2,870,000 records in News table and 4,740,000 in ItemCategory table.

Now I greatly appreciate any advice on how to optimize this query or its indexes.

update: execution plan: http://img31.imageshack.us/img31/899/readnewsplan.jpg (in this image, ItemNetwork is what I referred to as ItemCategory. they are the same)

A: 

Have you had a look at some of the inbuilt SQL tools to help you with this:

I.e. from the management studio menu:

  • 'Query'->'Display Estimated Execution Plan'
  • 'Query'->'Include Actual Execution Plan'
  • 'Tools'->'Database Engine Tuning Advisor'
James Wiseman
Yes, the index for news table has been advised by Tuning Advisor. I have also seen the actual execution plan and the indexes are being used.
Maysam
A: 

Shouldn't the OPTION OPTIMIZE clause be part of the inner SQL, rather than of the SELECT on the CTE?

devio
A: 

You should look at indexing the culture field in the news table, and the itemid and categoryid fields in the item category table. You may not need all these indexes - I would try them one at a time, then in combination until you find something that works. Your existing indexes do not seem to help your query very much.

Ray
A: 

Really need to see the query plan - one thing of note is you put the clustered index for News on News.ID, but it is not an identity field but the FK for the ItemCategory table, this will result in some fragmentation on the news table over time, so it less than ideal.

I suspect the underlying problem is your paging is causing the table to scan.

Updated:

Those Sort's are costing you 68% of the query execution time from the plan, and that makes sense, one of those sorts at least must be to support the ranking function you are using that is based on news.rank desc, but you have no index that can support that ranking natively.

Getting an index in to support that will be interesting, you can try a simple NC index on news.rank first off, SQL may chose to join indexes and avoid the sort, but it will take some experimentation.

Andrew
I had temporarily disabled the Identity characteristic of ID column on news table. So, consider it as an identity column. I have also updated my question with an image of execution plan. please take a look at it. http://img31.imageshack.us/img31/899/readnewsplan.jpg
Maysam
adding a NC index on news.rank desc made no difference. The execution plan remained the same and the index I created was not used!
Maysam
I did say it would be interesting to get the index to support the sort - the ranking could not be placed in one of the other indexes so unless you convince the optimizer to make an index join it will struggle.
Andrew
A: 

Try using for ItemCategory table nonclustered index on itemId,categoryId and on News table also nonclustered index on Rank,Culture.

Balend
A: 

I have finally come up with the following indexes which are working great and the stored procedure executes in less than a second. I have just removed TotalCount = COUNT(*) OVER() from the query and I couldn't find any good index for that. Maybe I write a separate stored procedure to calculate the total number of records. I may even decide to use a "more" button like in Twitter and Facebook without pagination buttons.

for news table:

CREATE NONCLUSTERED INDEX [IX_News_Rank_Culture_Status_Date] ON [dbo].[News] 
(
    [Rank] DESC,
    [Culture] ASC,
    [Status] ASC,
    [Date] ASC
)

for ItemNetwork table:

CREATE NONCLUSTERED INDEX [IX_ItemNetwork_ItemID_NetworkID] ON ItemNetwork
(
    [ItemID] ASC,
    [NetworkID] ASC
)

I just don't know whether ItemNetwork needs a clustered index on ID column. I am never retrieving a record from this table using the ID column. Do you think it's better to have a clustered index on (ItemID, NetworkID) columns?

Maysam