Hi Folks,
I have a serious performance problem.
I have a database with (related to this problem), 2 tables.
1 Table contains strings with some global information. The second table contains the string stripped down to each individual word. So the string is like indexed in the second table, word by word.
The validity of the data in the second table is of less important then the validity of the data in the first table.
Since the first table can grow like towards 1*10^6 records and the second table having an average of like 10 words for 1 string can grow like 1*10^7 records, i use a nolock in order to read the second this leaves me free for inserting new records without locking it (Expect many reads on both tables).
I have a script which keeps on adding and updating rows to the first table in a MERGE statement. On average, the data beeing merged are like 20 strings a time and the scripts runs like ones every 5 seconds.
On the first table, i have a trigger which is beeing invoked on a Insert or Update, which takes the newly inserted or updated data and calls a stored procedure on it which makes sure the data is indexed in the second table. (This takes some significant time).
The problem is that when having the trigger disbaled, Reading the first table happens in a few ms. However, when enabling the trigger and your in bad luck of trying to read the first table while this is beeing updated, Our webserver gives you a timeout after 10 seconds (which is way to long anyways).
I can quess from this part that when running the trigger, the first table is kept (partially) in a lock untill the trigger is completed.
What do you think, if i'm right, is there a easy way around this?
Thanks in advance!
Cheers, Koen
As requested:
-- =============================================
-- Author: Koen Bekkenutte
-- Create date: 05 - 02 - 2010
-- Description: Triggers the indexation of the phrases of a feeditem
-- =============================================
ALTER TRIGGER [dbo].[OnFeedItemsChanged]
ON [dbo].[FeedItems]
AFTER INSERT,UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @id int;
SELECT @id = ID FROM INSERTED;
IF @id IS NOT NULL
BEGIN
DECLARE @title nvarchar(MAX);
SELECT @title = Title FROM INSERTED;
DECLARE @description nvarchar(MAX);
SELECT @description = [Description] FROM INSERTED;
SELECT @title = dbo.RemoveNonAlphaCharacters(@title)
SELECT @description = dbo.RemoveNonAlphaCharacters(@description)
-- Insert statements for trigger here
EXEC dbo.usp_index_itemstring @id, @title;
EXEC dbo.usp_index_itemstring @id, @description;
END
END
The FeedItems table is populated by this query:
MERGE INTO FeedItems i
USING @newitems d ON i.Service = d.Service AND i.GUID = d.GUID
WHEN matched THEN UPDATE
SET i.Title = d.Title,
i.Description = d.Description,
i.Uri = d.Uri,
i.Readers = d.Readers
WHEN NOT matched THEN INSERT
(Service, Title, Uri, GUID, Description, Readers)
VALUES
(d.Service, d.Title, d.Uri, d.GUID, d.Description, d.Readers);
The sproc: IndexItemStrings is populating the second table, executing this proc does indeed take his time. The problem is that while executing this trigger. Queries applied to the FeedItems table are mostly timing out (even those queries who dont uses the second table)
First table:
USE [ICI]
GO
/****** Object: Table [dbo].[FeedItems] Script Date: 04/09/2010 15:03:31 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[FeedItems](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Service] [int] NOT NULL,
[Title] [nvarchar](max) NULL,
[Uri] [nvarchar](max) NULL,
[Description] [nvarchar](max) NULL,
[GUID] [nvarchar](255) NULL,
[Inserted] [smalldatetime] NOT NULL,
[Readers] [int] NOT NULL,
CONSTRAINT [PK_FeedItems] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[FeedItems] WITH CHECK ADD CONSTRAINT [FK_FeedItems_FeedServices] FOREIGN KEY([Service])
REFERENCES [dbo].[FeedServices] ([ID])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[FeedItems] CHECK CONSTRAINT [FK_FeedItems_FeedServices]
GO
ALTER TABLE [dbo].[FeedItems] ADD CONSTRAINT [DF_FeedItems_Inserted] DEFAULT (getdate()) FOR [Inserted]
GO
Second table:
USE [ICI]
GO
/****** Object: Table [dbo].[FeedItemPhrases] Script Date: 04/09/2010 15:04:47 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[FeedItemPhrases](
[FeedItem] [int] NOT NULL,
[Phrase] [int] NOT NULL,
[Count] [smallint] NOT NULL,
CONSTRAINT [PK_FeedItemPhrases] PRIMARY KEY CLUSTERED
(
[FeedItem] ASC,
[Phrase] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[FeedItemPhrases] WITH CHECK ADD CONSTRAINT [FK_FeedItemPhrases_FeedItems] FOREIGN KEY([FeedItem])
REFERENCES [dbo].[FeedItems] ([ID])
ON UPDATE CASCADE
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[FeedItemPhrases] CHECK CONSTRAINT [FK_FeedItemPhrases_FeedItems]
GO
ALTER TABLE [dbo].[FeedItemPhrases] WITH CHECK ADD CONSTRAINT [FK_FeedItemPhrases_Phrases] FOREIGN KEY([Phrase])
REFERENCES [dbo].[Phrases] ([ID])
ON UPDATE CASCADE
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[FeedItemPhrases] CHECK CONSTRAINT [FK_FeedItemPhrases_Phrases]
GO
And more:
-- =============================================
-- Author: Koen Bekkenutte
-- Create date: 05 - 02 - 2010
-- Description: Extracts and matches phrases from a feeditem string
-- =============================================
ALTER PROCEDURE [dbo].[usp_index_itemstring]
-- Add the parameters for the stored procedure here
@item int,
@text nvarchar(MAX)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- DECLARE a table containing all words within the text
DECLARE @tempPhrases TABLE
(
[Index] int,
[Phrase] NVARCHAR(256)
);
-- extract each word from text and store it in the temp table
WITH Pieces(pn, start, [stop]) AS
(
SELECT 1, 1, CHARINDEX(' ', @text)
UNION ALL
SELECT pn + 1, CAST([stop] + 1 AS INT), CHARINDEX(' ', @text, [stop] + 1)
FROM Pieces
WHERE [stop] > 0
)
INSERT INTO @tempPhrases
SELECT pn, SUBSTRING(@text, start, CASE WHEN [stop] > 0 THEN [stop]-start ELSE LEN(@text) END) AS s
FROM Pieces
OPTION (MAXRECURSION 0);
WITH CombinedPhrases ([Phrase]) AS
(
-- SELECT ALL 2-WORD COMBINATIONS
SELECT w1.[Phrase] + ' ' + w2.[Phrase]
FROM @tempPhrases w1
JOIN @tempPhrases w2 ON w1.[Index] + 1 = w2.[Index]
UNION ALL -- SELECT ALL 3-WORD COMBINATIONS
SELECT w1.[Phrase] + ' ' + w2.[Phrase] + ' ' + w3.[Phrase]
FROM @tempPhrases w1
JOIN @tempPhrases w2 ON w1.[Index] + 1 = w2.[Index]
JOIN @tempPhrases w3 ON w1.[Index] + 2 = w3.[Index]
UNION ALL -- SELECT ALL 4-WORD COMBINATIONS
SELECT w1.[Phrase] + ' ' + w2.[Phrase] + ' ' + w3.[Phrase] + ' ' + w4.[Phrase]
FROM @tempPhrases w1
JOIN @tempPhrases w2 ON w1.[Index] + 1 = w2.[Index]
JOIN @tempPhrases w3 ON w1.[Index] + 2 = w3.[Index]
JOIN @tempPhrases w4 ON w1.[Index] + 3 = w4.[Index]
)
-- ONLY INSERT THE NEW PHRASES IN THE Phrase TABLE
INSERT INTO @tempPhrases
SELECT 0, [Phrase] FROM CombinedPhrases
-- DELETE PHRASES WHICH ARE EXCLUDED
DELETE FROM @tempPhrases
WHERE [Phrase] IN
(
SELECT [Text] FROM Phrases p
JOIN ExcludedPhrases ex
ON ex.ID = p.ID
);
MERGE INTO Phrases p
USING
(
SELECT DISTINCT Phrase FROM @tempPhrases
) t
ON p.[Text] = t.Phrase
WHEN NOT MATCHED THEN
INSERT VALUES (t.Phrase);
-- Finally create relations between the phrases and feeditem,
MERGE INTO FeedItemPhrases p
USING
(
SELECT @item as [Item], MIN(p.[ID]) as Phrase, COUNT(t.[Phrase]) as [Count]
FROM Phrases p WITH (NOLOCK)
JOIN @tempPhrases t ON p.[Text] = t.[Phrase]
GROUP BY t.[Phrase]
) t
ON p.FeedItem = t.Item
AND p.Phrase = t.Phrase
WHEN MATCHED THEN
UPDATE SET p.[Count] = t.[Count]
WHEN NOT MATCHED THEN
INSERT VALUES (t.[Item], t.Phrase, t.[Count]);
END
and more:
ALTER Function [dbo].[RemoveNonAlphaCharacters](@Temp NVarChar(max))
Returns NVarChar(max)
AS
Begin
SELECT @Temp = REPLACE (@Temp, '%20', ' ');
While PatIndex('%[^a-z ]%', @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex('%[^a-z ]%', @Temp), 1, '')
Return @TEmp
End