views:

50

answers:

2

I consider myself rather proficient with T-SQL and I'm usually able to optimize a query pretty good without loosing readability. In short: I like my SQL short, descriptive, declarative and elegant.

While the following code works, i have two problems with it:

  1. I am using cursors and I can't shake the feeling I have in the back of my head that it could have been done more efficiently using CTEs. Also cursors don't work in views, so I can't manipulate results/ranges on the client-side or in dependent SQL.
  2. The code is implemented in a Stored Procedure, which leads to the same problem as above. Especially with LInQ to SQL and auto-paging.

So given the following SP, does anyone see any obvious way to convert this to a plain select using recursive CTEs? I've tried, failed and thought I'd see what the stack overflow community might be able to come up with.

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROC [dbo].[usp_GetLastReferers]
(
 @Limit int = NULL
)
AS
BEGIN
 SET NOCOUNT ON

 CREATE TABLE #Referer
 (
  ID   int,
  Url   nvarchar(500),
  Referer  nvarchar(500)
 )

 DECLARE @ID   int
 DECLARE @Url  nvarchar(500)
 DECLARE @Referer nvarchar(500)
 DECLARE @Count  int
 SET @Count = 0

 DECLARE LogCursor CURSOR FORWARD_ONLY READ_ONLY FOR
 SELECT ID, Url, Referer FROM Log WHERE Referer <> '' ORDER BY ID DESC

 OPEN LogCursor

 FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer

 WHILE @@FETCH_STATUS = 0 AND (@Count < @Limit OR @Limit IS NULL)
 BEGIN
  DECLARE @Hits int
  SELECT @Hits = COUNT(*)
  FROM #Referer
  WHERE Referer = @Referer

  DECLARE @IsLocal bit
  SELECT @IsLocal = dbo.IsLocalSite(@Referer)

  IF (@Hits = 0 OR @Hits IS NULL) AND @IsLocal = 0
  BEGIN
   INSERT INTO #Referer(ID,Url,Referer) VALUES (@ID,@Url,@Referer)
   SET @Count = @Count + 1
  END

  FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer  
 END

 CLOSE LogCursor
 DEALLOCATE LogCursor

 SELECT *
 FROM #Referer

 DROP TABLE #Referer

 SET NOCOUNT OFF
END

Since it may not be totally obvious, what I'm trying to do here is akin tothe following quasi SQL

SELECT DISTINCT TOP(@Limit) ID, Url, Referer
FROM Log 
ORDER BY ID DESC

Basically to get the last unique refers (not unique rows), which often contain duplicates, and in descending order. This is definitely where it gets tricky.

The data is pretty simple HTTP logs. ID field is just a unique row-identifier, Url is the full url requesten, and Referer is the HTTP referer for that request. None of the values can be null, but referer can be empty (ie ''). IsSiteLocal is just a simple filtering function to exclude referers originating from my own sites.

If anyone wants some sample-data to full around, I can upload a small DB-backup so you have something to fool around with.

Sample-data can be found here: http://svada.kjonigsen.net/files/IISLogsDBBackup.zip

A: 

Try this:

 ;with Referers as (
  SELECT 
  row_number() over (order by id desc) rn 
  ,ID, Url, Referer FROM Log 
  WHERE dbo.IsLocalSite(Referer) = 0 
) 
select * from Referers
where rn <= @limit
Denis Valeev
Thanks for the reply, but it kinda misses. I'll update the question to be more descriptive.
Jostein Kjønigsen
A: 

you want the max(ID) for each Url,Referer as long as dbo.IsLocalSite(@Referer) = 0? Can you just group by Url,Referer to get the max(ID) and apply your function in the WHERE clause?

Beth