views:

99

answers:

1

I am using the ROW_NUMBER() function introduced in SQL SERVER 2005 to return a paged set of results. The Query Works as expected, but I have one issue. What I would like to do is return the results ordered by the count in descending order. Here is the query and below I will give a little description:

DECLARE @StartRowIndex INT
DECLARE @MaximumRows INT

SET @StartRowIndex = 1
SET @MaximumRows = 10

;WITH Data AS (SELECT
  ROW_NUMBER() OVER (ORDER BY a.column1) as RowNum,
  a.column1, a.column2, dbo.f_GetDataCount(a.column3) as cnt
  FROM dbo.table1 a
  INNER JOIN dbo.table2 b on b.column4 = a.column4
  INNER JOIN dbo.table3 c on c.column5 = a.column5
  LEFT OUTER JOIN dbo.table4 d on d.column6 = a.column6
  WHERE 1=1  AND a.column7 IN (1)
)


SELECT RowNum,column1,
   column2,cnt
FROM
Data
WHERE RowNum BETWEEN @StartRowIndex AND (@StartRowIndex + @MaximumRows) - 1
ORDER BY cnt desc
  1. I know the highest count is in the 100,000+ range.
  2. ORDER By cnt desc does order the results by their count in descending order, but only for the 10 records it fetches. I know this is because RowNum is being ordered by a.Column1 which is not the count. Ideally, I would like to Order RowNum by the Count, so something like this:

ROW_NUMBER() OVER (ORDER BY Count(*)) as RowNum

The above runs, but it takes forever (17+ minutes).

As a side note, dbo.f_GetDataCount(a.column3) as cnt returns the total number of records based on a.column3, so I tried:

ROW_NUMBER() OVER (ORDER BY dbo.f_GetDataCount(a.column3) as RowNum, but this took forever as well.

If I am actually confusing this more than what it is, I will give a little example. If I set the StartRowIndex to 1, it will return 10 records and the first record will have 10,000.

If I then set StartRowIndex to 11, it will return the next 10 records and the first record will have 15,000.

What it actually should return is the record with 15,000 first no matter what the StartRowIndex is.

Help is appreciated very much.

Here is the code for the function:

CREATE FUNCTION [dbo].[f_GetDataCount] 
(
-- Add the parameters for the function here
@column3 nvarchar(10)
)
RETURNS int
AS
BEGIN
-- Declare the return variable here
DECLARE @Result int

-- Add the T-SQL statements to compute the return value here
SELECT @Result = COUNT(a.column3) FROM dbo.table1 a
where a.column3 = @column3

-- Return the result of the function
RETURN @Result

END
+1  A: 

Thanks for posting the UDF code - give this a try vs. using an inline scalar UDF. Should be much faster than ordering by the ROWNUBMER(OVER dbo.fn_GetDataCount()).

If you still have sub-optimal performance, likely that indexing will come into play, or you could look at materializing the count of column3 values intermediately (we can discuss options here if it still needs to be addressed, just repost).

EDIT: Whoops, added the "desc" to the order by in the over clause since you want them descending - also, edited the scalars a bit...

DECLARE @StartRowIndex INT
DECLARE @MaximumRows INT
DECLARE @EndRowIndex INT

SELECT @StartRowIndex = 1, @MaximumRows = 10
SELECT @EndRowIndex = (@StartRowIndex + @MaximumRows) - 1


;WITH Data1 as (
 SELECT a.column3 as c3, count(*) as frequency
 from dbo.table1 a
 group by a.column3
),
Data AS (SELECT
  ROW_NUMBER() OVER (ORDER BY coalesce(d.frequency,0) desc) as RowNum,
  a.column1, a.column2, d.frequency as cnt
  FROM dbo.table1 a
  INNER JOIN dbo.table2 b on b.column4 = a.column4
  INNER JOIN dbo.table3 c on c.column5 = a.column5
  LEFT OUTER JOIN dbo.table4 d on d.column6 = a.column6
  LEFT OUTER JOIN Data1 d
  on a.column3 = d.c3
  WHERE 1=1  AND a.column7 IN (1)
)
SELECT RowNum,column1,
   column2,cnt
FROM
Data
WHERE RowNum BETWEEN @StartRowIndex AND @EndRowIndex
ORDER BY cnt desc
chadhoc
I will try now and let you know. Thanks so far.
Xaisoft
Make sure you get the updated edit with the "DESC" in the over clause...
chadhoc
I didn't refresh the page, so I was scratching my head on why it was showing counts of 1. After adding the desc keyword, it appears it works. It is taking 6 seconds to pull back 10 records, so it is possibly and index issue as well. I have another issue as well, the query I put on here is actually 10 times smaller than the actual query I have which is a dynamic sql query. I am getting an error on it regarding converting to nvarchar. I can update my post if you are willing to take a look at that. Thanks a lot
Xaisoft
Sure, although might make more sense to try to narrow it down a bit and post a different question depending on how close it is to this particular one, whichever seems to make the most sense
chadhoc
The stored proc query actually took 1 minute 30 seconds to return 10 rows after I made the changes. The dynamic query is almost 275 lines long, can I e-mail it to you (quicker for me) and explain a couple of the issues. They are probably no-brainers for you. You also might spot something that will speed it up. If e-mail doesn't work for you, I can always post to another question, but it will take a while.
Xaisoft
Sure, rather than posting contact info here, just go to my profile page (click on my user name in any post) and fill out a contact email on the website listed there (my blog)...
chadhoc
Alright, I will do that now. You should receive an e-mail within the next 15 to 30 minutes depending on how fast I go, lol. Thanks for the help on this one.
Xaisoft
Ok, Chad, I sent the e-mail, let me know if you got it.
Xaisoft