views:

123

answers:

3

Given a table Records with the columns id int, Type int and Name varchar(50), I can make search queries like so:

SELECT id, Type, Name
FROM Records
WHERE Name LIKE '%Foo%'

To tweak the performance a little, I'd like to give only a limited amount of results; currently 100 — simply by adding TOP 100 to the statement. This, however, can cause records of some types to be underrepresented, or not represented at all, as shown by the following query:

SELECT Type, COUNT(Type) FROM
 (SELECT id, Type, Name
 FROM Records
 WHERE Name LIKE '%Foo%') x
GROUP BY Type
ORDER BY Type

Without the TOP 100, I might get:

42 5
49 1
50 1
52 1
59 1
76 40
87 567
90 3

…and with it:

42 5
49 1
50 1
52 1
59 1
76 26
87 65

This could lead the user to conclude that no record of type 90 exists.

I'd prefer TOP to behave differently: give me at least one result of any type for which there are some, then keep adding to them until the count is reached. E.g., 42, 76 and 87 would have fewer results, but 90 would show up.

Ideally, I'd also like to provide the user with a "x more results of this type" UI element.

Do I have to forego TOP altogether to accomplish this?

A: 

You could do two calls. The first would be a [type] and count([type]). This would provide both a unique list of all available types as well as a count of them for more information to the user. After that, do calls per type as the user requests them.

Michael Todd
+1  A: 
WITH cte AS (
SELECT id, Type, Name,
   ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Type) as rn
        FROM Records
        WHERE Name LIKE '%Foo%') 
SELECT *
   FROM cte
   WHERE rn <= 100;

This will get at most 100 records from each Type. Note that this by no means implies the query will be faster, in fact it may well be slower. It depends on the available indexes, the data in the tables, and the speed at which the client is able to process the result.

If you can to show the total count of each type, then you must compute it:

WITH totals AS (
    SELECT Type, COUNT(*) AS count
            FROM Records
            WHERE Name LIKE '%Foo%'
     GROUP BY Type) 
SELECT * FROM totals;

and then join the two results:

WITH cte AS (
    SELECT id, Type, Name,
       ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Type) as rn
            FROM Records
            WHERE Name LIKE '%Foo%')
, totals AS (
        SELECT Type, COUNT(*) AS count
                FROM Records
                WHERE Name LIKE '%Foo%'
         GROUP BY Type) 
    SELECT *
       FROM cte c 
       JOIN totals t on c.Type = t.Type
       WHERE c.rn <= 100;
Remus Rusanu
Your code actually returns too many rows (152 in my case), and appears to be slower than Shannon's solution (700 ms exec. time vs. 348.67 ms, averaged over three runs) but I upvoted you anyway because it's pretty close. :)
Sören Kuklau
+3  A: 
WITH RecordsWithRn AS (
SELECT id, Type, Name,
   ROW_NUMBER() OVER (PARTITION BY Type ORDER BY ... intra-type ordering ...) as rn
        FROM Records
        WHERE Name LIKE '%Foo%') 
SELECT TOP 100 id, Type, Name
   FROM RecordsWithRn
   ORDER BY RN, ... inter-type ordering ...

This gives you one hundred records. You will have at least one of each type, assuming less than one hundred types. Use the ORDER BY for ROW_NUMBER() to control the order of ROW_NUMBER of records within a type. The final ORDER BY, orders by the earlier assigned row_number and then add other criteria, if you wish, to control the order of records between types at each row_number.

EDIT: To get the number of records of a type not shown:

WITH RecordsWithRn AS (
   SELECT id, Type, Name,
      ROW_NUMBER() OVER (PARTITION BY Type ORDER BY Type) as rn,
      COUNT(*) OVER (PARTITION BY Type) as CountType
   FROM Records
   WHERE Name LIKE '%Foo%')
, Top100Records as (
   SELECT TOP 100 id, Type, Name, CountType
   FROM RecordsWithRn
   ORDER BY RN)
select Id, Type, Name, 
    CountType - (COUNT(*) over (PARTITION BY Type)) as CountTypeNotIncluded
from Top100Records
Shannon Severance
That second solution nails it.
Sören Kuklau