views:

283

answers:

2

I have a little SQL Distinct puzzle that i cannot solve (or at least not in an very elegant way).

I have two tables (try to ignore the simplicity of the example). I'm using MSSQL 2008 if that makes much of a difference.

Table: Category

| categoryId (uniqueidentifier) PK | 
| Name varchar(50)                 |

Table: Download

| downloadId (uniqueidentifier) PK | 
| categoryId (uniqueidentifier) FK | 
| url (varchar(max))               |
| createdate (datetime)            |

I have a few categories in the Category table and potentially a lot of download URLs in the Download table. I'm interested in selecting the newest using the createdate (or a top 5 if that is possible) download url for each category from the Download table.

Currently I'm doing the following, but that is not very nice, and can hardly be the correct way to do it.

SELECT 
   categoryId,
   max(convert(BINARY(16),downloadId)) as downloadId, 
   max(createdate) as createdate 
INTO tmp 
FROM Download
GROUP BY categoryId 
ORDER BY createdate

SELECT url 
FROM Download 
WHERE downloadId IN
      (SELECT CONVERT(uniqueidentifier, downloadId) FROM tmp) 

DROP Table tmp

Any suggestions would be much appreciated.

+2  A: 

To get the latest 5 downloads you should be able to:

SELECT categoryId, downloadId, createdate, url
FROM (
  SELECT
    categoryId, downloadId, createdate, url,
    ROW_NUMBER() OVER(PARTITION BY categoryId ORDER BY createdate DESC) rownum
  FROM Download
) d
WHERE d.rownum <= 5
Peter Lang
Except for the spelling mistake this is perfect. Thank you!I looked into a similar example but thought ROW_NUMBER didnt work in mssql - shame on me.
SimTex
cytegoryId should be categoryId, minor thing people are most likely not using same names.
SimTex
Thanks, I corrected the typo.
Peter Lang
A: 

I don't know if this would be useful to anybody that comes across this or if there is a more efficient way of doing this but I had to select the distinct columns ordered by create date and I managed it by doing the following:

SELECT DISTINCT categoryId, downloadId, url 
FROM ( 
  SELECT 
    categoryId, downloadId, createdate, url, 
    ROW_NUMBER() OVER(PARTITION BY categoryId ORDER BY createdate DESC) rownum 
  FROM Download 
) d 
WHERE d.rownum <= 5

Basically added DISTINCT and removed CreateDate...

If there is a more efficient way of getting that done, please let me know.

Thanks,

Adem