views:

142

answers:

5

I've seen a few different versions of this question but having difficulty applying it to what I need...

MS SQL Server 2008 query:

SELECT Receipts.ReceiptID, Receipts.UserID, Receipts.UserCardID, FolderLink.ReceiptFolderLinkID
FROM dbo.tbl_ReceiptFolderLnk AS FolderLink 
    INNER JOIN dbo.tbl_Receipt AS Receipts ON FolderLink.ReceiptID = Receipts.ReceiptID

**

ReceiptID            UserID               UserCardID           ReceiptFolderLinkID
-------------------- -------------------- -------------------- --------------------
1                    1                    3                    2
2                    1                    3                    3
3                    1                    3                    10
4                    1                    4                    11
5                    1                    4                    13
6                    2                    5                    14
3                    1                    3                    15
4                    1                    4                    16

**

Now I would like to get distinct values on ReceiptID. Using distinct will not work as distinct works on the entire row.

Help would be appreciated.

Thanks!

+1  A: 

Asking for one of many rows with the same receiptID is ambiguous. How do you want to choose that one row from the others?

Ben S
A: 

DISTINCT is distinct for the whole row. Try GROUP BY ReceiptID, of course the values for the other columns may not make sense, because it will just pick one row.

MrZebra
This will fail on all the DBMS I know. All selected columns must be an aggregate function or listed in the group by clause.
Ben S
`@Ben S`: won't fail in `MySQL`, it will just pick any random record from the group.
Quassnoi
Is this randomness cryptographically strong? Because if it isn't, I don't see how this behavior could be useful... ;)
Lucero
`@Lucero`: if you are grouping by a `PRIMARY KEY` in a joined table, there is no point in grouping by other columns to return them: they are guaranteed to be the same.
Quassnoi
+1  A: 

If you want all the results columns in your original query and want each ReceiptID to only show up one time, you'll need to use GROUP BY.

Like: SELECT Receipts.ReceiptID, MIN(Receipts.UserID), MIN(Receipts.UserCardID), MIN(FolderLink.ReceiptFolderLinkID) FROM dbo.tbl_ReceiptFolderLnk AS FolderLink INNER JOIN dbo.tbl_Receipt AS Receipts ON FolderLink.ReceiptID = Receipts.ReceiptID GROUP BY Receipts.ReceiptID

Ben Gribaudo
A: 
SELECT  Receipts.ReceiptID, Receipts.UserID, Receipts.UserCardID, ReceiptFolderLinkID
FROM    dbo.tbl_Receipt AS Receipts
CROSS APPLY
        (
        SELECT  TOP 1 FolderLink.ReceiptFolderLinkID
        FROM    dbo.tbl_ReceiptFolderLnk FolderLink
        WHERE   FolderLink.ReceiptID = Receipts.ReceiptID
        ORDER BY
                FolderLink.ReceiptFolderLinkID
        ) fl

This will eliminate the need for GROUP BY or DISTINCT which may be quite inoptimal.

Also by changing the ORDER BY clause you can choose which FolderLink you want to return if there are muptiple.

Quassnoi
This, unfortunately didn't work - Incorrect syntax near the keyword 'CROSS' and 'ORDER'.
ElHaix
`@ElHaix`: sure, I confused the order of the clauses.
Quassnoi
+1  A: 

I've done this with the MIN or MAX aggregate functions which seems to work pretty well and is far simpler to write than most other methods I've seen.

So it'd be something like this:

SELECT [ColumnA], MIN(ColumnB) AS ColumnB, MIN(ColumnC) AS ColumnC 
FROM MyTable 
GROUP BY ColumnA

In this case ColumnA would be the distinct column.

Steve Wortham
This is what Ben is suggesting, above. Works great!Thanks Guys!
ElHaix