views:

3599

answers:

6

I added the columns in the select list to the order by list, but it is still giving me the error:

ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

Here is the stored proc:

CREATE PROCEDURE [dbo].[GetRadioServiceCodesINGroup] 
@RadioServiceGroup nvarchar(1000) = NULL
AS
BEGIN
SET NOCOUNT ON;

SELECT DISTINCT rsc.RadioServiceCodeId,
                rsc.RadioServiceCode + ' - ' + rsc.RadioService as RadioService
FROM sbi_l_radioservicecodes rsc
INNER JOIN sbi_l_radioservicecodegroups rscg 
ON rsc.radioservicecodeid = rscg.radioservicecodeid
WHERE rscg.radioservicegroupid IN 
(select val from dbo.fnParseArray(@RadioServiceGroup,','))
OR @RadioServiceGroup IS NULL  
ORDER BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService

END
+4  A: 

While they are not the same thing, in one sense DISTINCT implies a GROUP BY, because every DISTINCT could be re-written using GROUP BY instead. With that in mind, it doesn't make sense to order by something that's not in the aggregate group.

For example, if you have a table like this:

col1  col2
----  ----
 1     1
 1     2
 2     1
 2     2
 2     3
 3     1

and then try to query it like this:

SELECT DISTINCT col1 FROM [table] WHERE col2 > 2 ORDER BY col1, col2

That would make no sense, because there could end up being multiple col2 values per row. Which one should it use for the order? Of course, in this query you know the results wouldn't be that way, but the database server can't know that in advance.

Now, your case is a little different. You included all the columns from the order by clause in the select clause, and therefore it would seem at first glance that they were all grouped. However, some of those columns were included in a calculated field. When you do that in combination with distinct, the distinct directive can only be applied to the final results of the calculation: it doesn't know anything about the source of the calculation any more.

This means the server doesn't really know it can count on those columns any more. It knows they where used, but it doesn't know if the calculation operation might cause an effect similar to my first simple example above.

So now you need to do something else to tell the server that the columns are okay to use for ordering. There are several ways to do that, but this approach should work okay:

SELECT rsc.RadioServiceCodeId,
            rsc.RadioServiceCode + ' - ' + rsc.RadioService as RadioService
FROM sbi_l_radioservicecodes rsc
INNER JOIN sbi_l_radioservicecodegroups rscg 
    ON rsc.radioservicecodeid = rscg.radioservicecodeid
WHERE rscg.radioservicegroupid IN 
    (SELECT val FROM dbo.fnParseArray(@RadioServiceGroup,','))
    OR @RadioServiceGroup IS NULL  
GROUP BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService
ORDER BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService
Joel Coehoorn
If possible, can you elaborate a little more on this?
Xaisoft
Sure- updated the the post.
Joel Coehoorn
Distinct just means to eliminate the duplicates, and once you have done that it does make sense to wantto order the distinct rows in the result
Charles Bretana
I think you missed the point, Charles, but I updated the post to clarify a little.
Joel Coehoorn
+3  A: 

Try this:

ORDER BY 1, 2

OR

ORDER BY rsc.RadioServiceCodeId, rsc.RadioServiceCode + ' - ' + rsc.RadioService
Chris Pebble
Probably a good idea to expand on the why of this - in particular, the fact that the columns aren't distinctly identified when used in a calculated field.
Mike Burton
Curious as to what the snytax ORDER BY 1, 2 does.
Xaisoft
The second one worked fine.
Xaisoft
If it worked, be sure to mark it as an accepted answer
Tom H.
A: 

The reason why I am doing DISTINCT is because when I inner join with sbi_l_radioservicecodegroups rscg, it returns duplicate rows and I need to only show 1 of each.

Xaisoft
+3  A: 

Try oe of these:

  1. Use column alias:

    ORDER BY RadioServiceCodeId,RadioService

  2. Use column position:

    ORDER BY 1,2

You can only order by columns that actually appear in the result of the DISTINCT query - the underlying data isn't available for ordering on.

Tony Andrews
A: 

Chris Pebble's answer worked as did Joel Coehoorn. Which one is actually better? Do Group By clauses slow down queries? How can I pick an answer if more than one works.

With Chris's answer, I avoid the Group By Clause in Joel's answer, but in Joel's answer, I avoid the DISTINCT in the query, but add the group clause. What is the better trade off?

Thanks again.

Xaisoft
Probably the GROUP BY is faster, because it allows the grouping and ordering to take place using the original fields, rather than the calculated field. That could be huge if there's an index involved. (more coming)
Joel Coehoorn
However, there might be some special optimizations it can do only for DISTINCT, and you never really know until you test on a real-world system.
Joel Coehoorn
There is also clarity to consider: in terms of preserving your meaning, the DISTINCT clause seems the better choice.
Mike Burton
I don't know about MSSQL, but in Oracle DISTINCT and GROUP BY use the same execution plan. It's just syntactic sugar.
Mr. Shiny and New
+1  A: 

Distinct and Group By generally do the same kind of thing, for different purposes... They both create a 'working" table in memory based on the columns being Grouped on, (or selected in the Select Distinct clause) - and then populate that working table as the query reads data, adding a new "row" only when the values indicate the need to do so...

The only difference is that in the Group By there are additional "columns" in the working table for any calculated aggregate fields, like Sum(), Count(), Avg(), etc. that need to updated for each original row read. Distinct doesn't have to do this... In the special case where you Group By only to get distinct values, (And there are no aggregate columns in output), then it is probaly exactly the same query plan.... It would be interesting to review the query execution plan for the two options and see what it did...

Certainly Distinct is the way to go for readability if that is what you are doing (When your purpose is to eliminate duplicate rows, and you are not calculating any aggregate columns)

Charles Bretana