Done some testing, and the results are pretty strange. If you specify an order by
in both parts of a union, SQL Server gives a syntax error:
select top 2 * from @users where gender = 0 order by id
union all
select top 2 * from @users where gender = 1 order by id
That makes sense, because the order by should only be at the end of the union. But if you use the same construct in a subquery, it compiles! And works as expected:
select * from (
select top 2 * from @users where gender = 0 order by id
union all
select top 2 * from @users where gender = 1 order by id
) sub
The strangest thing happens when you specify only one order by
for the subquery union:
select * from (
select top 2 * from @users where gender = 0
union all
select top 2 * from @users where gender = 1 order by id
) sub
Now it orders the first half of the union at random, but the second half by id. That's pretty unexpected. The same thing happens with the order by
in the first half:
select * from (
select top 2 * from @users where gender = 0 order by id desc
union all
select top 2 * from @users where gender = 1
) sub
I'd expect this to give a syntax error, but instead it orders the first half of the union. So it looks like union
interacts with order by
in a different way when the union
is part of a subquery.
Like Chris Diver originally posted, a good way to get out of the confusion is not to rely on the order by
in a union, and specify everything explicitly:
select *
from (
select *
from (
select top 2 *
from @users
where gender = 0
order by
id desc
) males
union all
select *
from (
select top 2 *
from @users
where gender = 1
order by
id desc
) females
) males_and_females
order by
id
Example data:
declare @users table (id int identity, name varchar(50), gender bit)
insert into @users (name, gender)
select 'Joe', 0
union all select 'Alex', 0
union all select 'Fred', 0
union all select 'Catherine', 1
union all select 'Diana', 1
union all select 'Esther', 1