views:

273

answers:

3

Hi,

I am using SQL Server 2008 and I have the following SQL script:

Select o.CustomerId as CustomerNoId, OrderValue, OrderDate
From dbo.Orders as o
Inner Join (
    Select Top (10) CustomerId
    From dbo.Customers
    where Age < 60
)
As c
On c.CustomerId = o.CustomerId

This works as desired when used with dbo.Customers and dbo.Orders on the local SQL Server instance. It returns all rows from the Orders table for the first 10 CustomerIds returned from the Customers table - 1688 rows.

However, I have a linked server holding Customers and Orders tables that contain many more rows. When I modify the script to use the dbo.Orders and dbo.Customers tables from the linked server I get a strange result: the correct data appears to be returned, but only the top 10 rows of it.

I am no SQL expert so I can't figure out why it should behave any differently.

Any suggestions appreciated.

A: 

If you are certain there is actually data in the linked server tables that is not being returned, then I would suspect your query tool. How are you executing the query?

RedFilter
There is data in the linked server tables - if I take away the Top(10) command I am returned millions of rows. I am executing the script via a Query Editor Window in SSMS.
TonE
Are you sure you are linking to tables and not to a view that has a TOP 10 clause in it?
RedFilter
They are tables. If I was linking to a view with a TOP(10) clause then removing the Top(10) clause from my script would still only return 10 rows wouldn't it? As I mentioned my SQL skills are rudimentary at best, so I may have misunderstood your question...
TonE
I wasn't sure what you meant by Top(10) command, thought it could mean the whole subquery...
RedFilter
+2  A: 

Well, there is a TOP (10) in your subquery and no ORDER BY to boot, which means that you are not guaranteed to get the same 10 rows every time (this is especially true with linked servers because of the different algorithms that may be used for collation matching, even if the collations are the same).

Add an ORDER BY clause to the subquery so that you can make that part consistent and stable and the rest may follow correctly.
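As a rough sketch of that change, the subquery below sorts by CustomerId before taking the top 10. The names LinkedSrv and SalesDb are hypothetical placeholders for the linked server and its database, and CustomerId is only assumed to be a suitable sort key; any column set that orders the rows uniquely should serve the same purpose.

```sql
-- LinkedSrv and SalesDb are hypothetical names; substitute your own.
SELECT o.CustomerId AS CustomerNoId, o.OrderValue, o.OrderDate
FROM LinkedSrv.SalesDb.dbo.Orders AS o
INNER JOIN (
    SELECT TOP (10) CustomerId
    FROM LinkedSrv.SalesDb.dbo.Customers
    WHERE Age < 60
    -- Fixes which 10 customers are chosen, assuming CustomerId is unique.
    ORDER BY CustomerId
) AS c
    ON c.CustomerId = o.CustomerId;
```

With the sort in place, the same 10 customers should be selected on every run, so the number of matching Orders rows stays stable as long as the data does not change.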

RBarryYoung
I added an Order By CustomerID and it's now working. Not sure why it works on the local server but not the linked server, but I guess the moral is to always use an Order By clause in conjunction with Top(). Many thanks for the help.
TonE
The reason is in RBarry's answer: "which means that you are not guaranteed to get the same 10 rows every time"
Dems
Right, 10 different rows, means that you get completely different matches (or non-matches) on Orders, thus a completely different row count.
RBarryYoung
Yep, I understand that the results returned from Top() without Order By are arbitrary. It just confused me that it worked fine on the local server. That's non-determinism I guess ;-)
TonE
Exactly. :-)
RBarryYoung
A: 

Firstly, your lack of an ORDER BY clause makes your sub-query non-deterministic, as @RBarryYoung pointed out.

Secondly, I would first try altering the join order (so that the sub-query becomes the first table_source object in the FROM clause), and if that does not help, try playing with the REMOTE join hint.
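A minimal sketch of both ideas together, assuming hypothetical linked server and database names (LinkedSrv, SalesDb): the sub-query is listed first, and REMOTE asks for the join to be performed at the site of the right-hand table. REMOTE is only valid on an INNER JOIN and is intended for the case where the left input is the smaller one. If both tables live on the same linked server the hint may make no difference, so treat this as something to experiment with rather than a known fix.

```sql
-- LinkedSrv and SalesDb are hypothetical names; substitute your own.
SELECT o.CustomerId AS CustomerNoId, o.OrderValue, o.OrderDate
FROM (
    SELECT TOP (10) CustomerId
    FROM LinkedSrv.SalesDb.dbo.Customers
    WHERE Age < 60
    ORDER BY CustomerId          -- keeps the sub-query deterministic
) AS c                           -- small input goes on the left
INNER REMOTE JOIN LinkedSrv.SalesDb.dbo.Orders AS o   -- join at the site of Orders
    ON c.CustomerId = o.CustomerId;
```

Comparing the actual execution plan for this version against the original is one way to see whether the join is being evaluated remotely or locally.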

Rabid