Update: After some experimentation, it appears that the slow part is the call "SELECT min(column) FROM theTableValuedFunction()".

I'll call the local server ServerLocal and the remote server ServerForeign. I have a SQL query in a table valued function on ServerLocal that looks like this:

SELECT columns
FROM ServerForeign.database.tableA
JOIN ServerForeign.database.tableB ON columns
JOIN ServerForeign.database.tableC ON columns
JOIN ServerForeign.database.tableD ON columns
JOIN ServerForeign.database.tableE ON columns
JOIN ServerForeign.database.tableF ON columns
WHERE conditions

So here are my questions:

When I run this, does ServerLocal transmit the whole query to ServerForeign and ask it to do the join itself and get back the results in one operation, or does it grab entire tables, needlessly transmitting everything only to filter it down/join them on ServerLocal?

The "WHERE conditions" in the table valued function are very general; the various places that call the table valued function apply further WHERE conditions. Is there any efficiency to be gained by moving those WHERE clauses up into the table valued function code, or is there no difference?

Would there be significant benefit to putting the table valued function on ServerForeign and calling that from ServerLocal, or will I still incur the same slowness just transmitting the data? If I do move the table valued function to ServerForeign, and ServerLocal applies a WHERE clause at the point where it calls the table valued function, is SQL Server smart enough to transmit that WHERE clause from ServerLocal to ServerForeign and use it to reduce the number of rows transmitted, or does it mindlessly return all of them and drop rows that don't match?

For various external reasons I'd prefer NOT to have to put any code on ServerForeign. If there is a way to run all the SQL code on ServerLocal and still efficiently work with data from ServerForeign, that would be preferable.

+1  A: 

If you display the estimated execution plan (Ctrl+L in SQL Server Management Studio), you'll see exactly what it's submitting. The relevant steps are called "Remote Query", and if you hover over one, it shows you the remote query that's being sent.
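The same information can be read as text instead of hovering in the graphical plan. This is a minimal sketch, assuming the function lives in the dbo schema and that "someColumn" stands in for the real column; both names are placeholders, not part of the original question.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_TEXT ON;
GO

-- The statement is compiled but NOT executed; only its plan is returned.
-- dbo.theTableValuedFunction and someColumn are placeholder names.
SELECT MIN(someColumn)
FROM dbo.theTableValuedFunction();
GO

SET SHOWPLAN_TEXT OFF;
GO
```

In the output, look for the Remote Query operator; the query text listed with it is what gets sent to ServerForeign, so any WHERE conditions or aggregates that show up there are being evaluated remotely.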

rwmnau
The query plan just says "Remote Query: 100%"
Dennis
But if you hover over the query, it will give you the text it's sending to the remote server. In the case of your example, it would just be a copy of your query, minus the "ServerForeign" name, since all the tables are located remotely. If you were adding any filtering on this side, it would translate that filtering into the remote query when it passes it. Whenever possible, I believe SQL Server will ask for as small a result set as possible to keep network traffic to a minimum, so if it had the option to filter remotely and just get back results, it would choose to do that.
rwmnau
+1  A: 

I've found in circumstances like this, it can be quicker to cache the remote data locally in a temp table, especially if you can filter it using a where clause while pulling it over.
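A rough sketch of that approach, with the filter applied while the rows are pulled over. The database name RemoteDb, the dbo schema, the column names, and the filter value are all hypothetical. Note that temp tables cannot be used inside a function, so this would have to live in a batch or stored procedure rather than in the table valued function itself.

```sql
-- Pull only the rows and columns that are needed into a local temp table.
-- RemoteDb, dbo, id, and someColumn are placeholder names.
SELECT a.id, b.someColumn
INTO #remoteRows
FROM ServerForeign.RemoteDb.dbo.tableA AS a
JOIN ServerForeign.RemoteDb.dbo.tableB AS b ON b.a_id = a.id
WHERE b.someColumn >= 10;  -- filter while fetching, not afterwards

-- Later queries, including aggregates, now run against local data only.
SELECT MIN(someColumn) AS minValue
FROM #remoteRows;

DROP TABLE #remoteRows;
```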

ck
+1  A: 

I ran into this myself. Often the remote tables will be copied over to the local database to execute the query. Try removing any WHERE clause criteria on the remote databases and it should speed up.

Obviously that is not a solution, but most people recommend creating a stored procedure or view on the remote database and calling that instead.

So in your case if you create a view, ensure the WHERE clause is on the remote server, at least for the columns that are from the remote tables.
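As an illustration of the view approach, assuming a remote database called RemoteDb, the dbo schema, and made-up column names (none of these come from the question):

```sql
-- Run on ServerForeign, in the (hypothetical) RemoteDb database.
-- The joins and the general WHERE conditions live next to the data.
CREATE VIEW dbo.vJoinedTables
AS
SELECT a.id, b.someColumn
FROM dbo.tableA AS a
JOIN dbo.tableB AS b ON b.a_id = a.id
WHERE b.someColumn IS NOT NULL;
GO

-- Run on ServerLocal: query the view through its four-part name.
SELECT v.id, v.someColumn
FROM ServerForeign.RemoteDb.dbo.vJoinedTables AS v
WHERE v.id < 10;
```

Whether the extra local WHERE condition is also sent to ServerForeign is worth confirming in the execution plan rather than assuming.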

Ben Dempsey
I believe when you're aggregating (with MIN, for example), it has to build the whole result set and then aggregate it last. If you have other WHERE criteria on the query, SQL Server will attempt to filter the results on the linked server, which lowers the amount of data that's brought back over the network. If the criteria were a standard "WHERE TableB.ID < 10", then I think the query would be quite a bit faster, but SQL just can't do the aggregation without pulling back everything.
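If the plan confirms that the aggregate is being computed locally, one possible workaround that needs no code on ServerForeign is a pass-through query with OPENQUERY, which sends the query text to the linked server as written. This is a different technique from the table valued function in the question, shown only as a sketch; RemoteDb, dbo, and the column names are hypothetical.

```sql
-- The quoted text is executed on ServerForeign, so the join, the filter,
-- and the MIN all run remotely and only one row comes back.
-- RemoteDb, dbo, a_id, id, and someColumn are placeholder names.
SELECT q.minValue
FROM OPENQUERY(ServerForeign,
    'SELECT MIN(b.someColumn) AS minValue
     FROM RemoteDb.dbo.tableA AS a
     JOIN RemoteDb.dbo.tableB AS b ON b.a_id = a.id
     WHERE b.id < 10') AS q;
```

The trade-off is that OPENQUERY does not accept variables for its arguments, so the filter has to be part of the literal query string.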
rwmnau