views:

251

answers:

3

I have an import between 2 linked servers. I basically got to get the data from a multiple join into a table on my side.

The current query is something like this:

 select a.*
 from db1.dbo.tbl1 a
      inner join db1.dbo.tbl2 on ...
      inner join db1.dbo.tbl3 on ...
      inner join db1.dbo.tbl4 on ...
      inner join db2.dbo.myside on ...

db1 = linked server

db2 = my own database

After this one, I am using an insert into + select to add this data in my table which is located in db2. (usually few hundred records - this import running once a minute)

My question is related to performance. The tables on the linked server (tbl1, tbl2, tbl3, tbl4) are huge tables, with millions of records, and it is slowing down the import process. I was told that, if I do the join on the "other" side (db1 - linked server) for example in a stored procedure, than, even if the query looks the same, it would run faster. Is that right? This is kinda hard to test. Note that the join contains a table from my database too.

Also. are there other "tricks" I could use in order to make this run faster? Thanks

+1  A: 

It is correct to place a stored procedure on db1 to improve performance, this way less data has to go over the pipe, since a lot is filtered in the joins.

If the data on the other side is static, why not place it into a materialized view? This way you only have to update the data once a day instead of each time the query is executed.

Snake
+1  A: 

Stored procedures are cached, hence the first time you run the stored procedure it will take some time. All further calls to that stored procedure will execute a lot faster. You can see the performance impact by including the execution statistic in your SSMS.

To improve the join performance make sure you have indexes in place.

Notice that cross server inserts are dangerous since you are relying on the network. I'm also not sure whether you can use transactions in this scenario. If not, than it's another problem.

I have seen a scenario where temp database wasn't able to cope with the such an insert, and the fix was to use a cursor. This is was much slower, but more reliable for that scenario.

vikp
A: 

It really depends on what your query is actually doing. You can use the "remote" hint on the joins to force the join to happen on the linked server. i.e.:

select a.*  
 from db1.dbo.tbl1 a   
      inner remote join db1.dbo.tbl2 on ...  
      inner remote join db1.dbo.tbl3 on ...  
      inner remote join db1.dbo.tbl4 on ...  
      inner join db2.dbo.myside on ... 

(I assume you left the server out of the above and that all the "db1." references are really "linkedserver.db1".)

If you can do most of the work with just information on the linked server you can use OPENQUERY to speed things up. i.e.:

select a.*  
 from OPENQUERY(db1, 'SELECT a.* from db1.dbo.tbl1 a   
      inner join db1.dbo.tbl2 on ...  
      inner join db1.dbo.tbl3 on ...  
      inner join db1.dbo.tbl4 on ... ') a  
      inner join db2.dbo.myside on ... 

But the best way to speed it up may be having a control table on the linked server to control what to return, but again it depends on your actual query, what it is doing, and what permissions you have on the linked server.

Michael