ansaurus

Question

Cross Database Unions

Answer 1

+3 A:

Yes, you definitaly should. There is no point in splitting the same table in different databases. If you have problems with hard disk space, think about partitioning your table.

Regarding your comment:

The performance cost isn't THAT huge, but a union performs a merge join, which brings a bit ov an overhead.

In addition to that, are you sure you are using UNION correctly? UNION will eliminate duplicate values. Maybe what you really want to do is UNION ALL?

Maximilian Mayerl 2009-12-01 13:21:42

Can you help me understand why the processing would be faster from one table rather than 3 tables that are in the UNION?

Raj More 2009-12-01 13:28:36

UNION ALL is indeed what I want. Thank you!

Raj More 2009-12-01 14:01:45

Answer 2

+3 A:

Consider federating your table using SQL Server's first class partitioning (as opposed to doing it yourself). If you're always select every data point, then maybe yeah, pulling off multiple disks is faster.

But why have multiple databases? You could always stack the three tables into one table, but have that one table implemented atop three drives RAIDed together. This is a more clear cut solution if what you're after is speed.

The federating only makes sense if you're ever selecting specific, adjacent parts of the set. But according to your OP, you're selecting everything, so that eliminates that benefit.

Mark Canlas 2009-12-01 13:45:18

I am doing partitioning in my cube where the reports run from.

Raj More 2009-12-01 14:03:53

If you're doing reporting, why do you mention that you're UNIONing the tables together? When are you ever querying for every single data point?

Mark Canlas 2009-12-01 14:11:07

I query everything from the fact tables in SQL Server and my SSAS cube holds all the aggregates required for reporting. I must be missing the thrust of your question, though. Please elaborate

Raj More 2009-12-01 15:09:29

Answer 3

+1 A:

A UNION essentially performs a select distinct so that it can remove duplicate records. This will (potentially) impair performance all by itself. You could rectify that by changing it to UNION ALL.

Beyond that, you should check the execution plan to see what kind of performance hit you're getting. I do know that SQL Server will use indexes even on tables in other databases, but what you're doing here still doesn't make a lot of sense. I'd suggest using table partitioning and keeping it all in one database if the normal usage scenario is to UNION all of the tables together, there is little if any good reason to separate them into different databases.

Donnie 2009-12-01 13:55:44

I will change this to a `UNION ALL`.

Raj More 2009-12-01 14:04:27

Answer 4

+2 A:

The cross-database query is (somewhat) slower than than in-database, If you need three separate tables, it is advisable to use different schemas inside the same db. If this happens to be one fact table, then it is best to load into a a single fact table and use partitioning if the size happens to be too big.

When it comes to ETL, the suggestion is to ETL into staging tables in a separate schema -- like ETL -- within same DB and load the fact table from there. Once done, you truncate staging tables.

The recommendations are from the Microsoft Project Real.

Also keep in mind that foreign key can not be used across databases.

Damir Sudarevic 2009-12-01 14:03:45

This is not a cross database JOIN - which I can understand would be slower. This is concatenating tables together. Since this is a warehouse, the ETL takes care of validating data so there are no FKs and no constraints.

Raj More 2009-12-01 15:10:48

+1 for Project real (wish I could +2)

Raj More 2009-12-01 15:11:21

ansaurus

tags:

views:

answers:

Cross Database Unions

related questions