tags:

views:

111

answers:

5

Hi,

If I have a datawarehouse erd - which has the relationsips etc

How will i know what type of join to use , inner, outer left, outer right, full outer,,

surely if the database has been created correctly they all would be inner joins ? bar data quality issues

A: 

No, they won't be all INNER JOINs.

This is only edependent on what you want to accomplish with your query. Every kind of JOIN has its use. I suggest you to read the manual of your database vendor to become familiar with them.

Maximilian Mayerl
+2  A: 

Each type of join does something different, so it all depends on the aim of the individual query.

An example of when you wouldn't want to use an INNER JOIN is if you have a User table containing a list of all your application's users and a Logon table that logs each time someone logs on. If you want to find all users that have not logged on, one way would be to use an OUTER JOIN. This does not mean you have an incorrect database or data quality issues at all.

AdaTheDev
A: 

You will probably still need to use outer joins between facts.

You should definitely be able to inner join to all dimensions though. Unfortunately, we can't do that with the data warehouse at work because we lack a record in most dimension tables to represent null records.

EDIT

Facts are numeric measures (eg. dollars in a financial transaction) while dimensions are descriptors of data (eg. client involved in financial transaction, date of birth of client). You're best off referencing a good data warehousing book. I recommend The Data Warehouse Toolkit by Ralph Kimball.

A fact table contains both facts and foreign keys to dimensions. A dimension table contains the dimension ID, textual descriptions (eg. client name) and possibly foreign keys to other dimensions (eg. date dimension ID for the date of birth field).

Sometimes, a dimension attribute may be logically null. This is not necessarily stored as null, and at my workplace, we use 0 to represent null dimensions. The one that comes to mind at the moment is a death verification method (eg. viewed death certificate) field that we have in our client dimension table. Of course, most of our clients aren't dead, so we populate this field with 0. Our death verification method dimension table, however, only stores actual death verification methods, so it does not have a record with 0 (not applicable) and we are therefore forced to use left joins to it.

lins314159
A: 

So the type of relationship between the tables has absolutely nothing to do with the type of joins to be used?

lins314159 can you explain the facts.. and dimensions... and why?/how? does your datawarehouse lack records that represent nulls in the dimension table??

mike
Mike, this should be posted as a comment rather than an answer.
lins314159
A: 

This is a resource I use whenever I need to refresh my knowledge about what each type of join does.

rosscj2533