Hi, I'm new to data warehousing. First, I want to point out that my copy of The Data Warehouse Toolkit is on its way to my mailbox (snail mail :P). But I'm already studying all this stuff with what I find on the net.

What I don't find on the net, however, is what to do when you seem to have more than one fact in a DW. In my case (insurance), I have refunds that occur on an irregular basis. One client can have none for 3 months and then ten in the same month. On the other hand, I have "subscription fees" (not sure what the correct English term is, but you get the point) that occur every month or every three months. These clearly look like two distinct facts to me.

Those two are loosely coupled by some dimensions, like the client or the "insurance product". Now, are these two different warehouses, on which I have to produce two different reports and then connect the reports outside of the DW? Or is there a way to design this to fit a single decent DW? Or should I combine these two facts into one? I would probably lose granularity on refunds then.

Some blogs I read say a DW always has one fact table. Others mention the step of designing the fact tables, in the plural, but give no clear indication of whether there is a link between them or whether they are just distinct components of the same DW project.

Does anyone know of references on that specific part of DW design?

A: 

Taking your questions backwards.

A data warehouse can have more than one fact table. However, you do want to minimize joins between fact tables. It's ok to duplicate fact information in different fact tables.

Of the objects you mentioned:

Refund is a fact. Timestamp is a dimension of the refund fact.

Subscription fee is a fact. Timestamp is a dimension of the subscription fee fact.

A refund can happen more than once. I'm guessing that each customer has one subscription fee. So it appears we have two fact tables so far: customer and customer refund.

If you knew that there could be at most 3 refunds (as an example), then you would eliminate the customer refund fact table and put 3 refund columns in the customer table.

You also mention insurance. A customer can have more than one policy. So we have a third fact table.

A data warehouse is usually designed using a star schema. The star schema is basically one fact table connected to one or more dimension tables. You'll probably have more than one star in a data warehouse, since we already defined 3 fact tables.
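As a rough illustration of two stars in one warehouse, here is a minimal sketch using Python's built-in `sqlite3` module. All table names, column names, and sample rows (`dim_client`, `dim_date`, `fact_refund`, `fact_subscription_fee`) are hypothetical and not taken from the question. Each fact table keeps its own grain and both share the same dimension tables. The report query aggregates each fact table separately and only then joins the results on the shared dimension, so the two fact tables are never joined row to row.

```python
import sqlite3


def build_two_star_warehouse():
    """Create an in-memory warehouse with two fact tables sharing dimensions.

    Every name and value below is a hypothetical example.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Shared dimension tables
        CREATE TABLE dim_client (client_key INTEGER PRIMARY KEY, client_name TEXT);
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);

        -- Star 1: one row per refund (irregular events)
        CREATE TABLE fact_refund (
            client_key INTEGER REFERENCES dim_client(client_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            refund_amount REAL
        );

        -- Star 2: one row per billed subscription fee (periodic events)
        CREATE TABLE fact_subscription_fee (
            client_key INTEGER REFERENCES dim_client(client_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            fee_amount REAL
        );

        INSERT INTO dim_client VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO dim_date VALUES (20240101, '2024-01'), (20240201, '2024-02');
        INSERT INTO fact_refund VALUES
            (1, 20240101, 40.0), (1, 20240101, 10.0), (1, 20240201, 25.0);
        INSERT INTO fact_subscription_fee VALUES
            (1, 20240101, 30.0), (1, 20240201, 30.0), (2, 20240101, 50.0);
    """)
    return conn


def totals_per_client(conn):
    """Return (client_name, total_fees, total_refunds) for every client.

    Each fact table is summed on its own first; the two summaries are then
    joined through dim_client, which avoids a fact-to-fact join.
    """
    return conn.execute("""
        SELECT c.client_name,
               COALESCE(f.total_fees, 0.0),
               COALESCE(r.total_refunds, 0.0)
        FROM dim_client AS c
        LEFT JOIN (SELECT client_key, SUM(fee_amount) AS total_fees
                   FROM fact_subscription_fee
                   GROUP BY client_key) AS f
               ON f.client_key = c.client_key
        LEFT JOIN (SELECT client_key, SUM(refund_amount) AS total_refunds
                   FROM fact_refund
                   GROUP BY client_key) AS r
               ON r.client_key = c.client_key
        ORDER BY c.client_key
    """).fetchall()


if __name__ == "__main__":
    for row in totals_per_client(build_two_star_warehouse()):
        print(row)
```

Joining the two fact tables directly on `client_key` would multiply rows (three refunds times two fees for the first client) and inflate both sums, which is one reason to aggregate first.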

Gilbert Le Blanc
+1  A: 

You can have as many fact tables as you like. In your example you may have something like this:

[schema diagram; image not available]

dimProduct lists several products -- subscription being one of those. dimTransactionType would list possible transactions (purchase, refund, recurring subscription fee, ...).
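Since the diagram itself is not available, the following is only a guess at what such a layout could look like, written with Python's built-in `sqlite3` module. `dimProduct` and `dimTransactionType` come from the description above; the fact table name `factTransaction`, every column name, and all sample rows are assumptions made for illustration.

```python
import sqlite3


def build_transaction_star():
    """Create one transaction fact table keyed by product and transaction type.

    factTransaction, the column names, and the rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dimProduct (
            ProductKey INTEGER PRIMARY KEY,
            ProductName TEXT
        );
        CREATE TABLE dimTransactionType (
            TransactionTypeKey INTEGER PRIMARY KEY,
            TransactionTypeName TEXT
        );
        -- One row per transaction, whatever its type
        CREATE TABLE factTransaction (
            ProductKey INTEGER REFERENCES dimProduct(ProductKey),
            TransactionTypeKey INTEGER REFERENCES dimTransactionType(TransactionTypeKey),
            Amount REAL
        );

        INSERT INTO dimProduct VALUES (1, 'subscription'), (2, 'home insurance');
        INSERT INTO dimTransactionType VALUES
            (1, 'purchase'), (2, 'refund'), (3, 'recurring subscription fee');
        INSERT INTO factTransaction VALUES
            (1, 3, 30.0), (1, 3, 30.0),
            (2, 1, 200.0), (2, 2, 40.0), (2, 2, 10.0);
    """)
    return conn


def amount_by_transaction_type(conn):
    """Sum Amount per transaction type, ordered by the type key."""
    return conn.execute("""
        SELECT t.TransactionTypeName, SUM(f.Amount)
        FROM factTransaction AS f
        JOIN dimTransactionType AS t
          ON t.TransactionTypeKey = f.TransactionTypeKey
        GROUP BY t.TransactionTypeKey, t.TransactionTypeName
        ORDER BY t.TransactionTypeKey
    """).fetchall()


if __name__ == "__main__":
    for row in amount_by_transaction_type(build_transaction_star()):
        print(row)
```

With this shape, refunds and fees live in the same fact table at transaction grain, and the transaction type dimension is what separates them in a report.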

Now suppose you are interested in simplified subscription reporting. You could add a factSubscription like this:

[schema diagram; image not available]

Damir Sudarevic