I have a Kimball-style DW (facts and dimensions in star models - no late-arriving fact rows or columns, no columns changing in dimensions except expiry as part of Type 2 slowly changing dimensions) with heavy daily processing to insert and update rows (on new dates) and monthly and daily reporting processes. The fact tables are partiti...
In my brand new data warehouse that is built (of course) from the OLTP database, I have dropped all the IDENTITY columns and changed them to INT columns.
What are the best practices regarding the following especially since the warehouse is denormalized:
Primary Key
-> this may now be a composite key because several tables have come t...
Can a fact table have no keys at all?
or
if it can, is it a good design? If a fact table does not have any dimensions, on what basis is it analyzed?
What if a fact table has primary key/s only and no foreign key/s?
...
Are there any cases where I can have a textual field such as a description in a fact table?
I currently have a fact table of meeting events (grain: row per meeting) with a number of dimensions such as date, client, location etc. I need to put the meeting subject in the fact table. Is this ok even though it is not a measure (I have not s...
I noticed that the fact tables used in a cube were actually views. In fact, they were templates of the fact tables (I noticed in the script that "where 1=2" was used for the fact views).
So, if the template is used, there won't be any data in the view at all (and I don't know if I can insert into the view because I don't have inse...
I'm creating a DW that will contain data on financial securities such as bonds and loans. These securities are associated with payment schedules. For example, a bond could pay quarterly, while a mortgage would usually pay monthly (sometimes biweekly). The payment schedule is created when the security is traded and, in the majority of case...
I am interested in learning more about data warehousing. I see terms like "dimension", "snowflake schema" and "star schema" thrown about. Where would one start in learning about this stuff? Are there good books or Internet resources?
ETL is in this space too, right?
...
I'm not sure "Metadata Management" is the right term....
Basically, I have a client who asked for recommendations on "Metadata Management" tools with regard to a data warehousing project they have. I'm guessing the term has to do with creating something like a data dictionary, but I have relatively little experience in this area and am...
I have implemented delta detection while loading data warehouse from transaction systems using an identity column or date-time column in source transaction tables. When data needs to be extracted next time, the maximum date-time value extracted last time is used in the filter of extraction query to identify new or changed records. This w...
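The high-water-mark extraction described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual implementation: it uses Python's `sqlite3` as a stand-in for the source system, and the table `source_txn`, its columns, and the helper `extract_delta` are all hypothetical names.

```python
import sqlite3

def extract_delta(conn, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark.

    Hypothetical helper: assumes modified_at is stored as ISO-8601 text,
    so plain string comparison orders timestamps correctly.
    """
    rows = conn.execute(
        "SELECT id, amount, modified_at FROM source_txn "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_watermark,),
    ).fetchall()
    # If nothing changed, keep the previous watermark.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_txn (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT)"
)
conn.executemany(
    "INSERT INTO source_txn VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-01-01 09:00:00"),
        (2, 20.0, "2024-01-02 09:00:00"),
        (3, 30.0, "2024-01-03 09:00:00"),
    ],
)

# First run: everything after the watermark saved by the previous load.
first_batch, watermark = extract_delta(conn, "2024-01-01 09:00:00")

# A source row changes later; the next run picks up only that row.
conn.execute(
    "UPDATE source_txn SET amount = 15.0, modified_at = '2024-01-04 09:00:00' WHERE id = 1"
)
second_batch, watermark2 = extract_delta(conn, watermark)
```

Note that the strict `>` comparison can miss rows that share the watermark timestamp but are committed after the extract ran, so a real load may need an overlap window or a monotonically increasing key instead.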
I have a fact table that has 17 keys. Normally I have been designating the primary key as all of my dimensional keys. MS SQL Server 2008 has a limitation of 16 columns in a primary key or unique constraint. Are there any workarounds?
...
In a data warehouse, are there disadvantages to creating clustered indexes on fact tables? (most of the time, it will be on the datetime column)
Would you answer yes or no "by default..."?
If I shouldn't create clustered indexes by default, then why? (I know the pros of clustered indexes, but what are some cons?)
References
http://...
Are there any alternatives to SQL Server Analysis Services on the Windows x64 platform? I'm vaguely curious, because I haven't heard of any (though admittedly I haven't looked very hard).
Basically, a product that allows multi-dimensional cubes and querying those to generate reports (though the generation and presentation of those repo...
I have the following lookup table in OLTP
CREATE TABLE TransactionState
(
TransactionStateId INT IDENTITY (1, 1) NOT NULL,
TransactionStateName VarChar (100)
)
When this comes into my OLAP, I change the structure as follows:
CREATE TABLE TransactionState
(
TransactionStateId INT NOT NULL, /* not an IDENTITY column i...
I am modifying a Type 2 dimension using the following (long) SQL statement:
INSERT INTO AtlasDataWarehouseReports.District
(
Col01,
Col02,
Col03,
Col04,
Col05,
Col06,
Col07,
Col08,
Col09,
Col10,
StartDateTime,
EndDateTime
)
SELECT
Col01,
Col02,
Col03,
Col04,
Col05,
...
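For readers unfamiliar with the pattern, a Type 2 change generally means closing the current row and inserting a new version. The sketch below is only an illustration of that general idea, run against SQLite through Python's `sqlite3`. The `district` table layout, the `OPEN_END` sentinel, and `apply_type2_change` are assumptions made for the example and are not taken from the statement above.

```python
import sqlite3

# Assumed convention: open-ended rows carry a far-future end date.
OPEN_END = "9999-12-31 00:00:00"

def apply_type2_change(conn, district_code, new_name, change_time):
    """Expire the current version of a district and insert the new one."""
    with conn:  # both statements commit together or roll back together
        conn.execute(
            "UPDATE district SET end_datetime = ? "
            "WHERE district_code = ? AND end_datetime = ?",
            (change_time, district_code, OPEN_END),
        )
        conn.execute(
            "INSERT INTO district (district_code, name, start_datetime, end_datetime) "
            "VALUES (?, ?, ?, ?)",
            (district_code, new_name, change_time, OPEN_END),
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE district ("
    "district_key INTEGER PRIMARY KEY, "  # surrogate key
    "district_code TEXT, name TEXT, "
    "start_datetime TEXT, end_datetime TEXT)"
)
conn.execute(
    "INSERT INTO district (district_code, name, start_datetime, end_datetime) "
    "VALUES ('D1', 'North', '2020-01-01 00:00:00', ?)",
    (OPEN_END,),
)

apply_type2_change(conn, "D1", "North-East", "2024-06-01 00:00:00")

history = conn.execute(
    "SELECT name, start_datetime, end_datetime FROM district "
    "WHERE district_code = 'D1' ORDER BY start_datetime"
).fetchall()
```

After the call, `history` holds two rows for `D1`: the expired version ending at the change time and the new open-ended version starting at it.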
I am creating a calendar table for my warehouse. I will use this as a foreign key for all the date fields.
The code shown below creates the table and populates it. I was able to figure out how to find Memorial Day (last Monday of May) and Labor Day (first Monday of September).
SET NOCOUNT ON
DROP Table dbo.Calendar
GO
Create Table d...
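The two holiday rules mentioned above (last Monday of May, first Monday of September) can be checked independently of the calendar script with a small sketch like this one. It is written in Python rather than T-SQL so that it is easy to run. The function names are made up for the example.

```python
from datetime import date, timedelta

MONDAY = 0  # date.weekday() numbers Monday as 0 and Sunday as 6

def first_weekday_of_month(year, month, weekday):
    """First date in the month that falls on the given weekday."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset)

def last_weekday_of_month(year, month, weekday):
    """Last date in the month that falls on the given weekday."""
    # Step to the first day of the following month, then back one day.
    if month == 12:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, month + 1, 1)
    last = next_first - timedelta(days=1)
    offset = (last.weekday() - weekday) % 7
    return last - timedelta(days=offset)

memorial_day_2024 = last_weekday_of_month(2024, 5, MONDAY)   # 2024-05-27
labor_day_2024 = first_weekday_of_month(2024, 9, MONDAY)     # 2024-09-02
```

The same modulo-7 offset arithmetic carries over to other "nth weekday of month" holidays by adding whole weeks to the first match.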
For my data warehouse, I am creating a calendar table as follows:
SET NOCOUNT ON
DROP Table dbo.Calendar
GO
Create Table dbo.Calendar
(
CalendarId Integer NOT NULL,
DateValue Date NOT NULL,
DayNumberOfWeek Integer NOT NULL,
NameOfDay VarChar (10) NOT NULL,
NameOfMonth VarChar (10) NOT NULL,
WeekOfY...
To preface this, I'm not familiar with OLAP at all, so if the terminology is off, feel free to offer corrections.
I'm reading about OLAP and it seems to be all about trading space for speed, wherein you precalculate (or calculate on demand) and store aggregations about your data, keyed off by certain dimensions. I understand how this wo...
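To make the "trade space for speed" idea concrete, here is a toy sketch that precomputes a sum for every combination of dimension values, so later lookups are dictionary reads rather than scans. It is an illustration only and not how any particular OLAP engine stores its aggregations. The `sales` rows and `build_cube` are invented for the example.

```python
from collections import defaultdict
from itertools import combinations

def build_cube(rows, dimensions, measure):
    """Precompute the sum of `measure` for every subset of `dimensions`.

    Keys are tuples of (dimension, value) pairs; the empty tuple is the
    grand total. Storage grows with the number of dimension combinations,
    which is the space cost paid for fast lookups.
    """
    cube = defaultdict(float)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)

sales = [
    {"region": "East", "year": 2023, "amount": 100.0},
    {"region": "East", "year": 2024, "amount": 150.0},
    {"region": "West", "year": 2024, "amount": 50.0},
]

cube = build_cube(sales, ("region", "year"), "amount")

total = cube[()]                                          # 300.0
east = cube[(("region", "East"),)]                        # 250.0
east_2024 = cube[(("region", "East"), ("year", 2024))]    # 150.0
```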
As my Rails app matures, it's becoming increasingly apparent that it has a strong data warehouse flavour, lacking only a facts table to make everything explicit.
On top of that, I just read Chapters 2 (Designing Beautiful APIs) and 3 (Mastering the Dynamic Toolkit) of Ruby Best Practices.
Now I'm trying to figure out how best to design...
I work at a large university and much of my department's backup requirements are provided by central network services. However, many of the users have collections of large files such as medical imaging scans, which exceed the central storage available to them.
I am seeking to provide an improved backup solution for departmental resource...
Mulling over something I've been reading up on.
According to Chris Webb,
A linked measure group can only be
used with dimensions from the same
database as the source measure group.
So I took this to mean as long as two cubes share a database, a linked measure group can be used with a dimension. So I created a new cube and ad...