data-warehouse

best way to statistically detect anomalies in data

Hi, our webapp collects a huge amount of data about user actions, network business, database load, etc. All data is stored in warehouses, and we have quite a lot of interesting views on this data. If something odd happens, chances are it shows up somewhere in the data. However, to manually detect if something out of the ordinary ...

Fact Constellation Schema

Hi all, I made a fact constellation schema with 2 fact tables and 16 dimension tables, 4 of which are common dimension tables. One of the dimension tables needs to be normalized because data from the data source can have a variable number of rows. Can I still call it a fact constellation schema if a dimension table has a branch? I hope you understand...

TFS Datawarehouse problems

Our install of TFS was done by an outside contractor and was not done well, and I am not the admin on the project, so I am limited in what I can do. However, I have found some odd problems that I don't understand. First of all, when I connect to the TFS analysis server I see the TFSdDataWarehouse but only see the main Cube, none of the pe...

Object Database, business intelligence and warehousing

Sorry if this seems like a novice question, but I am new to the data warehousing and business intelligence world. From what I have read, I can see that a multidimensional database is needed due to limitations of the relational model. Anything that you need to do with a multidimensional database can be done on an ordinary relational database...

Non-relational database modeling tool?

Hey guys, please recommend some tools you have used successfully for DW, data mart, BI and non-relational modeling, for example for automatic creation of snowflake schemas, dimensions and fact tables. Which tools feel familiar in how they handle diagrams and surrogate keys, and have the option to export or connect to SQL Server 200...

multiple db connections vs. centralized/redundant db

I have a project to create a dashboard that will connect to existing systems as well as create new features based on combining data from the existing systems. For example, the dashboard will be able to generate "orders" containing data merged from "members" (MS Access DB), "employees" (MySQL DB) and "products" (flat file), and there wil...

Matching only one specific row in a JOIN where many exist

(Advantage Database Server) I have a table of service providers that, for auditing purposes, are never deleted. They have a start date and end date; in the case of changes like name or address, the existing row is end dated, a new row is created, and a new start date is assigned for the changed data. During processing of payments to tho...

Help with database schema for 50+GB DB

Hi all, I have a task to store a large amount of GPS data and some extra info in a database and to access it for reporting and some other infrequent tasks. When I receive a message from a GPS device, it can have a variable number of fields. For example: Message 1: DeviceId Lat Lon Speed Course DIO1 ADC1 Message 2: DeviceId Lat Course DIO2 ...

Free data warehousing systems--specifically, for data storage

I am building out some reporting for our website (a decent-sized site that gets several million pageviews a day), and am wondering if there are any good free/open source data warehousing systems out there. Specifically, I am only looking for something to store the data; I plan to build a custom front end/UI to it so that it shows ...

Tool To Read SQL query and aggregate data relationships & fields.

I'm looking for a tool that can read multiple SQL queries and aggregate the data dependencies and relationships. For example, if I have 20 queries that start with "Select * from employees", I'd expect to see the employees table weighted at the top. I imagine there has to be some data warehousing tool or other product out there to analy...

Oracle data warehouse design - fact table acting as a dimension?

THANKS: Both answers here are very helpful, but I could only pick one. I really appreciate the advice! Our data warehouse will be used more for workflow reports than traditional analytical reports. Our users care about the "current picture" far more than history (though history matters, too). We are a government entity that does not hav...

MDX performance vs. T-SQL

I have a database containing tables with more than 600 million records and a set of stored procedures that perform complex search operations on the database. The stored procedures are very slow, even with suitable indexes on the tables. The design of the database is a normal relational DB design. I want to change the database...

SSAS - Custom Total

I'm trying to create a cube in SSAS 2005 where the total is taken from an attribute on the fact and not calculated as an aggregate of the children. For example, assume I have 10 stores which all had 100 customers visit on a particular day. There was, however, some overlap where some customers visited multiple stores. The total visitors f...

Datawarehouse duplicate dimension rows

We're starting to load up a data warehouse with data from event logs. We have a normal star schema where a row in the fact table represents one event. Our dimension tables are a typical combination of user_agent, ip, referal, page, etc. One dimension table looks like this: create table referal_dim( id integer, domain varchar(255),...

How to improve performance of non-scalar aggregations on denormalized tables

Suppose we have a denormalized table with about 80 columns that grows at a rate of ~10 million rows (about 5GB) per month. We currently have 3 1/2 years of data (~400M rows, ~200GB). We create a clustered index to best suit retrieving data from the table on the following columns that serve as our primary key... [FileDate] ASC, ...

User-defined ranking / analytic functions in SQL Server 2008

I'm planning a data warehouse migration to SQL Server 2008, and trying to think of ways to replicate the LAG, LEAD, FIRST_VALUE and LAST_VALUE analytic functions from Oracle in SQL Server 2008. They are not included in SQL Server 2008, although the basic machinery for windowed analytic functions is (e.g. ROW_NUMBER, RANK and DENSE_RANK a...

database design suggestion needed

I need to design a table for daily sales of pharmaceutical products. There are hundreds of types of products available {Name, code}. Thousands of sales-persons are employed to sell those products{name, code}. They collect products from different depots{name, code}. They work in different Areas -> Zones -> Markets -> Outlets, etc. {Al...

design of fact table(s) for data warehouse

How would you model this in a data warehouse? There are municipalities, which are geographical areas that exist in geographical hierarchies, such as province (i.e. state, e.g. Minnesota) and region (e.g. Midwest). A performance evaluation is done on these municipalities by calculating performance indicators such as "% of housing backlog...

Is it possible to partially refresh a materialized view in Oracle?

I have a very complex Oracle view based on other materialized views, regular views as well as some tables (I can't "fast refresh" it). Most of the time, existing records in this view are based on a date and are "stable", with new record sets having new dates. Occasionally, I receive back-dates. I know what those are and how to deal wit...

What is a multidimensional OLAP cube? Give an example cube with more than 3 dimensions

Hi, as I am new to SSAS, I have been reading an article on multidimensional OLAP cubes and am struggling to understand cube concepts. It says that although the term "cube" suggests three dimensions, a cube can have up to 64 dimensions. Could you please explain how this is possible for a cube (other than the 3-D example with x, y, z planes)? Please...