views:

115

answers:

8

I was asked by a customer what the term "data warehouse" really means.

I thought about ETL, details of the data model, differences to NoSQL, Clouds, 'normal' DBMS, MDM (Master Data Management) etc. but wasn't able to describe the term in a few words to him... (In fact I did some talking and left him unillumined.)

How can "data warehouse" described in 1-3 (or a bit more) sentences?

+1  A: 

check this http://www.webopedia.com/TERM/d/data_warehouse.htm

Space Cracker
It's helpful if you provide a summary here in addition to the link
Michael Mrozek
+3  A: 

From wiki:

A data warehouse is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis 1.

This definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform and load data into the repository, and tools to manage and retrieve metadata.

Mitch Wheat
+2  A: 

A data warehouse is an attempt to make disparate systems appear to be homogeneous, regardless to the underlying technology or storage mechanism.

You could get into the 'why' of data warehousing, but thats a different question.

Serapth
+1  A: 

For non technical guys the best is to describe it as "Huge ammount of data stored in a specialized computer system. Data is usually related to some specific domain and whole system is designed to be fast and optimized for some special tasks. Data stored in data warehouses are mostly used for analysis or in decision making processes."

Not sure if this is enough:) There is a lot of references to this subject in the Internet but if somebody asked me for a quick definition I would use something similiar to that I wrote above.

Lukasz Dziedzia
+2  A: 

At least theoretically, the idea of a data warehouse is to provide a consolidated view of data from a variety of existing systems, which are generally considered impractical to rewrite to consolidate the data directly. Therefore, the data warehouse collects data from those existing systems, and provides (at least the illusion of) all the data being in one place, so it can be queried in one way.

The primary intent is (usually) to allow correlation between data from existing systems. For example you can compare how much time your sales person spent with customer X (that's stored in one system) to how much customer X bought (stored in a second system) and how happy customer X is with what they're receiving (stored, of course, in a third system).

From a practical viewpoint, it often means the customer's ideas are somewhere between poorly defined and completely insane. The cost and schedule are next to impossible to even guess at, and a solid estimate is clearly impossible. Delivering what he really wants is almost certainly impossible, and figuring out something that will be useful is going to take enough time and work, that your first step is to make what you're doing sound sufficiently technical that he won't have a heart attack when he gets an inkling of the cost and/or schedule.

Jerry Coffin
A: 

From a practical viewpoint: businesses change, environments change, what was an important question yesterday, may not be today and most probably won't be tomorrow. This is especially true when dealing with questions at the CEO level.

If you can't foresee what the questions will be, your only option is to provide the means to answer any question quickly. That's what data warehouses attempt or purport to do. Where the data comes from, and from how many disparate systems, is immaterial.

Many data warehouses fail in the "answer any question quickly" because their technology requires you to turn the available raw data "inside out" (making cubes) to ensure the "quickly". And defining those cubes restricts the variety of questions that can be answered.

Marjan Venema
+1  A: 

A data warehouse is a database, data load and reporting system designed to aggregate data from multiple sources and present it in a manner which is easy to extract and report on. From a practical standpoint, the benefits of a successful data warehouse project are:

  • Statistical and financial reporting - data warehouses make it easy to work with data in aggregate and get useful analysis from it, particularly where you have 65537 or more rows of data.

  • Data safety - the data is well behaved and doesn't have traps for young players. Ad-hoc reporting systems can be used by inexperienced users with a low risk of producing invalid results in the reports without noticing.

  • Transparency - the business can see and identify issues with data in the underlying systems. A data warehouse can be a good tool to drive data quality work.

  • Empowerment - end-user reporting tools should support the majority of management information requirements with only a minimal set requiring a bespoke report to be developed by a technical specialist.

ConcernedOfTunbridgeWells
+1  A: 

KISS...

A data warehouse is a repository of data related to a given organisation and its activities. This data will allow analysis and reporting of the performance of the organisation across various pertinent dimensions such as time, structure, streams of activity. These dimensions can be combined and results aggregated through relevant hierarchies.

bochgoch