etl

Combining two tables with SSIS into one destination table

Hi all, I am new to SSIS, so please bear with me. I created an Integration Services Project for SQL Server 2008 to import data from an old db to a new one. One of the things I need to do is import data from two old source tables into one new destination table. What is the best way to do this? I can easily see the results I want wit...

What books should be on the modern data warehouse architect's shelf?

I have a few on mine. Basically just the set of Kimball books: The Data Warehouse LifeCycle, The Data Warehouse ETL Toolkit, and The Data Warehouse Toolkit. Are there any others that should be included? Or any other good ones you know? -mcpeterson ...

Import data from an SSRS report via SSIS package

First, I ask that you not ask 'why.' In the famous words of Tennyson "Ours is not to reason why. Ours is but to do and die." It's one of those, "This is what you have, deal with it." situations. The source data comes from an SSRS report. The goal is to load the data into a database via SSIS. The hopeful goal is to avoid human interventi...

ETL, Esper or Drools?

Hello, The question environment relates to JavaEE and Spring. I am developing a system which can start and stop arbitrary TCP (or other) listeners for incoming messages. There could be a need to authenticate these messages. These messages need to be parsed and stored in some other entities. These entities model which fields they store. ...

How to extract data from Google Analytics and build a data warehouse (webhouse) from it?

I have click stream data such as referring URL, top landing pages, top exit pages and metrics such as page views, number of visits, bounces all in Google Analytics. There is no database yet where all this information might be stored. I am required to build a data warehouse from scratch (which I believe is known as web-house) from this dat...

Automated ETL / Database Migration Solution

I'm looking for an ETL solution that we can create and configure by hand and then deploy to run autonomously. This is basic transformation; it need not be feature heavy. Key points would be free or open-source software that could be tailored more to suit specific needs. In fact, this could be reduced to a simple DB migration tool that ...

Common Lisp condition system for transfer of control

I'll admit right up front that the following is a pretty terrible description of what I want to do. Apologies in advance. Please ask questions to help me explain. :-) I've written ETLs (Extract, Transform, Load) in other languages that consist of individual operations that look something like: // in class CountOperation IEnumerable<...

Mondrian Cache Flushing

Hi folks, in Mondrian, how can I flush the cache just after my ETL program ends? The ETL runs in an EJB on the same app server where the Mondrian WAR is running. Cheers. ...

How to figure out which record has been deleted in an efficient way?

Hi, I am working on an in-house ETL solution, from db1 (Oracle) to db2 (Sybase). We need to transfer data incrementally (Change Data Capture?) into db2. I have only read access to tables, so I can't create any table or trigger in Oracle db1. The challenge I am facing is, how to detect record deletion in Oracle? The solution which I ...
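With read-only access to the source, one generic technique is to pull the current primary keys from the source and compare them with the keys already loaded into the target: any key present in the target but missing from the source has been deleted. The sketch below only illustrates that set comparison and is not necessarily the solution the asker had in mind; `find_deleted_keys` and the sample key lists are hypothetical stand-ins for the results of two key-only queries.

```python
def find_deleted_keys(source_keys, target_keys):
    """Return keys that exist in the target but no longer exist in the source."""
    return set(target_keys) - set(source_keys)


if __name__ == "__main__":
    # Hypothetical key snapshots, e.g. fetched with SELECT id FROM ... on each side.
    source_keys = [1, 2, 4]        # keys currently in the source (db1)
    target_keys = [1, 2, 3, 4, 5]  # keys previously loaded into the target (db2)
    print(sorted(find_deleted_keys(source_keys, target_keys)))
```

The cost is a full scan of the key column on each run, so for very large tables the comparison is often done per key range or partition rather than in one pass.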

SSIS XML processing

For my job I do very big imports of (product) data. Recently we started using SSIS and it sure works better than custom .NET import tools. Still, after 3 projects we figured out it's more efficient to use a Script Task with C#, XPath, and SQL statements than to use an XML Source and Merge Joins in a data flow. Problems with a dataflow Somet...

Importing multiple file types into SSIS / mapping fields

I am working on a new data warehouse, trying to import files in a number of different formats from a number of different providers. The filenames may be the same each month (such as MonthlyReturns.xls/.csv) or follow a pattern (such as NorthWestSalesData20100101.csv). We can't ask the providers to change their naming convention. Do we have to crea...
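The naming schemes described above (a fixed name with varying extension, or a prefix followed by a date) can be captured in a small pattern table that maps each incoming filename to a layout. The sketch below is an illustration only, outside SSIS: the layout names and the `classify_file` helper are hypothetical, and the two patterns are derived solely from the example filenames in the question.

```python
import re

# Hypothetical mapping from filename pattern to the layout used to load the file.
FILE_PATTERNS = [
    (re.compile(r"^MonthlyReturns\.(xls|csv)$", re.IGNORECASE), "monthly_returns"),
    (re.compile(r"^NorthWestSalesData(\d{8})\.csv$", re.IGNORECASE), "north_west_sales"),
]


def classify_file(filename):
    """Return (layout, captured_groups) for a known filename, or None if unknown."""
    for pattern, layout in FILE_PATTERNS:
        match = pattern.match(filename)
        if match:
            return layout, match.groups()
    return None


if __name__ == "__main__":
    for name in ["MonthlyReturns.xls", "NorthWestSalesData20100101.csv", "Other.txt"]:
        print(name, "->", classify_file(name))
```

Keeping the patterns in one table means a new provider is added by appending a row rather than by writing a new branch of logic.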

Discovering files in the file system through SSIS

I have a folder where files are going to be dropped for importing into my data warehouse: \\server\share\loading_area. I have the following (inherited) code that uses xp_cmdshell (shivers) to call out to the command shell to run the DIR command and insert the resulting filenames into a table in SQL Server. I would like to 'go native' an...

Which is better, ETL or ELT?

Having spent some time working on data warehousing, I have created both ETL (extract transform load) and ELT (extract load transform) processes. It seems that ELT is a newer approach to populating data warehouses that can more easily take advantage of cluster computing resources. I would like to hear what other people think the advantage...

Oracle-->SQL - forced conversion from non-unicode to unicode?

I have an ETL that is importing tables from Oracle to SQL 2008 using the OLEDB FastLoad. The data in Oracle is non-unicode. When the table is created in SQL it is created with unicode datatypes. For some reason the datatypes are being forced from non-unicode to unicode. Do any of you know of a way to stop this from happening? Possibly a ...

What are all recommended books for ETL?

I would like to know the famous & recommended books for ETL. I am searching for books which cover patterns, architecture, and handling huge volumes of data in ETL. Could you recommend notable books? ...

SQL Server: unique key for batch loads

Hi, I am working on a data warehousing project where several systems are loading data into a staging area for subsequent processing. Each table has a "loadId" column which is a foreign key against the "loads" table, which contains information such as the time of the load, the user account, etc. Currently, the source system calls a sto...

Access Redis from relational databases

Is there any way to access Redis data from relational databases, such as Oracle or SQL Server? One use case I have in mind is ETL to a data warehouse. ...

ETL Source Target Mapping

Folks, I am looking for a free/open-source alternative for something like IBM FastTrack. Any advice? ...

PostgreSQL replication + scrubbing

Is there an easy way (built-in, add-on, open-source or commercial) to do replication on PostgreSQL (master-slave) and have the data inside the slave be scrubbed for PCI compliance while being replicated across? How about ETL tools? It does not have to be instantaneous ... up to an hour lag is acceptable but the faster the better of cour...

Using REST webservices for ETL / Datawarehousing

Hi, all. Has anyone used a REST-based approach for ETL / Datawarehousing operations? In other words, invoking ETL and OLAP / Database refresh jobs through REST webservices calls: e.g. PUT http://company.com/cube/123523 (to refresh a specific OLAP cube with new data) or POST http://company.com/view/patients/123123 (to create a new da...