etl

Recording MySQL DELETE statements

We have a MySQL->Oracle ETL using Informatica that works great for all statements except DELETE. Unfortunately, the DELETE makes the record go away such that Informatica never sees it again to remove/expire it in Oracle. How have people gone about recording MySQL DELETE statements? The tables are InnoDB (ACID-compliant) with unique pr...

How to export text data from a SQL Server table?

I am trying to use the MS SQL Server 2005 Import/Export tool to export a table so I can import it into another database for archival. One of the columns is text so if I export as comma-delimited, when I try to import it into the archive table, it doesn't work correctly for rows with commas in that field. What options should I choose to e...

Data extraction with Excel

Every month I receive 100+ Excel spreadsheets, from which I take a fixed range and paste it into another spreadsheet to make a report. I'm trying to write a VBA script to iterate over my Excel files and copy the range into one spreadsheet, but I haven't been able to do it. Is there an easy way to do this? ...

ETL tools... what do they do exactly? In layman's terms please.

Hey guys, I have recently been exposed to some ETL tools such as Talend and Apatar and I was wondering what exactly the purpose/main goal of these tools is, in layman's terms. Who primarily uses them and, if you use them, how are they (from my understanding) better than just writing some type of script? ...

Tracking what the MERGE command and its OUTPUT did

I am modifying a Type 2 dimension using the following (long) SQL statement: INSERT INTO AtlasDataWarehouseReports.District ( Col01, Col02, Col03, Col04, Col05, Col06, Col07, Col08, Col09, Col10, StartDateTime, EndDateTime ) SELECT Col01, Col02, Col03, Col04, Col05, ...

How do I enable logging in RhinoETL process?

I have nearly completed my first ETL process that uses Rhino ETL, and I've been able to figure out the way to use the API by referring to the tests. Great. I have data moving through the pipeline and being written to the db. However, I can't seem to figure out how to enable logging. The log4net assemblies are there, the log4net object...

Informatica PowerCenter vs custom Perl ETL job?

Most of my company uses Informatica PowerCenter for Extract-Transform-Load type data move jobs between databases. However, the project I am on has a big custom Perl job with some Java thrown in for good measure to move data and trigger some other updates. There is talk of rewriting the thing to use PowerCenter instead. What are people's e...

SSIS (missing) Pre-Build and Post-Build

For the warehouse work in progress, we have a single solution with multiple projects in it: OLTP Database Project, Warehouse Database Project, and SSIS ETL project. After the SSIS project is built, I want to move the binaries (XML, really) from the Bin folder to "C:\AutomatedTasks\ETL.Warehouse\" and "C:\AutomatedTasks\ETL". I cannot find...

ETL as a transaction

For all the ETLs I have written so far, I have never made them transactions - i.e. if table 4 fails, roll everything back. What is the best practice in this regard? To "BeginTran + Commit" or not to "BeginTran + Commit" EDIT: I have one master package calling 4 other packages - is it possible to roll them all up into one transaction? ...

How can I translate these sed and Perl one-liners to Informatica?

Duplicate: http://stackoverflow.com/questions/1259545/let-me-know-alternate-command-in-dos-for-following-sed-and-perl-commands-closed The following commands are specific to a Unix box. I need to implement them in Informatica (ETL tool) or, failing that, in any Windows solution: sed 's/^#//g' < kam_account_calls.txt > kam_account_...

MapForce vs. Talend Open Studio

We have been using Talend 3.1 for a few months now. However, we are looking at possibly switching to the latest MapForce, simply because it compiles to a .NET solution and we are otherwise a .NET house. That being said, Talend is extremely easy to use and extend. The Talend jobs make it very easy for new developers to understand the job a...

What are the required functionalities of ETL frameworks?

I am writing an ETL (in Python with a MongoDB backend) and was wondering: what kind of standard functions and tools should an ETL have to be called an ETL? This ETL will be as general purpose as possible, with a scriptable and modular approach. Mostly it will be used to keep different databases in sync, and to import/export datasets ...

Easiest way to import CSV into SQL Server 2005

I have several files, about 5k each, of CSV data I need to import into SQL Server 2005. This used to be simple with DTS. I tried to use SSIS previously and it seemed to be about 10x as much effort and I eventually gave up. What would be the simplest way to import the CSV data into SQL Server? Ideally, the tool or method would create th...

Problem regarding integration of various datasources

We have 4 datasources. 2 datasources are internal and we can connect directly to the database. For the 3rd datasource we get a flat file (.csv) and have to pull in the data. The 4th datasource is external and we cannot access it directly. We need to pull data from all 4 datasources, run business rules on them and store them in our databas...

SQL Server 2005 loading data from an external server

I have a new project with the following setup and requirements: My client has an MSSQL 2005 server (A) in their office. Their vendor has an MSSQL 2005 server (B) in another part of the world, which contains real-time transactional data. My client wants to load the data from (B) to (A) on a daily basis during non-office hours. They have data...

Large scale ETL string lookups performance issues

I have an ETL process performance problem. I have a table with 4+ billion rows in it. Structure is: id bigint identity(1,1); raw_url varchar(2000) not null; md5hash char(32) not null; job_control_number int not null. Clustered unique index on the id and non clustered unique index on md5hash. SQL Server 2008 Enterprise. Page level compr...

Alternatives to SSIS

What is a good, stable and preferably free alternative to SQL Server Integration Services? I'm so tired of this buggy piece of software. ...

Advice on how to write robust data transfer processes?

I have a daily process that relies on flat files delivered to a "drop box" directory on the file system. This kicks off a load of the comma-delimited data (from an external company's Excel etc.) into a database by a piecemeal Perl/Bash application. This database is used by multiple applications as well as edited directly with some GUI tools. Some...

ETL Tool for transferring old Firebird Database to a new organized Firebird Database

After looking at a lot of questions, I found no real answer for this. I redesigned a database for our customer. With Microsoft Access I found a good tool to get old table data into my new well-formed database structure. It is really easy but takes a lot of time (because the old data has to be handled with a lot of care). Are there any Open Source Tool...

Looking for ETL recommendations

We have a legacy app that needs to be rewritten, and due to its size, we can only do it bit by bit over several years. The idea is to keep developing new features for the new app while still providing access to unimplemented features via the legacy app. We're looking for recommendations for ETL tools that provide a GUI for the actual in...