etl

How do I keep a table synchronized with a query in SQL Server - ETL?

I wasn't sure how to word this question, so I'll try to explain. I have a third-party database on SQL Server 2005. I have another server, SQL Server 2008, to which I want to "publish" some of the data in the third-party database. I shall then use this database as the back-end for a portal and reporting services; it will be the data warehouse. ...

Move data from one database to another with a different data structure

How do I move data from, say, a MySQL database to a PostgreSQL database? Scenario: two similar applications. A user wants to switch from one application to the other, but he has maintained certain data in his previous application, which uses a MySQL database. When he switches applications, he has to move his data from his old application t...

Where is Pentaho Kettle's architecture?

Where can I find the Pentaho Kettle architecture? I'm looking for a short wiki, design document, blog post, anything that gives a good overview of how things work. This question is not about specific "how to" getting-started guides but rather about a good view of the technology and architecture. Specific questions I have are: How does data flow betwe...

Problem with Rhino-Etl and MySQL

I've been using Rhino-ETL for a little while and it's running pretty smoothly. However, I have a problem connecting to my MySQL DB. Rhino.Etl.Core.RhinoEtlException: Failed to execute operation Hornalen.Migration.Process.ReadMessagesFromDb: The type name 'MySql.Data.MySqlClient' could not be found for connection string: testConnectionStr...

Alternative to the Lookup task in SSIS

I am working on an SSIS solution for a data warehouse that extracts the surrogate keys corresponding to application keys. I am using the SSIS Lookup task, but the problem with this task is that it caches the complete lookup table in memory, and my lookup table is huge: 20 million records. So if you can suggest some ways or alternatives ...

ETL Tools and Build Tools

I am familiar with automated software build tools (such as Automated Build Studio). Now I am looking at ETL tools. One thing that crosses my mind is that anything I can do in an ETL tool I can also do with a software build tool. ETL tools are tailored for data loading and manipulation for which a lot of scripts are needed in order...

Best practices when migrating data from one database schema to another?

Often when I am working on a project, I find myself looking at the database schema and having to export the data to work with the new schema. Many times there has been a database where the stored data was fairly crude. What I mean by that is that it's stored with lots of unfiltered characters. I find myself writing custom PHP...

Are Pentaho ETL and Data Analyzer a good choice?

I was looking for an ETL tool and found a lot about Pentaho Kettle on Google. I also need a data analyzer to run on a star schema so that business users can play around and generate any kind of report or matrix. Again, Pentaho Analyzer looks good. The other part of the application will be developed in Java, and the application should be datab...

Is it possible to create a SQL Server native file from C# (like BCP native format)?

We are upgrading a 15-year-old code base, and there is a requirement to create some native BCP-formatted data files. In the new system, we would ideally like to use data in a C# DataTable object to create the data file in native BCP format. Can this be done and, if so, what would be the best approach? ...

How to visually design a mashup query for programmatic extraction

I'm developing an application that fetches various inputs from internet pages, where each information snippet comes from a different location (a mashup). I would like to generate the mashup building blocks (snippets) through a visual tool. Do you know of anything similar that can be used for such a project? (Already made control...

SSIS package failing with new flat file structure

Hi, my SSIS package just imports from a txt file into a SQL database. When we built the package we were using the old file, and it executed fine. The old source file has 10 columns; the new source file has 15 columns. Since the source file changed, the package has been failing. [Flat File Source [1]] Error: Data conversion failed. The data conversion for column "Co...

What is the best file parsing solution for converting files?

I am looking for the best solution for custom file parsing for our enterprise import routines. I basically want to change one file format into a standard file format and have one routine that imports that data into the database. I need to be able to create custom scripts for each client since it's difficult to get the customer to comply w...

SQL Server 2005 SSIS Checksum Package

Folks, we're building an ETL process to load a mid-size dimensional data warehouse using SQL Server 2005 SSIS on a 64-bit OS. We're planning to use the SSIS Checksum package to manage SCDs (Slowly Changing Dimensions). Even though we're doing a proof of concept using the SSIS Checksum package, I'm not comfortable using it in real production scen...
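
For readers unfamiliar with the idea behind checksum-based SCD handling, here is a minimal, generic sketch in Python. It is not the SSIS Checksum component and says nothing about how that component computes its values; the function names, the `customer_id` key, and the tracked columns are all hypothetical. It only illustrates the general technique: hash the tracked attributes of each incoming row and compare against the hash stored for the same business key.

```python
import hashlib


def row_hash(row, columns):
    """Hash the tracked columns of a row (hypothetical helper).

    A unit-separator character between values keeps ("ab", "c") and
    ("a", "bc") from producing the same input string. Note that None and
    the empty string hash identically in this simple sketch.
    """
    joined = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def classify(incoming, existing_hashes, key, columns):
    """Split incoming rows into new, changed, and unchanged.

    existing_hashes maps business key -> hash stored in the dimension.
    """
    result = {"new": [], "changed": [], "unchanged": []}
    for row in incoming:
        stored = existing_hashes.get(row[key])
        if stored is None:
            result["new"].append(row)
        elif stored != row_hash(row, columns):
            result["changed"].append(row)
        else:
            result["unchanged"].append(row)
    return result


# Usage with made-up data: key 1 changed city, key 2 is new, key 3 is unchanged.
tracked = ["name", "city"]
stored_hashes = {
    1: row_hash({"name": "Ann", "city": "Oslo"}, tracked),
    3: row_hash({"name": "Cy", "city": "Rome"}, tracked),
}
incoming_rows = [
    {"customer_id": 1, "name": "Ann", "city": "Bergen"},
    {"customer_id": 2, "name": "Bob", "city": "Paris"},
    {"customer_id": 3, "name": "Cy", "city": "Rome"},
]
outcome = classify(incoming_rows, stored_hashes, "customer_id", tracked)
```

A cryptographic hash such as SHA-256 is used here only because it makes accidental collisions vanishingly unlikely; whether that trade-off suits a given warehouse load is a separate question.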

Is a web service suitable for ETL purposes?

Hi, my company is considering using a web service as the means of running an ETL process. However, I don't think a web service fits this purpose, for several reasons: 1. A web service could consume a lot of memory when generating large XML. 2. XML is a bloated format. 3. It could time out if the server takes a huge amount of time to generate data. 4...

DTS vs. SSIS vs. Informatica vs. PL/SQL Scripting

In the past, I have used Informatica for some ETL (Extraction Transformation Loading) but found it rather slow and usually replaced it with some PL/SQL scripts (was using Oracle at the time). (questions revised based on feedback in answers) I gather that DTS was Microsoft's ETL tool prior to SSIS. Would it be difficult to convert an ...

For developers, is it worth it to learn/use SSIS?

I'm starting to get involved in quite a bit of ETL work at my current job, and everyone seems to be pretty partial to SSIS. I'm struggling to do the most trivial transformations through BI Studio, ones that would usually equate to a couple of foreach loops with a pinch of LINQ. I'm not sure of the use cases or users this tool would be usef...

ETL recommendation for migrating PK/FK values? Easy is better than powerful.

What I need from the tool is fairly simple, and this is a one-time effort, so maintainability isn't a priority. We are just going to take 14 separate MySQL database instances and need to combine the data into one database. There are some 200 tables involved, so we need something that can ease the pain of letting db rows get assigned a ...
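
One common way to think about this kind of merge, independent of any tool, is to keep a per-source map from old primary keys to newly assigned ones and to rewrite foreign keys through that map. The sketch below is a minimal illustration in plain Python; the `customers` and `orders` tables, their columns, and the function name are hypothetical and not taken from the question.

```python
from itertools import count


def merge_with_new_keys(sources):
    """Merge parent/child rows from several sources, assigning fresh keys.

    Each source is a dict with 'customers' rows (id, name) and 'orders'
    rows (id, customer_id, total). Both tables are hypothetical examples.
    """
    next_customer_id = count(1)
    next_order_id = count(1)
    merged_customers, merged_orders = [], []

    for source in sources:
        # Old primary key -> new primary key, scoped to this one source,
        # because the same old id can appear in every source database.
        customer_key_map = {}
        for old_id, name in source["customers"]:
            new_id = next(next_customer_id)
            customer_key_map[old_id] = new_id
            merged_customers.append((new_id, name))

        # Child rows get their own new key, and the foreign key is
        # rewritten through the map built above.
        for _old_id, old_customer_id, total in source["orders"]:
            merged_orders.append(
                (next(next_order_id), customer_key_map[old_customer_id], total)
            )

    return merged_customers, merged_orders


# Usage with two made-up source databases that both use id 1.
customers, orders = merge_with_new_keys([
    {"customers": [(1, "Ann"), (2, "Bob")], "orders": [(1, 2, 9.5)]},
    {"customers": [(1, "Cy")], "orders": [(1, 1, 3.0), (2, 1, 4.0)]},
])
```

With some 200 tables, parents have to be processed before their children so that every map exists by the time a foreign key needs it.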

Data in different resolutions

I have two tables, and records are continuously being inserted into them from an outside source. Let's say these tables keep statistics of user interactions. When a user clicks a button, the details of that click (the user, time of click, etc.) are written to one of the tables. When a user mouses over that button, a record is added ...

Designing a system for multiple input/multiple output

Background: I'm sketching out an application that needs to perform something like this:

    database a >--|                          |--> fileformat 1
    database b >--+--> custom application >--+--> fileformat 2
    ...           |                          |    ...
    database n >--|                          |--> fileformat n

The databases in questi...

Unit testing large blocks of code (mappings, translation, etc)

We unit test most of our business logic, but are stuck on how best to test some of our large service tasks and import/export routines. For example, consider the export of payroll data from one system to a 3rd party system. To export the data in the format the company needs, we need to hit ~40 tables, which creates a nightmare situation...