I have a scenario in which I receive files from a source and blindly dump their data into a number of tables in a staging-area database. Now I need to translate the data in those raw staging tables into a format that my primary database model understands, and eventually move the translated tables from the staging area into the primary database.

For example, I may have to join 3 of the raw staging tables and produce a final list of columns to set up a final primary table that is compatible with my primary DB. I may have many rules for translation; a join is just one of them.
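
For concreteness, here is a rough sketch of one such join translation, assuming hypothetical staging tables raw_customer, raw_address and raw_phone flattened into a single primary-format table (every table and column name here is invented for illustration):

    -- Hypothetical: three staging tables joined into one table
    -- shaped for the primary DB. SELECT ... INTO creates the table.
    SELECT  c.customer_id,
            c.full_name,
            a.street,
            a.city,
            p.phone_number
    INTO    primary_customer
    FROM    raw_customer AS c
    JOIN    raw_address  AS a ON a.customer_id = c.customer_id
    JOIN    raw_phone    AS p ON p.customer_id = c.customer_id;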

So my question is: what is the best way to do this? I am planning to have a rule table holding the source table, rule set, and destination table, plus a stored procedure that reads the rule table, constructs the queries dynamically, and executes them so that each query creates the final primary table in the format the rule table specifies.
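
A minimal sketch of what that rule table and the driving procedure might look like in T-SQL follows; the table layout, the {DEST} placeholder convention, and all names are assumptions for illustration, not a fixed design:

    -- Hypothetical rule table: one row per translation step.
    CREATE TABLE TranslationRule (
        RuleId      INT IDENTITY(1,1) PRIMARY KEY,
        SourceQuery NVARCHAR(MAX) NOT NULL, -- full SELECT with a {DEST} placeholder
        DestTable   NVARCHAR(128) NOT NULL, -- target table in primary format
        RunOrder    INT NOT NULL            -- later rules may build on earlier output
    );
    GO

    -- Driver: read each rule in order, build the query, execute it.
    CREATE PROCEDURE RunTranslationRules
    AS
    BEGIN
        DECLARE @src  NVARCHAR(MAX),
                @dest NVARCHAR(128),
                @sql  NVARCHAR(MAX);

        DECLARE rule_cursor CURSOR LOCAL FAST_FORWARD FOR
            SELECT SourceQuery, DestTable
            FROM   TranslationRule
            ORDER  BY RunOrder;

        OPEN rule_cursor;
        FETCH NEXT FROM rule_cursor INTO @src, @dest;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            -- Substitute the destination table into the stored query,
            -- then execute it; SELECT ... INTO creates the table.
            SET @sql = REPLACE(@src, N'{DEST}', QUOTENAME(@dest));
            EXEC sp_executesql @sql;

            FETCH NEXT FROM rule_cursor INTO @src, @dest;
        END

        CLOSE rule_cursor;
        DEALLOCATE rule_cursor;
    END

One caveat with this kind of design: the procedure executes whatever SQL is stored in the rule table, so write access to that table needs to be locked down as tightly as the procedure itself.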

I am looking for better ideas, or more design insight from the experts on the rule table, so that the translation can run seamlessly.

Edit: The idea is that I am going to reuse this DB design for many of our instances, so I intend to populate the rule table and run the procedure instead of building a separate ETL process for each instance.
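
Under that design, standing up a new instance would just mean inserting its rules and running the shared procedure, roughly like this (again, all names and the {DEST} placeholder are illustrative assumptions):

    -- Register one rule for this instance, then run the driver.
    -- {DEST} is replaced with DestTable by the procedure at run time.
    INSERT INTO TranslationRule (SourceQuery, DestTable, RunOrder)
    VALUES (N'SELECT c.customer_id, c.full_name, a.city, p.phone_number
              FROM raw_customer c
              JOIN raw_address a ON a.customer_id = c.customer_id
              JOIN raw_phone   p ON p.customer_id = c.customer_id',
            N'primary_customer', 1);

    EXEC RunTranslationRules;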

+1  A: 

If you have a good understanding of the scope of the possible transformations, you can probably get this to work.

I'm a fan of getting a few (possibly more difficult) examples under my belt first, before attempting to extract a framework, because a framework without good use cases can 1) have unused features, 2) be difficult to configure and document, and 3) spectacularly fail to meet future use cases.

That's not to say that a framework cannot be refactored, but refactoring a framework which you know already meets 80% of your use cases (because you understand the scope) is a lot easier than repeatedly refactoring one that meets only 10% of them into one that meets 20%, then 30%, and so on.

Cade Roux
:) Agreed! The whole model is designed to be that way. I don't have to know what's coming in. The data model I have designed in the primary DB is the one that follows the rules: normalised and considered a non-repetitive, stable one. But the files we receive need not follow the rules, and we don't have a lot of control over them. So basically, at the granular level, the ideology is to transform pathetic data into good data. The use cases will just keep evolving. The rule table that I'll have is my use-case table!
Baaju
Choosing the closest answer
Baaju