views:

63

answers:

3

Hello All

I hope you can help find an answer to a problem that will become a recurring theme at work. This involves denormalising data from RDBMS tables to flat file formats with repeating groups (sharing domain and meaning) across columns. Unfortunately this is unavoidable.

Here's a very simplified example of the transformation I'd require:

      TABLE A                               TABLE B
-------------------    1 -> MANY     ----------------------------
  A_KEY    FIELD_A                     B_KEY    A_KEY    FIELD_B
A_KEY_01 A_VALUE_01                  B_KEY_01 A_KEY_01 B_VALUE_01
A_KEY_02 A_VALUE_02                  B_KEY_02 A_KEY_01 B_VALUE_02
                                     B_KEY_03 A_KEY_02 B_VALUE_03

This will become:

A_KEY       FIELD_A      B_KEY1     FIELD_B1     B_KEY2     FIELD_B2
A_KEY_01   A_VALUE_01   B_KEY_01   B_VALUE_01   B_KEY_02   B_VALUE_02
A_KEY_02   A_VALUE_02   B_KEY_03   B_VALUE_03

Each entry from TABLE A will have one row in the output flat file with one column per related field from TABLE B. Columns in the output file can have empty values for fields obtained from TABLE B.

I realise this will create an extremely wide file, but this is a requirement. I've had a look at MapForce and Apatar, but I think this problem is too bizarre or I can't use them correctly.

My question: is there already a tool that will accomplish this or should I develop one from scratch (I don't want to reinvent the wheel)?

A: 

I'm pretty sure you can't solve this in plain SQL, but depending on your RDBMS, it may be possible to create a stored procedure or some such thing. Otherwise it's a fairly easy thing to do in a scripting language. Which technology are you using?

troelskn
For the RDBMS the solution has to be as technology agnostic as possible. I can code this up without coupling my code to a database vendor. It's starting to look like some solutions are almost there, but not all the way.
Anthony
In that case, you can use a generic scripting language, such as Python or similar. It's probably something like 20 lines of code.
troelskn
A: 

Does this help?

using-pivot-in-sql-server-2008

Steve Cooper
Thank you, pivot is close, but slightly different from what I need.
Anthony
A: 

Thanks for all your help. As it turns out the relationship is ONE -> MAX of 3 and this constraint will not change as the data is now static so the following run-of-the-mill SQL works:

select A.A_KEY, A.FIELD_A, B.B_KEY, B.FIELD_B, B2.B_KEY, B2.FIELD_B, B3.B_KEY,
B3.FIELD_B 

from 

 A left join B on (A.A_KEY = B.A_KEY)
left join B B2 on (A.A_KEY = B2.A_KEY and B2.B_KEY != B.B_KEY)
left join B B3 on (A.A_KEY = B3.A_KEY and B3.B_KEY != B.B_KEY 
                   and B3.B_KEY != B2.B_KEY)

group by A.A_KEY
order by A.A_KEY
Anthony

related questions