views:

95

answers:

3

I want to transfer data from a vertical db layout like this:

---------------------
| ID | Type | Value |
---------------------
| 1  | 10   | 111   |
---------------------
| 1  | 14   | 222   |
---------------------
| 2  | 10   | 333   |
---------------------
| 2  | 25   | 444   |
---------------------

to a horizontal one:

---------------------------------
| ID | Type10 | Type14 | Type25 |
---------------------------------
| 1  | 111    | 222    |        |
---------------------------------
| 2  | 333    |        | 444    |
---------------------------------

Creating the layout is not a problem but the database is rather large with millions of entries and queries get canceled if they take to much time.

How can this be done efficiently (so that the query is not canceled).

A: 

Break it up into smaller chunks and don't wrap the whole thing in a single transaction. First, create the table, and then do groups of inserts from the old table into the new table. Insert by range of ID, for example, in small enough chunks that it won't overwhelm the database's log and take too long.

Brent Ozar
A: 
with t as
(
select 1 as ID, 10 as type, 111 as Value from dual
union
select 1, 14, 222 from dual
union
select 2, 10, 333 from dual
union
select 2, 25, 444 from dual
)
select ID,
max(case when type = 10 then Value else null end) as Type10,
max(case when type = 14 then Value else null end) as Type14,
max(case when type = 25 then Value else null end) as Type25
from t
group by id

Returns what you want, and I think it is the better way. Note that the max function is just here to perform the group by clause, any group function can be use here (like sum, min...)

Scorpi0
Works great, thanks.
OliverS
A: 

The vertical table -- also known as the Entity-Attribute-Value anti-pattern -- always becomes a problem, sometimes very shortly after it is put into practice. If you haven't done so already, check out what Joe Celko has to say about this tactic, and you'll see even more proof of how troublesome this approach is. I'll stop there, since you're the smart person who knew to come to this site, and not the guilty but well-intentioned party who perpetrated the EAV table in your database.

The options for dealing with this type of table are not pretty, and, as you've stated, they get worse/slower as the amount of data needed for production queries grows.

  1. Build a declared global temporary table (DGTT) that is not logged and preserves committed rows, and use it to stage the horizontal version of the EAV table contents. DGTTs are good for this kind of data shoveling because they do not incur any logging overhead.

  2. Employ the traditional CASE and MAX() groupings as shown in the previous recommendation. The problem is that the query changes every time a new TYPE is introduced into your EAV table.

  3. Use DB2's SQL-XML publishing features to turn the vertical data into XML. Here's an example that works with the table and column names you provided:


WITH t(id, type, value) as (
VALUES (1,10,111), (1,14,222), (2,10,333), (2,25,444)
)

SELECT
XMLSERIALIZE( CONTENT
XMLELEMENT(NAME "outer",
 XMLATTRIBUTES(id AS "id"),
  XMLAGG(XMLELEMENT(NAME attr ,
   XMLATTRIBUTES(type as "typeid"), value) ORDER BY type)
) AS VARCHAR(1024)
) 
FROM t as t group by id;


The benefit of the SQL-XML approach is that any new values handled by the EAV table will not require a rewrite to the SQL that pivots the values.

Fred Sobotka