I have data in a mysql table in long / tall format (described below) and want to convert it to wide format. Can I do this using just sql?

Easiest to explain with an example. Suppose you have information on (country, key, value) for M countries, N keys (e.g. keys can be income, political leader, area, continent, etc.)

Long format has 3 columns: country, key, value
  - M*N rows.
  e.g. 
  'USA', 'President', 'Obama'
   ...
  'USA', 'Currency', 'Dollar'

Wide format has N=16 columns: country, key1, ..., keyN
  - M rows
example: 
   country, President, ... , Currency
   'USA', 'Obama', ... , 'Dollar'

Is there a way in SQL to create a new table with the data in the wide format?

select distinct key from table;

// this will get me all the keys.

1) How do I then create the table using these key elements?

2) How do I then fill in the table values?

I'm pretty sure I can do this with any scripting language (I like python), but wanted to know if there is an easy way to do this in mysql. Many statistical packages like R and STATA have this command built in because it is often used.

======

To be more clear, here is the desired input output for a simple case:

Input:

country  attrName   attrValue  key   (these are column names)
US       President  Obama      2
US       Currency   Dollar     3
China    President  Hu         4
China    Currency   Yuan       5

Output

country  President  Currency  newPkey
US       Obama      Dollar    1
China    Hu         Yuan      2
+1  A: 

If you were using SQL Server, this would be easy using PIVOT (UNPIVOT is the reverse operation, wide to long). As far as I am aware, PIVOT is not implemented in MySQL, so if you want to do this (and I'd advise against it) you'll probably have to generate the SQL dynamically, and that's messy.
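
For illustration, here is a minimal sketch of what "generating the SQL dynamically" could look like from a script. It is not MySQL-specific: it uses Python's built-in sqlite3 module as a stand-in so it runs anywhere, and the table name long_table, the column names (taken from the question's example) and the helper build_pivot_sql are all assumptions made for this example. In MySQL you would quote identifiers with backticks instead of double quotes.

```python
import sqlite3


def build_pivot_sql(conn, table="long_table"):
    """Build a pivot query with one output column per distinct attrName (hypothetical helper)."""
    names = [row[0] for row in conn.execute(
        f'SELECT DISTINCT attrName FROM "{table}" ORDER BY attrName')]
    cols = []
    for name in names:
        literal = name.replace("'", "''")   # escape for use as a SQL string literal
        ident = name.replace('"', '""')     # escape for use as a quoted identifier
        cols.append(
            f"MAX(CASE WHEN attrName = '{literal}' THEN attrValue END) AS \"{ident}\"")
    return (f'SELECT country, {", ".join(cols)} '
            f'FROM "{table}" GROUP BY country ORDER BY country')


# Sample data from the question, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE long_table (country TEXT, attrName TEXT, attrValue TEXT)")
conn.executemany("INSERT INTO long_table VALUES (?, ?, ?)", [
    ("US", "President", "Obama"), ("US", "Currency", "Dollar"),
    ("China", "President", "Hu"), ("China", "Currency", "Yuan"),
])

pivot_sql = build_pivot_sql(conn)
wide_rows = conn.execute(pivot_sql).fetchall()
print(pivot_sql)
print(wide_rows)
```

The generated columns come out in alphabetical order of attrName, so here each row is (country, Currency, President).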

Mark Byers
A: 

You can do this quite simply by doing :

1) CREATE TABLE new_table(field1 [type], field2 [type], ....);
2) INSERT INTO new_table(field1, field2) SELECT field1, field2 ... FROM old_table;

details at http://dev.mysql.com/doc/refman/5.0/en/create-table.html and http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-select-into-table.html

e4c5
I'm a little confused about this. Using the above example with just president and currency: 1) CREATE table new_table(country text, president text, currency text) 2) Suppose the old_table has three columns: country, keyname, value. What would the INSERT INTO statement be?
chongman
You will need to have the key field in all your tables, otherwise you will not be able to join them together. So your create table might look like: create table new_table(keyname char(20), country_text char(20), president_text char(50), currency_text char(20), primary key(keyname)); Then INSERT INTO new_table(field names here) SELECT keyname, country, president_name, currency_text from old_table; Please note that this isn't an optimum database design either. Things like country name are likely to repeat in many places in your tables, so you might want to put them in their own table as well.
e4c5
Thank you for your help and it was in fact my error not to have a primary key in the original table. I see what you are trying to do, but "president_name" and "currency_text" aren't fields in the old table. Old table just has PrimaryKey, Country, AttributeName, AttributeValue. So I can't do the select as you write it. But let me think about it some more and see if I can get some functional MYSQL code.
chongman
A: 

I think I found the solution, which uses VIEWS and INSERT INTO (as suggested by e4c5).

You have to get your list of AttrNames/Keys yourself, but MySQL does the rest of the heavy lifting.

For the simple test case above, create the new_table with the appropriate columns (don't forget to have an auto-increment primary key as well). Then

CREATE VIEW a
AS SELECT country, attrValue
FROM old_table
WHERE attrName='President';

CREATE VIEW b
AS SELECT country, attrValue
FROM old_table
WHERE attrName='Currency';


INSERT INTO new_table(country, President, Currency)
SELECT a.country, a.attrValue, b.attrValue
FROM  a
INNER JOIN b  ON a.country=b.country;

If you have more attrNames, then create one view for each one and then adjust the last statement accordingly.

INSERT INTO new_table(country, President, Currency, Capital, Population)
SELECT a.country, a.attrValue, b.attrValue, c.attrValue, d.attrValue
FROM  a
INNER JOIN b  ON a.country=b.country
INNER JOIN c  ON a.country=c.country
INNER JOIN d  ON a.country=d.country;
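
To try the view-and-join approach end to end without a MySQL server, here is a small sketch that runs the two-attribute case against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here, and the source table name old_table and the column types are assumptions; the statements themselves are plain SQL that should carry over to MySQL with an AUTO_INCREMENT key in place of AUTOINCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Long-format source table (name and types assumed for this example).
CREATE TABLE old_table (country TEXT, attrName TEXT, attrValue TEXT);
INSERT INTO old_table VALUES
  ('US', 'President', 'Obama'), ('US', 'Currency', 'Dollar'),
  ('China', 'President', 'Hu'), ('China', 'Currency', 'Yuan');

-- Wide-format target table with its own auto-increment key.
CREATE TABLE new_table (
  newPkey   INTEGER PRIMARY KEY AUTOINCREMENT,
  country   TEXT,
  President TEXT,
  Currency  TEXT
);

-- One view per attribute, each filtering the long table.
CREATE VIEW a AS SELECT country, attrValue FROM old_table WHERE attrName = 'President';
CREATE VIEW b AS SELECT country, attrValue FROM old_table WHERE attrName = 'Currency';

-- Join the views on country and fill the wide table.
INSERT INTO new_table (country, President, Currency)
SELECT a.country, a.attrValue, b.attrValue
FROM a INNER JOIN b ON a.country = b.country;
""")

result = conn.execute(
    "SELECT country, President, Currency FROM new_table ORDER BY country").fetchall()
print(result)
```

Note that with INNER JOIN a country that lacks any one of the attributes drops out of the result entirely.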

Some more tips

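  • use NATURAL LEFT JOIN and you don't have to specify the ON clause. Note that a natural join matches on every column name the two views share, so each view must alias attrValue to a distinct name (e.g. attrValue AS President); otherwise the join also compares the attrValue columns. The LEFT variant also keeps countries that are missing an attribute.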
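
A short sketch of the NATURAL LEFT JOIN variant, again using Python's sqlite3 as a stand-in for MySQL. The view names pres and curr, the table name old_table and the sample rows are made up for this example; China deliberately has no Currency row to show what the left join does with a missing attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE old_table (country TEXT, attrName TEXT, attrValue TEXT);
INSERT INTO old_table VALUES
  ('US', 'President', 'Obama'), ('US', 'Currency', 'Dollar'),
  ('China', 'President', 'Hu');

-- Each view renames attrValue, so 'country' is the only column the views share.
CREATE VIEW pres AS
  SELECT country, attrValue AS President FROM old_table WHERE attrName = 'President';
CREATE VIEW curr AS
  SELECT country, attrValue AS Currency FROM old_table WHERE attrName = 'Currency';
""")

# No ON clause: the natural join matches on the shared column name 'country'.
natural_rows = conn.execute(
    "SELECT country, President, Currency "
    "FROM pres NATURAL LEFT JOIN curr ORDER BY country"
).fetchall()
print(natural_rows)
```

China is kept with an empty (NULL) Currency instead of being dropped, which an INNER JOIN would do.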
chongman
+1  A: 

Cross-tabs (pivot tables) are the answer. From there you can INSERT INTO ... SELECT ... or create a VIEW from the single SELECT.

Something like:

SELECT country, 
       MAX( IF( `key`='President', value, NULL ) ) AS President,
       MAX( IF( `key`='Currency', value, NULL ) ) AS Currency,
       ...
FROM `table` 
GROUP BY country;

For more info: http://dev.mysql.com/tech-resources/articles/wizard/index.html
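
Here is a runnable sketch of the same cross-tab idea, materialized into a new table. It uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL, so IF(...) is written as the equivalent CASE WHEN ... END, which MySQL also accepts. The table names long_table and wide_table are assumptions, and the columns follow the question's clarified example (attrName, attrValue).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE long_table (country TEXT, attrName TEXT, attrValue TEXT);
INSERT INTO long_table VALUES
  ('US', 'President', 'Obama'), ('US', 'Currency', 'Dollar'),
  ('China', 'President', 'Hu'), ('China', 'Currency', 'Yuan');

-- One output row per country; each CASE picks out a single attribute and
-- MAX collapses the group to that value, ignoring the NULLs from other rows.
CREATE TABLE wide_table AS
SELECT country,
       MAX(CASE WHEN attrName = 'President' THEN attrValue END) AS President,
       MAX(CASE WHEN attrName = 'Currency'  THEN attrValue END) AS Currency
FROM long_table
GROUP BY country;
""")

pivoted = conn.execute(
    "SELECT country, President, Currency FROM wide_table ORDER BY country").fetchall()
print(pivoted)
```

Unlike the inner-join approach, a country with a missing attribute still gets a row here, with NULL in that column.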

mluebke
My way works. Your way is much better. I LOVE YOU or Thanks. Take your pick on which one you prefer as a way to express appreciation.
chongman