I have a poorly designed table that I inherited. It looks like:

    User  Field   Value
    -------------------
    1      name   Aaron
    1      email  [email protected]
    1      phone  800-555-4545
    2      name   Mike
    2      email  [email protected]
    2      phone  777-123-4567
    (etc, etc)

I would love to extract this data via a query in the more sensible format:

    User  Name   Email              Phone
    -------------------------------------------
    1     Aaron  [email protected]  800-555-4545
    2     Mike   [email protected]  777-123-4567

I'm a SQL novice, but have tried several queries with variations of Group By, all without anything even close to success.

Is there a SQL technique to make this easy?

A: 

I believe this will build the result set you're looking for. From there, you can create a view or use the data to populate a new table.

    select user, name, email, phone from
    (select user, value as name from table where field='name') n
    natural join
    (select user, value as email from table where field='email') e
    natural join
    (select user, value as phone from table where field='phone') p
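
As a rough sketch of the "create a view" suggestion, the snippet below wraps a natural-join query of this shape in a view, using Python's built-in sqlite3 module so it can be run as-is. The table name `user_values`, the view name `user_contacts`, and the example.com addresses are assumptions made for illustration; other databases may need different quoting for `user`.

```python
import sqlite3

def build_view_rows():
    """Create the EAV table, define a pivoting view, and return the view's rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; the answer's "table" is only a placeholder.
    conn.execute("CREATE TABLE user_values (user INTEGER, field TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO user_values VALUES (?, ?, ?)",
        [
            (1, "name", "Aaron"),
            (1, "email", "aaron@example.com"),   # made-up address
            (1, "phone", "800-555-4545"),
            (2, "name", "Mike"),
            (2, "email", "mike@example.com"),    # made-up address
            (2, "phone", "777-123-4567"),
        ],
    )
    # Each subquery exposes (user, <one attribute>); NATURAL JOIN matches on
    # the only shared column name, user.
    conn.execute("""
        CREATE VIEW user_contacts AS
        SELECT user, name, email, phone FROM
          (SELECT user, value AS name  FROM user_values WHERE field = 'name')  n
          NATURAL JOIN
          (SELECT user, value AS email FROM user_values WHERE field = 'email') e
          NATURAL JOIN
          (SELECT user, value AS phone FROM user_values WHERE field = 'phone') p
    """)
    rows = conn.execute("SELECT * FROM user_contacts ORDER BY user").fetchall()
    conn.close()
    return rows
```

Calling `build_view_rows()` returns one tuple per user, in the order (user, name, email, phone).
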
Dolph
+1  A: 

You can use a self join:

    SELECT User1.User, User1.Value as Name, User2.Value as Email,
      User3.Value as Phone
    FROM Users User1
    JOIN Users User2
      ON User2.User = User1.User
    JOIN Users User3
      ON User3.User = User1.User
    WHERE User1.Field = 'name' AND User2.Field = 'email' AND User3.Field = 'phone'
    ORDER BY User1.User

I tested this query, and it works.

Marcus Adams
A: 

In MySQL you can do something like this:

    SELECT
        id,
        group_concat(CASE WHEN field='name' THEN value ELSE NULL END) AS name,
        group_concat(CASE WHEN field='phone' THEN value ELSE NULL END) AS phone,
        ...
    FROM test
    GROUP BY id

The choice of aggregate function doesn't actually matter, as long as each id has only one row per field. You could also use min() or max() instead, with the same effect.
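
To illustrate that the aggregate is interchangeable when each id has one row per field, here is a small sketch using Python's built-in sqlite3 module (SQLite also provides group_concat, min and max). The `pivot` helper and the `ROWS` sample data are assumptions for the demo, not part of the MySQL query above.

```python
import sqlite3

# Sample (id, field, value) rows: one row per field for each id.
ROWS = [
    (1, "name", "Aaron"), (1, "phone", "800-555-4545"),
    (2, "name", "Mike"), (2, "phone", "777-123-4567"),
]

def pivot(aggregate, rows=ROWS):
    """Pivot the rows into (id, name, phone) using the named aggregate."""
    if aggregate not in ("group_concat", "min", "max"):
        raise ValueError("unsupported aggregate")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, field TEXT, value TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    # CASE without ELSE yields NULL, and all three aggregates skip NULLs.
    sql = f"""
        SELECT id,
               {aggregate}(CASE WHEN field = 'name'  THEN value END) AS name,
               {aggregate}(CASE WHEN field = 'phone' THEN value END) AS phone
        FROM test
        GROUP BY id
        ORDER BY id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

With this data, `pivot("group_concat")`, `pivot("min")` and `pivot("max")` all return the same rows. If an id had two rows for the same field, group_concat would join both values with a comma, while min and max would each pick a single value.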

Lukáš Lalinský
Note that, compared to the join solutions, this requires only a single pass over the whole data set, so it should be more efficient, and the performance shouldn't go down as you add more fields.
Lukáš Lalinský
+4  A: 

This is not a 'badly designed table' but in fact an Entity Attribute Value (EAV) table. Unfortunately, relational databases are poor platforms for implementing such tables, and doing so negates most of the nice things about an RDBMS. A common case of using the wrong shovel to nail in a screw.

But I think this would work (based on Marcus Adams' answer, which I didn't think would work (edit: now it does)):

    SELECT User1.User, User1.Value AS name, User2.Value AS email, User3.Value AS phone
    FROM Users User1
    LEFT JOIN Users User2
      ON User2.User = User1.User AND User2.Field='email'
    LEFT JOIN Users User3
      ON User3.User = User1.User AND User3.Field='phone'
    WHERE User1.Field = 'name'
    ORDER BY User1.User

Edit: got some niceties from other answers (LEFT JOINs, and the field names in the ON clauses). Now, does anybody know how to put the remaining WHERE a little higher (but not in the first JOIN's ON, that's too ugly)? Of course, it doesn't matter, since the query optimizer uglifies it back anyway.

Javier
@Javier, thanks, I had actually discovered my mistake and fixed it. I don't know what I was thinking with my first edit. Entity Attribute Value model is probably the most commonly used term.
Marcus Adams
+1 for "wrong shovel to hammer a screw"
Dolph
@Marcus: EAV table! that's the term i was looking for, thanks! @Dolph: it's a lovely image, but i can't take credit for it. First time i saw it was by Roberto Ierusalimschy, main author of the Lua language.
Javier
...and i got it wrong again. The quote is from Russell Keith-Magee, one of the main authors of Django.
Javier
A: 

A variant of Javier's answer, which has my vote.

    SELECT
      UserName.Value AS name, UserEmail.Value AS email, UserPhone.Value AS phone
    FROM
      Users            AS UserName
      INNER JOIN Users AS UserEmail ON UserName.User = UserEmail.User
        AND UserName.field = 'name' AND UserEmail.field = 'email'
      INNER JOIN Users AS UserPhone ON UserName.User = UserPhone.User
        AND UserPhone.field = 'phone'

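Use LEFT JOINs if not all attributes are guaranteed to exist; in that case, move the UserName.field = 'name' condition into a WHERE clause, because a condition on the left-hand table inside a LEFT JOIN's ON clause does not filter out its rows. A composite index over (User, Field) would probably be beneficial for this.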
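
A minimal sketch of the INNER versus LEFT JOIN difference, run through Python's built-in sqlite3 module. The data is an assumption for the demo: user 2 is given no phone row so that the two join types diverge, and the index name `idx_users_user_field` is hypothetical.

```python
import sqlite3

def compare_joins():
    """Return (inner_rows, left_rows) for a user who lacks a phone attribute."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE Users ("User" INTEGER, Field TEXT, Value TEXT)')
    # Composite index over (User, Field), as suggested above.
    conn.execute('CREATE INDEX idx_users_user_field ON Users ("User", Field)')
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?, ?)",
        [
            (1, "name", "Aaron"),
            (1, "phone", "800-555-4545"),
            (2, "name", "Mike"),  # assumption: no phone row for user 2
        ],
    )
    # The filter on the left-hand table lives in WHERE, so it applies the
    # same way to both join types.
    template = """
        SELECT UserName."User", UserName.Value AS name, UserPhone.Value AS phone
        FROM Users AS UserName
        {join} Users AS UserPhone
          ON UserPhone."User" = UserName."User" AND UserPhone.Field = 'phone'
        WHERE UserName.Field = 'name'
        ORDER BY UserName."User"
    """
    inner_rows = conn.execute(template.format(join="INNER JOIN")).fetchall()
    left_rows = conn.execute(template.format(join="LEFT JOIN")).fetchall()
    conn.close()
    return inner_rows, left_rows
```

Here the inner join returns only user 1, while the left join also returns user 2 with `None` (NULL) in the phone column.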

Tomalak
+1  A: 

At my work we are unfortunate to have a database design like this. But this kind of design works better for us than a traditional database design because of the different records we have to store, and it gives us the flexibility that we need. The database that we are using stores millions of records.

This would be a fast way to run the query on a large database using MSSQL. It avoids doing several self-joins in a single query, which could be very costly.

    DECLARE @Results TABLE
    (
        UserID INT
        , Name VARCHAR(50)
        , Email VARCHAR(50)
        , Phone VARCHAR(50)
    )

    -- Seed one row per user; the column list is required because only
    -- UserID is supplied. [User] is bracketed because USER is a reserved word.
    INSERT INTO @Results (UserID)
        SELECT DISTINCT [User] FROM UserValues

    -- Fill in each attribute with its own pass over UserValues
    UPDATE
        R
    SET
        R.Name = UV.Value
    FROM
        @Results R
    INNER JOIN
        UserValues UV
        ON UV.[User] = R.UserID
    WHERE
        UV.Field = 'name'

    UPDATE
        R
    SET
        R.Email = UV.Value
    FROM
        @Results R
    INNER JOIN
        UserValues UV
        ON UV.[User] = R.UserID
    WHERE
        UV.Field = 'email'

    UPDATE
        R
    SET
        R.Phone = UV.Value
    FROM
        @Results R
    INNER JOIN
        UserValues UV
        ON UV.[User] = R.UserID
    WHERE
        UV.Field = 'phone'

    SELECT * FROM @Results
Superdumbell
This is perhaps the more tedious way to do it, but it was very straightforward, let me see progress at intermediate steps throughout, and was not difficult for a beginner. Good solution!
abelenky
A: 

The typical solution to this is a pivot:

    SELECT
        [User]
        , MAX(CASE WHEN Field = 'name' THEN Value ELSE NULL END) AS Name
        , MAX(CASE WHEN Field = 'email' THEN Value ELSE NULL END) AS Email
        , MAX(CASE WHEN Field = 'phone' THEN Value ELSE NULL END) AS Phone
    FROM
        dbo.UserAttrib
    GROUP BY
        [User]
Mark Storey-Smith