views:

121

answers:

8

Say I have 2 tables:

Person

 - Id
 - Name

PersonAttribute

 - Id
 - PersonId
 - Name
 - Value

Further, let's say that each person had 2 attributes (say, gender and age). A sample record would be like this:

Person->Id = 1
Person->Name = 'John Doe'

PersonAttribute->Id = 1
PersonAttribute->PersonId = 1
PersonAttribute->Name = 'Gender'
PersonAttribute->Value = 'Male'

PersonAttribute->Id = 2
PersonAttribute->PersonId = 1
PersonAttribute->Name = 'Age'
PersonAttribute->Value = '30'

Question: how do I query this such that I get a result like this:

'John Doe', 'Male', '30'

A: 

You need to JOIN the two tables. Wikipedia provides a pretty good explanation of JOIN: http://en.wikipedia.org/wiki/Join_%28SQL%29

Evgeny
+1  A: 

Leaving the design aside, you can always PIVOT the result but you need to know how many attributes you are selecting out in advance.

Don
+3  A: 

I think you need redesign your schema. Why not?

Person

 - Id
 - Name
 - Gender
 - Birthday
...
msi77
+1 Totally agree. It looks like age is being stored in a PersonAttribute->Value string column along with things like 'Male' (and as you point out this should be derived anyway)
Martin Smith
I disagree. The OP has designed a generic attribute schema, perhaps so that users can define their own attributes and assign their own values. His example is a poor one, though.
Tim Drisdelle
StackOverflowNewbie
@stackoverflownewbie - Bad schema is that when you ask help instead of writing a simple query. :-)Your design is your matter, but you'll come across continuous issues.
msi77
@Tim Drisdelle - If you use relational data storage, you need follow the normalization rules, imho. Other question how you map relational model into OP entities.
msi77
@msi77 - normalization is not always the best solution. In this case, it appears that the system is designed to accommodate n number of key-value attribute pairs. This is not an uncommon requirement, and it cannot be represented in a traditional normalization form.
Tim Drisdelle
msi77 is spot-on, this database appears to be another EAV disaster in the making. @StackOverflowNewbie please Google and learn normalization. Or maybe just ask yourself how to write the the query when "John Doe" has 100 attributes, or a thousand.
Brock Adams
@Brock Adams - that's the point... if it's designed this way, it's probably to support a query for when "John Doe" has 100 attributes, all of which have user-generated names and values. Normalization won't work.
Tim Drisdelle
@Tim - What's your suggestion to the original question then?
Martin Smith
+4  A: 
SELECT p.name, p1.Value, p2.Value 
     FROM Person p, PersonAttribute p1, PersonAttribute p2 
     WHERE p.Id = p1.PersonId AND p.Id = p2.PersonId 
        AND p1.Name = 'Gender' AND p2.Name = 'Age'
Amarghosh
This does not scale to pull in N number of generic attributes and their values.
Tim Drisdelle
A: 
SELECT Name, g.Value, a.Value
FROM Person, 
PersonAttribute g INNER JOIN ON g.Name = "Gender", 
PersonAttribute a INNER JOIN ON a.Name = "Age"
mirza
+2  A: 

SELECT p.Name, g.Value, a.Value
FROM Person p INNER JOIN PersonAttribute g ON p.Id = g.Id AND g.Name = "Gender"
INNER JOIN PersonAttribute a ON p.Id = a.Id AND a.Name = "Age"

Chinjoo
+2  A: 

Storing Name Value pairs does give flexibility but is very cumbersome to query. Take a look at http://www.simple-talk.com/community/blogs/philfactor/archive/2008/05/29/56525.aspx

Raju
+1 I always think of that article when I see schemas like that.
Martin Smith
+1 it's a gnarly problem for sure. Conceptually it is fantastic, but implementation is a pain.
Tim Drisdelle
+1  A: 

There's no easy way to do this.

The concept of a pivot table (already mentioned by another answer) is basically what you are looking for, except that pivot tables require you to know the names of the columns you wish to use. Clearly this is a problem when you want to exploit the power of such a table design!

In my previous life, I just settled on X number of columns, like 20-30, and if they didn't exist, then the row set included a bunch of null values. No big deal.

select piv.name, 
    max(case piv.a_name when 'Gender' then piv.a_value else null end) as Gender,
    max(case piv.a_name when 'Age' then piv.a_value else null end) as Age,
    max(case piv.a_name when 'Hobby' then piv.a_value else null end) as Hobby
from 
(select p.name as name, pa.name as a_name, pa.value as a_value 
from person p, personattribute pa
where p.id = pa.personid) piv
group by piv.name

This would generate output like so:

   name    | gender | age |  hobby  
-----------+--------+-----+---------
 Bob Swift | Male   |     | Reading
 John Doe  | Male   | 30  | 
(2 rows)

Which is pretty damned close to what you are looking for. I would leave the rest of it up to your application-layer.

I also highly recommend that you include the attribute NAME as part of the return value, to provide context to the VALUEs.

These types of so-called Entity-Attribute designs often end up having to rely on a combination of server-specific functions, stored procedures, and hard-coded queries.

Tim Drisdelle