Say I have a table that lists items and their attributes like this:

frog    green
cat     furry
frog    nice
cat     4 legs
frog    4 legs

From the items column, I want to select the unique items that have both the green and the 4 legs attributes. In this case I would expect to get back just frog. What is the most efficient query to do this?

+1  A: 

select * from table where thing='frog'

Nothing beats knowing exactly what you want.

Otávio Décio
At first I thought about that, but I'm not sure the OP is asking "just that". ;) +1 anyway.
Martín Marconcini
ha ha ha. 100% correct and 100% wrong at the same time I suspect
gbn
A: 

Hard because it's not a normalised model. It's a weekend.

You are filtering across multiple, unconnected rows, so you'd have to extract each attribute in turn and then match items.

SELECT
   item
FROM
    (SELECT
        item
    FROM
        Mytable
    WHERE
        attribute = '4 legs') k1
    JOIN
    (SELECT
        item
    FROM
        Mytable
    WHERE
        attribute = 'green') k2 ON k1.item = k2.item
gbn
Why do you say the table not normalized?
Bill Karwin
First impressions... I should have said "harder to use for some kinds of queries"
gbn
KM hit the nail on the head too in comment on OP
gbn
A: 

Create two tables, one of items and one of attributes.
Items could have the columns name and intAttributeID, where intAttributeID is a foreign key reference to the Attributes table. That way you can write a select statement based on whatever you care about.

SomeMiscGuy
Think of this table as the intermediate table between a many-to-many relationship between items and attributes.
Jason Christa
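To make the many-to-many reading concrete, here is a minimal sketch run through Python's sqlite3 module. The table and column names (items, attributes, item_attributes, label) are made up for illustration and do not come from the OP's schema. Because the junction table's primary key rules out duplicate pairs, a plain COUNT(*) is safe in this layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items      (item_id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE attributes (attribute_id INTEGER PRIMARY KEY, label TEXT UNIQUE NOT NULL);
    -- Junction table: one row per (item, attribute) pair, duplicates rejected by the key.
    CREATE TABLE item_attributes (
        item_id      INTEGER NOT NULL REFERENCES items(item_id),
        attribute_id INTEGER NOT NULL REFERENCES attributes(attribute_id),
        PRIMARY KEY (item_id, attribute_id)
    );
""")

# Sample rows from the question.
pairs = [("frog", "green"), ("cat", "furry"), ("frog", "nice"),
         ("cat", "4 legs"), ("frog", "4 legs")]
for item, attr in pairs:
    conn.execute("INSERT OR IGNORE INTO items (name) VALUES (?)", (item,))
    conn.execute("INSERT OR IGNORE INTO attributes (label) VALUES (?)", (attr,))
    conn.execute("""
        INSERT INTO item_attributes (item_id, attribute_id)
        SELECT i.item_id, a.attribute_id
        FROM items i, attributes a
        WHERE i.name = ? AND a.label = ?""", (item, attr))

# Items that carry both requested attributes.
matches = [row[0] for row in conn.execute("""
    SELECT i.name
    FROM items i
    JOIN item_attributes ia ON ia.item_id = i.item_id
    JOIN attributes a ON a.attribute_id = ia.attribute_id
    WHERE a.label IN ('green', '4 legs')
    GROUP BY i.name
    HAVING COUNT(*) = 2""")]
print(matches)
```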
+3  A: 
select  item.name 
from    item 
where   item.attribute in ('4 legs', 'green') 
group by item.name 
having  count(distinct item.attribute) = 2
van
+1 for remembering to use DISTINCT. The OP didn't give the actual table definition, so we don't know if a given item/attribute pair must be unique.
Bill Karwin
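A small sketch of why DISTINCT matters, assuming the item table has no uniqueness constraint on (name, attribute). It uses SQLite through Python's sqlite3 module purely as a convenient way to run the query; the duplicated cat row is invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: no unique constraint, so the same pair can be stored twice.
conn.execute("CREATE TABLE item (name TEXT, attribute TEXT)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [
    ("frog", "green"), ("frog", "4 legs"),
    ("cat", "4 legs"), ("cat", "4 legs"),  # accidental duplicate row
])

template = """
    SELECT name FROM item
    WHERE attribute IN ('4 legs', 'green')
    GROUP BY name
    HAVING {counter} = 2
    ORDER BY name"""

# COUNT(*) counts rows, so the duplicated cat row reaches 2 and slips through.
with_star = [r[0] for r in conn.execute(template.format(counter="COUNT(*)"))]
# COUNT(DISTINCT attribute) counts different attributes, so only frog qualifies.
with_distinct = [r[0] for r in conn.execute(
    template.format(counter="COUNT(DISTINCT attribute)"))]

print(with_star)
print(with_distinct)
```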
+2  A: 

The most efficient way to do this is with a self-join:

SELECT * FROM attributes a1 
JOIN attributes a2 USING (item_name) -- e.g. frog
WHERE a1.value = 'green' AND a2.value = '4 legs';

Another solution that some people use is a trick with GROUP BY:

SELECT item_name FROM attributes
WHERE value IN ('4 legs', 'green')
GROUP BY item_name
HAVING COUNT(*) = 2;

But the GROUP BY solution may not be as efficient as a JOIN, depending on which brand of RDBMS you use. Also, one method may scale better than the other as the volume of data in your table grows.

Bill Karwin
Why would you say your first solution is more efficient?
Jason Christa
@Jason: My experience with MySQL is that the GROUP BY solution requires writing an interim result to a temporary table, which usually kills performance. The JOIN solution employs indexes much better. It may even use covering indexes, and never touch the table.
Bill Karwin
Good to know as I am using MySQL. I wonder what would happen if I was querying on 12 values.
Jason Christa
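For a longer list of values, the GROUP BY form generalizes without adding a join per value. This is only a sketch under assumptions: items_with_all is a hypothetical helper, it runs against SQLite via Python's sqlite3 rather than MySQL, and it borrows the attributes(item_name, value) names from the answer above. It says nothing about which form is faster on your data.

```python
import sqlite3

def items_with_all(conn, wanted):
    """Return item names that have every value in `wanted` (hypothetical helper)."""
    wanted = list(dict.fromkeys(wanted))          # drop repeated values, keep order
    placeholders = ", ".join("?" for _ in wanted)  # one bound parameter per value
    sql = f"""
        SELECT item_name FROM attributes
        WHERE value IN ({placeholders})
        GROUP BY item_name
        HAVING COUNT(DISTINCT value) = ?
        ORDER BY item_name"""
    return [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attributes (item_name TEXT, value TEXT)")
conn.executemany("INSERT INTO attributes VALUES (?, ?)", [
    ("frog", "green"), ("cat", "furry"), ("frog", "nice"),
    ("cat", "4 legs"), ("frog", "4 legs"),
])

print(items_with_all(conn, ["green", "4 legs"]))
print(items_with_all(conn, ["4 legs"]))
```

The same function works for 2 or 12 values; only the length of the list changes.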
You should make sure to analyze the optimization plan with the EXPLAIN statement, and learn how to read the output. Refer to docs http://dev.mysql.com/doc/refman/5.1/en/explain.html
Bill Karwin
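As a rough illustration of comparing plans for the two query shapes, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3. This is not MySQL's EXPLAIN: the output format and the optimizer's choices differ, so treat it only as a demonstration of the workflow. The index name idx_value_item is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attributes (item_name TEXT, value TEXT)")
# Hypothetical index covering both columns the queries touch.
conn.execute("CREATE INDEX idx_value_item ON attributes (value, item_name)")
conn.executemany("INSERT INTO attributes VALUES (?, ?)", [
    ("frog", "green"), ("cat", "furry"), ("frog", "nice"),
    ("cat", "4 legs"), ("frog", "4 legs"),
])

queries = {
    "self-join": """
        SELECT a1.item_name FROM attributes a1
        JOIN attributes a2 USING (item_name)
        WHERE a1.value = 'green' AND a2.value = '4 legs'""",
    "group-by": """
        SELECT item_name FROM attributes
        WHERE value IN ('4 legs', 'green')
        GROUP BY item_name
        HAVING COUNT(DISTINCT value) = 2""",
}

plans = {}
for label, sql in queries.items():
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable step description.
    plans[label] = [row[-1] for row in rows]
    print(label, plans[label])
```

The printed plan text varies between SQLite versions, so read it on your own installation rather than relying on any particular wording.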
+1  A: 
select
    item, count(*)
from
    @temp
where
    attribute in ('4 legs','green')
group by
    item
having
    count(*) = 2 -- this "2" needs to be replaced with however many attributes you have
jvenema
A: 

But maybe this can help you:

SELECT * 
FROM tbl t1
INNER JOIN tbl t2 ON t1.Name = t2.Name
WHERE t1.Attribute = 'green' AND t2.Attribute = '4 legs'
eKek0
+1  A: 

You could also query each attribute separately, and then intersect them...

/*
-- create sample table...
create table #temp1
    (item varchar(max),
    attrib varchar(max))

-- populate sample table (multi-row VALUES needs SQL Server 2008 or later)...
insert #temp1
values ('frog', 'green'), ('cat', 'furry'), ('frog', 'nice'), ('cat', '4 legs'), ('frog', '4 legs')
*/


SELECT  item
FROM    #temp1
WHERE   attrib = 'green'
INTERSECT
SELECT  item
FROM    #temp1
WHERE   attrib = '4 legs'
Scott Ivey
A: 

If possible, I would redesign. You will never be able to query this design effectively for 12 values at the same time (it will require 12 joins).

Please read this Wikipedia article: http://en.wikipedia.org/wiki/Entity-Attribute-Value_model#Downsides

I have never yet seen a database built on this model that didn't eventually run into serious performance issues. The design looks elegant to non-database people, but it is usually a sign of a badly designed database.

HLGEM