tags:

views:

92

answers:

2

I have two tables:

CREATE TABLE EAV (
subscriber_id INT(1) NOT NULL DEFAULT '0',
attribute_id CHAR(62) NOT NULL DEFAULT '',
attribute_value CHAR(62) NOT NULL DEFAULT '',
PRIMARY KEY (subscriber_id,attribute_id)
)

INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (1,'color','red')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (1,'size','xl')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (1,'garment','shirt')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (2,'color','red')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (2,'size','xl')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (2,'garment','pants')
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (3,'garment','pants')

CREATE TABLE CRITERIA (
attribute_id CHAR(62) NOT NULL DEFAULT '',
attribute_value CHAR(62) NOT NULL DEFAULT '' )

INSERT INTO CRITERIA (attribute_id, attribute_value) VALUES ('color', 'red')
INSERT INTO CRITERIA (attribute_id, attribute_value) VALUES ('size', 'xl')

To find all subscribers in the EAV that match my criteria, I use relational division:

SELECT DISTINCT(subscriber_id)
FROM EAV
WHERE subscriber_id IN
(SELECT E.subscriber_id FROM EAV AS E
JOIN CRITERIA AS CR ON E.attribute_id = CR.attribute_id AND E.attribute_value = CR.attribute_value
GROUP BY E.subscriber_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM CRITERIA))

This gives me an unique list of subscribers who have all the criteria. So that means I get back subscriber 1 and 2 since they are looking for the color red and size xl, and that's exactly my criteria.

But what if I want to extend this so that I also get subscriber 3 since this subscriber didn't specifically say what color or size they want (ie. there is no entry for attribute 'color' or 'size' in the EAV table for subscriber 3).

Given my current design, is there a way I can extend my query to include subscribers that have zero or more of the attributes defined, and if they do have the attribute defined, then it must match the criteria?

Or is there a better way to design the table to aid in querying?

A: 

It's even easier than your original query.

You basically want to add everyone that matches a criteria, but subtract everyone that has a criteria that doesn't match:

SELECT DISTINCT(E.subscriber_id) FROM EAV AS E
LEFT JOIN CRITERIA AS CR ON (E.attribute_id = CR.attribute_id AND E.attribute_value = E.attribute_value)
WHERE subscriber_id NOT IN (
SELECT subscriber_id FROM EAV AS E
JOIN CRITERIA AS CR ON (E.attribute_id = CR.attribute_id AND E.attribute_value <> CR.attribute_value)
)

To test, I added a few more records:

INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (4,'color','blue');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (4,'garment','shirt');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (5,'color','blue');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (5,'size','xl');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (5,'garment','shirt');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (6,'color','red');
INSERT INTO EAV (subscriber_id, attribute_id, attribute_value) VALUES (6,'garment','shirt');

The query returns:

1
2
3
6
Marcus Adams
A: 

SELECT DISTINCT(E.subscriber_id) FROM EAV AS E LEFT JOIN CRITERIA AS CR ON (E.attribute_id = CR.attribute_id AND E.attribute_value = E.attribute_value) WHERE subscriber_id NOT IN ( SELECT subscriber_id FROM EAV AS E JOIN CRITERIA AS CR ON (E.attribute_id = CR.attribute_id AND E.attribute_value <> CR.attribute_value) ) i think this can be simplified

pfg