views:

684

answers:

3

I'll describe what I am trying to achieve:

I am passing down to a SP an xml with name value pairs that I put into a table variable, let's say @nameValuePairs. I need to retrieve a list of IDs for expressions (a table) with those exact match of name-value pairs (attributes, another table) associated.

This is my schema:

Expressions table --> (expressionId, attributeId)

Attributes table --> (attributeId, attributeName, attributeValue)

After trying complicated stuff with dynamic SQL and evil cursors (which works but it's painfully slow) this is what I've got now:

--do the magic plz!

-- retrieve number of name-value pairs
SET @noOfAttributes = select count(*) from @nameValuePairs

select distinct
    e.expressionId, a.attributeName, a.attributeValue
into 
    #temp
from 
    expressions e
join
    attributes a
on
    e.attributeId = a.attributeId
join --> this join does the filtering
    @nameValuePairs nvp
on 
    a.attributeName = nvp.name and a.attributeValue = nvp.value
group by
    e.expressionId, a.attributeName, a.attributeValue

-- now select the IDs I need
-- since I did a select distinct above if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select distinct 
    expressionId
from 
    #temp
group by expressionId
having count(*) = @noOfAttributes

Can people please review and see if they can spot any problems? Is there a better way of doing this?

Any help appreciated!

+1  A: 

This is not a bad approach, depending on the sizes and indexes of the tables, including @nameValuePairs. If it these row counts are high or it otherwise becomes slow, you may do better to put @namValuePairs into a temp table instead, add appropriate indexes, and use a single query instead of two separate ones.

I do notice that you are putting columns into #temp that you are not using, would be faster to exclude them (though it would mean duplicate rows in #temp). Also, you second query has both a "distinct" and a "group by" on the same columns. You don't need both so I would drop the "distinct" (probably won't affect performance, because the optimizer already figured this out).

Finally, #temp would probably be faster with a clustered non-unique index on expressionid (I am assuming that this is SQL 2005). You could add it after the SELECT..INTO, but it is usually as fast or faster to add it before you load. This would require you to CREATE #temp first, add the clustered and then use INSERT..SELECT to load it instead.

I'll add an example of merging the queries in a mintue... Ok, here's one way to merge them into a single query (this should be 2000-compatible also):

-- retrieve number of name-value pairs
SET @noOfAttributes = select count(*) from @nameValuePairs

-- now select the IDs I need
-- since I did a select distinct above if the number of matches
-- for a given ID is the same as noOfAttributes then BINGO!
select
    expressionId
from 
    (
     select distinct
      e.expressionId, a.attributeName, a.attributeValue
     from 
      expressions e
     join
      attributes a
     on
      e.attributeId = a.attributeId
     join --> this join does the filtering
      @nameValuePairs nvp
     on 
      a.attributeName = nvp.name and a.attributeValue = nvp.value
    ) as Temp
group by expressionId
having count(*) = @noOfAttributes
RBarryYoung
@nameValuePairs has no more than 10 rows (worst case scenario - its usually around 5-6)
JohnIdol
also - on a fully populated db #temp has about 9k rows after the filtering
JohnIdol
I have a clustered index on expressions for expressionId (PK), attributeId (PK, FK), a clustered index on attributes for attributeId (PK) - I also added nonclustered index on expression id for (expressionId and attributeId) and another nonclustered index on all fields on the attributes table - as ussgested here --> http://stackoverflow.com/questions/916940/how-to-speed-up-simple-join/917447#917447
JohnIdol
how would I go about using a single query?
JohnIdol
Looking at it again I do notice a couple of things. I will edit my answer to add them. I will do the single-merged query in a few minutes...
RBarryYoung
what version of SQL server is this?
RBarryYoung
SQL2005 - about the #temp --> I need those fields 'cause I need to eliminate duplicates in roder to be able to rely on the @noOfAttributes at the end of the script
JohnIdol
This query wouldn't seem to handle the scenario of where an expression had more than the matching attributes. i.e. expression has attribute ids (1,2,3,4) and the required attributes were (2,3)
CAbbott
+1  A: 

One error I see is that you have no table with an alias of b, yet you are using: a.attributeId = b.attributeId.

Try fixing that and see if it works, unless I am missing something.

EDIT: I think you just fixed this in your edit, but is it supposed to be a.attributeId = e.attributeId?

ryanulit
was a typo! thanks
JohnIdol
+1  A: 

I belive that this would satisfy the requirement you're trying to meet. I'm not sure how much prettier it is, but it should work and wouldn't require a temp table:

SET @noOfAttributes = select count(*) from @nameValuePairs

SELECT e.expressionid
FROM expression e
LEFT JOIN (
           SELECT attributeid
           FROM attributes a
           JOIN @nameValuePairs nvp ON nvp.name = a.Name AND nvp.Value = a.value 
           ) t ON t.attributeid = e.attributeid
GROUP BY e.expressionid
HAVING SUM(CASE WHEN t.attributeid IS NULL THEN (@noOfAttributes + 1) ELSE 1 END) = @noOfAttributes

EDIT: After doing some more evaluation, I found an issue where certain expressions would be included that shouldn't have been. I've modified my query to take that in to account.

CAbbott
thanks for helping - your script is more compact but a bit less readable, you reckon this approach would improve performance?
JohnIdol
yeah, i realize that it's a funky approach, but it did keep it all in one query. performance wise, i'm thinking that not having to create a temp table would be a slight performance improvement. you can always profile them to see if there's a savings.
CAbbott
+1 for your helpful contribution
JohnIdol
Thanks. I enjoy sweeping out the mental cobwebs on these kinds of things.
CAbbott