views:

66

answers:

5

MY table looks like this

id | string | foreign_id
---------------------------
1  | house  | 5
2  | garden | 6
3  | window | 5
...

I have an array of strings and i want to get all foreign ids which match all elements in the array. So i have an array like this [house, window] then i want to get 5. The array of strings can have up to 10 elements.

How does the appropriate sql statement look like?

A: 
select string, foreign_id
from table
group by foreign_id
James Curran
+1  A: 

Assuming you want foreign_id that match both "house" and "window", use:

  SELECT t.foreign_id
    FROM YOUR_TABLE t
   WHERE t.string IN ('house', 'window')
GROUP BY t.foreign_id
  HAVING COUNT(DISTINCT t.string) = 2

The HAVING count must equal the number values defined in the IN clause.

The array of strings can have up to 10 elements.

That will require using dynamic SQL. I'd provide an example, but you didn't mention what database you're using...

OMG Ponies
This will match a foreign_id that has only a corresponding "house", but not "window".
Mark Peters
Looks better now.
Mark Peters
@Mark Peters: Thank you - I misread it earlier, sorry.
OMG Ponies
A: 

Generate the query dynamically, adding the noted line once for each array element:

Select distinct foreign_id
 from myTable
where 1=1
  --add the following line once for each array element
and foreign_id in (select foreign_id from myTable where string = 'value')

So given your example, the query would be

Select distinct foreign_id
 from myTable
where 1=1
and foreign_id in (select foreign_id from myTable where string = 'house')
and foreign_id in (select foreign_id from myTable where string = 'window')

This should return what you're looking for.

Any questions?

// the 1=1 is just so that the dynamic part is the same each time.. I would exclude the 1=1 and have the first entry skip the 'and'

Fosco
Valid, but overcomplicated
OMG Ponies
Thank you for your subjective analysis.
Fosco
+1  A: 

Try something like:

select foreign_id 
from your_table
where string in ('house', 'window')
group by foreign_id
having count(distinct string) = 2;

- assuming that whatever generates the query can count the number of distinct strings requested.

(Edited following comments)

Mark Bannister
`HAVING COUNT(*)` risks false positives. IE: A foreign_id of 5 with two records where the string value is "house" would be valid.
OMG Ponies
There's also false negatives as I said in my post. A foreign_id of 5 with two records where the string value is "house", plus one record where the string value is "window" would be invalid.
Mark Peters
Sorry, should have been having count(distinct string) = 2 - have now edited answer.
Mark Bannister
Thanks, this work fine for me. You realy helped me out!
Stephan
A: 

@Mark Bannister was close, but his will fail when there are duplicates of string and foreign_id. This works, but it's been years since I did anything useful with SQL so there's probably a better way.

SELECT foreign_id FROM (
   SELECT distinct string, foreign_id FROM your_table
      WHERE string in ('house', 'window')) as T  
GROUP BY T.foreign_id HAVING count(foreign_id) = 2;

Like Mark, I assume that you can dynamically populate the in clause as well as the RHS of the having clause. A stored procedure might help here.

Mark Peters
Mark Bannisters answer is better - it will use only one pass over the table. There's no need for the derived table/inline view... Also, the HAVING COUNT should on distinct values -- not the foreign_id column ;)
OMG Ponies
Well the having clause works fine **because** I use the sub-select. I see you've updated your answer, which looks right now.
Mark Peters
@Mark Peters: You're missing the point - there's no need to return the `string` column (besides that DISTINCT in a GROUP BY is useless) in the inner query. This all can be done without the derived table. And your COUNT is still referencing the wrong column...
OMG Ponies
No I'm not missing the point. I understand it could have been done without the nested select, but since you already refined it in your own answer I saw no need to do so here. I don't have a DISTINCT in a group by so I'm not sure what you're talking about there, and it doesn't matter whether count uses foreign_id or * there since it's a one-column view. You can't really say any SQL is "wrong" unless you can give an example of when it would return the wrong results. Do that and I'll take you seriously.
Mark Peters
@Mark Peters: If you review the explain plan, using this query will require two passes - the first to deal with the inner query, the second to deal with the outer. Because there's no indexes to leverage when running the outer query, it's guaranteed to be table scan. While the query returns the correct values, it is not the most efficient means of doing so. Your query still lists the DISTINCT keyword on the inner query... which also has a GROUP BY clause defined. The GROUP BY means that only distinct values of string (which isn't necessary as a column) are returned - no need for DISTINCT.
OMG Ponies
@Ponies: I already agreed that the one-pass approach is better. That doesn't mean this is wrong, just not as good. A query is only wrong if it doesn't return the right results. The DISTINCT keyword IS required as the query stands now as it eliminates the false negatives/positives we talked about in Mark's answer. It has nothing to do with the GROUP BY clause which you keep incorrectly thinking is on the inner query, but isn't. I don't know standard SQL indentation conventions but I'll fix it to try to make it more clear.
Mark Peters