views:

1334

answers:

4

I've just learned ( yesterday ) to use "exists" instead of "in".

 BAD
 select * from table where nameid in ( 
          select nameid from othertable where otherdesc =  'SomeDesc' )      
 GOOD
 select * from table t where exists ( 
          select nameid from othertable o where t.nameid = o.nameid and otherdesc =  'SomeDesc' )

And I have some questions about this:

1) The explanation as I understood was: "The reason why this is better is because only the matching values will be returned instead of building a massive list of possible results". Does that mean that while the first subquery might return 900 results the second will return only 1 ( yes or no )?

2) In the past I have had the RDBMS complainin: "only the first 1000 rows might be retrieved", this second approach would solve that problem?

3) What is the scope of the alias in the second subquery?... does the alias only lives in the parenthesis?

for example

 select * from table t where exists ( 
          select nameid from othertable o where t.nameid = o.nameid and otherdesc =  'SomeDesc' )      
 AND 
          select nameid from othertable o where t.nameid = o.nameid and otherdesc =  'SomeOtherDesc' )

That is, if I use the same alias ( o for table othertable ) In the second "exist" will it present any problem with the first exists? or are they totally independent?

Is this something Oracle only related or it is valid for most RDBMS?

Thanks a lot

+2  A: 

It's specific to each DBMS and depends on the query optimizer. Some optimizers detect IN clause and translate it.

In all DBMSes I tested, alias is only valid inside the ( )

BTW, you can rewrite the query as:

select t.* 
from table t 
join othertable o on t.nameid = o.nameid 
    and o.otherdesc in ('SomeDesc','SomeOtherDesc');

And, to answer your questions:

  1. Yes
  2. Yes
  3. Yes
Milan Babuškov
+1  A: 

Personally I would use a join, rather than a subquery for this.

SELECT t.*
FROM yourTable t
    INNER JOIN otherTable ot
        ON (t.nameid = ot.nameid AND ot.otherdesc = 'SomeDesc')
Mitchel Sellers
+3  A: 
  1. Oracle-specific: When you write a query using the IN clause, you're telling the rule-based optimizer that you want the inner query to drive the outer query. When you write EXISTS in a where clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query. See "Difference between IN and EXISTS in subqueries".
  2. Probably.
  3. Alias declared inside subquery lives inside subquery. By the way, I don't think your example with 2 ANDed subqueries is valid SQL. Did you mean UNION instead of AND?
Constantin
+1  A: 

You are treading into complicated territory, known as 'correlated sub-queries'. Since we don't have detailed information about your tables and the key structures, some of the answers can only be 'maybe'.

In your initial IN query, the notation would be valid whether or not OtherTable contains a column NameID (and, indeed, whether OtherDesc exists as a column in Table or OtherTable - which is not clear in any of your examples, but presumably is a column of OtherTable). This behaviour is what makes a correlated sub-query into a correlated sub-query. It is also a routine source of angst for people when they first run into it - invariably by accident. Since the SQL standard mandates the behaviour of interpreting a name in the sub-query as referring to a column in the outer query if there is no column with the relevant name in the tables mentioned in the sub-query but there is a column with the relevant name in the tables mentioned in the outer (main) query, no product that wants to claim conformance to (this bit of) the SQL standard will do anything different.

The answer to your Q1 is "it depends", but given plausible assumptions (NameID exists as a column in both tables; OtherDesc only exists in OtherTable), the results should be the same in terms of the data set returned, but may not be equivalent in terms of performance.

The answer to your Q2 is that in the past, you were using an inferior if not defective DBMS. If it supported EXISTS, then the DBMS might still complain about the cardinality of the result.

The answer to your Q3 as applied to the first EXISTS query is "t is available as an alias throughout the statement, but o is only available as an alias inside the parentheses". As applied to your second example box - with AND connecting two sub-selects (the second of which is missing the open parenthesis when I'm looking at it), then "t is available as an alias throughout the statement and refers to the same table, but there are two different aliases both labelled 'o', one for each sub-query". Note that the query might return no data if OtherDesc is unique for a given NameID value in OtherTable; otherwise, it requires two rows in OtherTable with the same NameID and the two OtherDesc values for each row in Table with that NameID value.

Jonathan Leffler