ansaurus

Question

Answer 1

+1 A:

Try this:

SELECT
    tblTenantTransCode.ID 
    FROM tblCheckbookCode 
        INNER JOIN tblTenantTransCode ON tblCheckbookCode.ID=tblTenantTransCode.CheckbookCode
    WHERE tblCheckbookCode.Description = 'Rent Income'

Make sure you index tblCheckbookCode.Description.

KM 2009-05-26 20:18:27

Answer 2

+3 A:

You are describing an inner join.

select tc.id 
from tblTenantTransCode tc 
   inner join tblCheckbookCode cc on tc.CheckbookCode = cc.CheckbookCode

EDIT: It's still an inner join. I don't see any reason yet to use the IN clause.

select *
from tblTransaction t
   inner join tblTenantTransCode tc on tc.id = t.TransactionCode
   inner join tblCheckbookCode cc on cc.id = tc.CheckbookCode
where cc.description = 'Rent Income'

EDIT: If you must use the EXISTS predicate to solve this problem, see @spencer7953's answer. However, from what I'm seeing, the solution above is simpler and there are assumptions of uniqueness based on the fact that "Subquery" works for you (it wouldn't 100% of the time if there wasn't uniqueness in that table). I'm also addressing

Now I want to select All Transactions where their transactioncode matches an ID returned in the subquery

in my answer. If the request were something on the lines of this:

Now I want to select All Transcations when any transactioncode matches an ID returned in the subquery.

I would use EXISTS to see if any transactioncode existed in the child table and return every row or none as appropriate.

Austin Salonen 2009-05-26 20:20:09

I'd add that you would do a "where cc.Description = 'Rent Income'" to the end of that statement.

Eric 2009-05-26 20:21:34

Not quite. Sorry I wasn't descriptive enough, the subquery has it's own subquery, which could possibly be replaced with an inner join, but I'm not sure.

Malfist 2009-05-26 20:25:48

@Malfist: Joins are remarkably more efficient than IN clauses. Can you post the entirety of your query so you can get the most efficiency?

Eric 2009-05-26 20:28:31

Sure, I'll post up more info

Malfist 2009-05-26 20:30:06

Eric,What makes you think that "Joins are remarkably more efficient than IN clauses."? This is just not true.

AlexKuznetsov 2009-05-26 21:03:22

I should elaborate: IN clauses with subqueries. Subqueries, especially when they use an element from the parent query, fire every row, making it wildly inefficient. Doing IN (1,2,3) is not inefficient at all. Doing IN (SELECT ID FROM A WHERE A.ID = SOME_PARENT_ID) is extremely inefficient.

Eric 2009-05-26 21:16:06

That's just not true. If you are joining on a primary key, and not selecting columns from the second table, then they will have the same performance (and same query plan). If you're NOT joining on a primary key, then the join may cause duplicate rows. I.E. if tblCheckbookCode has 2 rows for 'Rent Income', then you'll see the transaction repeated twice, whereas with the subquery you'd only see it once.However, it's probably better practice to use a where exists clause, since you can easily extend it to match more than one column. But for one column, a where exists and an "in" are the same.

John Gibb 2009-05-27 12:33:17

According to my profiler, they both execute in about 15 ms

Malfist 2009-05-27 15:47:52

All of the rows are either a primary key, or a unique index so...

Malfist 2009-05-27 15:48:42

Actually, the OP question asked about how to use the **EXISTS** keyword. Certainly, the query could be written as a JOIN, or using an IN predicate. It could also be written using an EXISTS predicate.

spencer7593 2009-05-28 14:06:35

@spencer7593 -- the title of the question is about the exists keyword. When matched with the body of the question, the "exists" part is only an implication that using "exists" is the only way to do what the OP wants. While you could use an "exists" in several scenarios, it's not always necessary.

Austin Salonen 2009-05-28 14:13:44

@Austin Salonen: your answer totally ignored the possibility of using an EXISTS keyword. Your answer implies that an INNER JOIN is the only possible solution. "you are describing an inner join" and "it's still an inner join, i don't see any reason to use the IN clause". There are, in fact, reasons one might want to use an IN predicate or an EXISTS predicate. Your solution using the INNER join ASSUMES information about the uniqueness of columns in the tables.

spencer7593 2009-05-28 14:44:08

@spencer7593 -- Yes there are reasons you'd want to use other approaches. However, with the given schema and the fact that "Subquery" works without errors (it will error with duplicates in Description), you can assume uniqueness and using inner joins is the simplest solution.

Austin Salonen 2009-05-28 15:03:07

@Austin: excellent points. But just because a subquery returns only one row when it is run, doesn't mean it won't return more than one row some other time. What you need is some provision that guarantees it, one successful result set may demonstrate that some column had unique values when the query when the query was run, but it doesn't guarantee that some other row won't be added later.

spencer7593 2009-05-30 04:52:42

Answer 3

+1 A:

You need to use the 'IN' clause:

select id from tblTenantTransCode
where tblTenantTransCode.CheckbookCode in
    (select id from tblCheckbookCode
     where description = 'rent income')

an inner join would probably be a better solution though...

select ttc.id from tblTenantTransCode as ttc
inner join tblCheckbookCode as tcc
    on ttc.CheckBookId = tcc.id
where tcc.description = 'rent income'

Justin Niessner 2009-05-26 20:20:10

Answer 4

+1 A:

Given your full query, this query will get you where you need to go using a single join.

The join filters out any transaction that doesn't have a transaction code of 'Rent Income.' It will take all record from the first table, build out the subset of the second table (that WHERE clause limits the records), and then filters the first table where those table math the join condition.

SELECT 
    t.* 
FROM 
    tblTransaction t
    INNER JOIN tblTenantTransCode c ON
        t.TransactionCode = c.ID
    INNER JOIN tblCheckbookCode chk ON
        c.CheckbookCode = chk.ID
WHERE
    chk.Description = 'Rent Income'

Edit: One other note: Avoid using SELECT * -- always specify the columns. Edit Dos: I missed that there were three tables. Corrected! Thanks, spencer!

Eric 2009-05-26 21:07:13

If my memory doesn't fail me, you could have even better performance by moving the "c.Description = 'Rent Income'" from the WHERE clause to the "INNER JOIN tblTenantTransCode c ON t.TransactionCode = c.ID AND c.Description = 'Rent Income'"

Turro 2009-05-27 08:12:03

It doesn't matter (at least in SQL Server). The optimizer will feel free to rearrange joins and where clauses as long as they are equivalent, and with an inner join it NEVER matters whether the condition is on the join, or in the where clause. I would go with Eric's solution since it is more apparent to the next developer what's going on, but really it doesn't matter.

John Gibb 2009-05-27 12:36:27

@John Gibb: but getting the correct result set is important. Eric's query doesn't give the desired result set, since the original question indicates that the 'Rent Income' value is found in the Description column of the tblCheckbookCode table, not the tblTenantTransCode table.

spencer7593 2009-05-27 15:17:54

@spencer: Thanks, I missed the third table!

Eric 2009-05-27 18:51:34

+1 @Eric, now the statement looks like it will return the specified result set. But note what happens with the result set if there happen to be TWO (or more) rows in tblCheckbookCode with 'Rent Income' as Description, the JOIN query will return "duplicate" rows, using an EXISTS predicate avoids that problem.

spencer7593 2009-06-24 14:45:57

Answer 5

+10 A:

To answer your question about using the EXISTS keyword, here is an example query that uses an EXISTS predicate, based on the query as currently given in your question.

    SELECT t.*
      FROM tblTransaction t
     WHERE EXISTS
           ( 
             SELECT 1
               FROM tblTenantTransCode ttc
               JOIN tblCheckbookCode cc
                 ON (cc.ID = ttc.CheckbookCode AND cc.Description='Rent Income')
              WHERE ttc.ID = t.TransactionCode
           )

Additional Details:

We all recognize that there are a variety of SQL statements that will return the result set that meets the specified requirements. And there are likely going to be differences in the observed performance of those queries. Performance is particularly dependent on the DBMS, the optimizer mode, the query plan, and the statistics (number of rows and data value distribution).

The EXISTS makes it clear we aren't interested returning expressions from tables in the subquery. It logically separates the subquery from the outer query, in a way that a JOIN does not.

Another advantage of using EXISTS is that avoids returning duplicate rows that would be (might be) returned if you were to instead use a JOIN.

An EXISTS predicate can be used to test for the existence of any related row in a child table, without requiring a join. For example, the following query returns a set of all orders that have at least one associated line_item:

    SELECT o.*
      FROM order o
     WHERE EXISTS
           ( SELECT 1
               FROM line_item li
              WHERE li.order_id = o.id
           )

Note that the subquery doesn't need to find ALL matching line items, it only needs to find one row in order to satisfy the condition.

A NOT EXISTS predicate is also useful, for example, to return a set of orders that do not have any associated line_items.

    SELECT o.*
      FROM order o
     WHERE NOT EXISTS
           ( SELECT 1
               FROM line_item li
              WHERE li.order_id = o.id
           )

Of course, NOT EXISTS is just one alternative. An equivalent result set could be obtained using an OUTER join and an IS NULL test (assuming we have at least one expression available from the line_item table that is NOT NULL)

    SELECT o.*
      FROM order o
      LEFT
      JOIN line_item li ON (li.order_id = o.id)
     WHERE li.id IS NULL

There is a lot of discussion (relating to answers to the original question) about needing to use an IN predicate, or needing to use an INNER JOIN. It's not really a matter that those constructs are needed. Actually, the query could be written without an IN and without an INNER JOIN. It could be written using just the EXISTS predicate. (Note that the title of the OP question did ask about how to use the EXISTS keyword.)

This is not my first choice for how to write the query, but the result set returned does satisfy the specified requirements:


    SELECT t.*
      FROM tblTransaction t
     WHERE EXISTS
           ( 
             SELECT 1
               FROM tblTenantTransCode ttc
              WHERE ttc.ID = t.TransactionCode
                AND EXISTS 
                    (
                      SELECT 1 
                        FROM tblCheckbookCode cc
                       WHERE cc.ID = ttc.CheckbookCode
                         AND cc.Description = 'Rent Income'
                    )
           )

Of primary importance, the query should return a correct result set, one that satisfies the specified requirements, given all possible sets of conditions. Some of the queries presented as answers here do NOT return the requested result set, or if they do, they happen to do so by accident. Some of the queries will work if we pre-assume something about the data, such that some columns are UNIQUE and NOT NULL.

Performance differences

Sometimes a query with an EXISTS predicate will not perform as well as a query with a JOIN or an IN predicate. In some cases, it may perform better. (With the EXISTS predicate, the subquery only has to find one row that satisfies the condition, rather than finding ALL matching rows, as would be required by a JOIN.)

Performance of various query options is best gauged by observation.

spencer7593 2009-05-26 21:21:44

"The EXISTS also makes it clear you aren't interested in [rows from the inner tables" - also known as a "semi-join", and may show up in query plans as that. (I've seen SQL Server use a "left anti semi join" node, which was a bit startling the first time)Performance of the three methods of doing roughly the same thing vary wildly between different DB platforms and the data distribution. As ever, look at the query plan, don't guess.

araqnid 2009-05-27 01:18:12

@araqnid: I concur. Better than just looking at the query plan is to measure the _actual performance_ of each query plan, which is easy to do on some database platforms.

spencer7593 2009-05-27 15:21:04

This is a really good answer, +1

Factor Mystic 2010-06-10 15:37:13

ansaurus

tags:

views:

answers:

How do I use T-SQL's Exists keyword?

Subquery:

Full Query:

Tables:

related questions