To answer your question about using the EXISTS keyword, here is an example query that uses an EXISTS predicate, based on the query as currently given in your question.
SELECT t.*
FROM tblTransaction t
WHERE EXISTS
(
SELECT 1
FROM tblTenantTransCode ttc
JOIN tblCheckbookCode cc
ON (cc.ID = ttc.CheckbookCode AND cc.Description='Rent Income')
WHERE ttc.ID = t.TransactionCode
)
Additional Details:
We all recognize that there are a variety of SQL statements that will return the result set that meets the specified requirements. And there are likely going to be differences in the observed performance of those queries. Performance is particularly dependent on the DBMS, the optimizer mode, the query plan, and the statistics (number of rows and data value distribution).
The EXISTS
makes it clear we aren't interested returning expressions from tables in the subquery. It logically separates the subquery from the outer query, in a way that a JOIN
does not.
Another advantage of using EXISTS
is that avoids returning duplicate rows that would be (might be) returned if you were to instead use a JOIN
.
An EXISTS
predicate can be used to test for the existence of any related row in a child table, without requiring a join. For example, the following query returns a set of all orders that have at least one associated line_item:
SELECT o.*
FROM order o
WHERE EXISTS
( SELECT 1
FROM line_item li
WHERE li.order_id = o.id
)
Note that the subquery doesn't need to find ALL matching line items, it only needs to find one row in order to satisfy the condition.
A NOT EXISTS
predicate is also useful, for example, to return a set of orders that do not have any associated line_items.
SELECT o.*
FROM order o
WHERE NOT EXISTS
( SELECT 1
FROM line_item li
WHERE li.order_id = o.id
)
Of course, NOT EXISTS
is just one alternative. An equivalent result set could be obtained using an OUTER join and an IS NULL test (assuming we have at least one expression available from the line_item table that is NOT NULL)
SELECT o.*
FROM order o
LEFT
JOIN line_item li ON (li.order_id = o.id)
WHERE li.id IS NULL
There is a lot of discussion (relating to answers to the original question) about needing to use an IN
predicate, or needing to use an INNER JOIN
. It's not really a matter that those constructs are needed. Actually, the query could be written without an IN
and without an INNER JOIN
. It could be written using just the EXISTS
predicate. (Note that the title of the OP question did ask about how to use the EXISTS keyword.)
This is not my first choice for how to write the query, but the result set returned does satisfy the specified requirements:
SELECT t.*
FROM tblTransaction t
WHERE EXISTS
(
SELECT 1
FROM tblTenantTransCode ttc
WHERE ttc.ID = t.TransactionCode
AND EXISTS
(
SELECT 1
FROM tblCheckbookCode cc
WHERE cc.ID = ttc.CheckbookCode
AND cc.Description = 'Rent Income'
)
)
Of primary importance, the query should return a correct result set, one that satisfies the specified requirements, given all possible sets of conditions. Some of the queries presented as answers here do NOT return the requested result set, or if they do, they happen to do so by accident. Some of the queries will work if we pre-assume something about the data, such that some columns are UNIQUE
and NOT NULL
.
Performance differences
Sometimes a query with an EXISTS
predicate will not perform as well as a query with a JOIN
or an IN
predicate. In some cases, it may perform better. (With the EXISTS
predicate, the subquery only has to find one row that satisfies the condition, rather than finding ALL matching rows, as would be required by a JOIN
.)
Performance of various query options is best gauged by observation.