1) Yes, the second query you give looks better. But I give +1 to @Russell Steen's solution because it avoids the use of correlated subqueries.
This is a variation of the greatest-n-per-group problem that I see frequently on SO.
Here's another possible solution:
SELECT f.*
FROM Films f
LEFT OUTER JOIN Films p
ON (f.CategoryId = p.CategoryId AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r
ON (f.CategoryId = r.CategoryId AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE p.CategoryId IS NULL AND r.CategoryId IS NULL;
The explanation is that we try to find a film "p
" in the same category, with a lower price. When we have found none, p.*
will be NULL because that's how outer joins work. When there are no dvd's with a lower price, we've found the one with lowest price.
We further try the same trick to find a film "r
" with the highest rating. This time we restrict to films in the same category and with the same price (that is, the lowest price) as the film f
. Otherwise we'd unintentionally find the film with the highest rating in the category, even if it isn't cheap.
You can also reverse the order of the joins, first finding the highest rating and then finding the lowest price among those with the higest rating. It depends on what you place at a greater priority -- low price or high rating. No matter what solution you use, you have to make a decision about this priority.
2) The other query you tried doesn't work because the condition you use in the subquery doesn't eliminate any of the wrong rows of the FT2 subquery. It's a "Green Eggs and Ham" problem: whether on a train or on a plane, on a boat or on a goat, you've still got green eggs and ham included in the meal.
update: Okay, thanks for the sample data. When you first asked the question, you didn't include the information that some films could be ineligible because they aren't available on DVD and have a NULL in the DVDPrice
column. Here's an updated query using my technique that returns the correct films, one per category, excluding films that aren't available on DVD, with the lowest price and highest rating:
SELECT f.FilmName, f.Rating, f.DVDPrice, f.CategoryId
FROM Films f
LEFT OUTER JOIN Films p ON (f.CategoryId = p.CategoryId
AND p.AvailableOnDvd = 'Y' AND f.DVDPrice > p.DVDPrice)
LEFT OUTER JOIN Films r ON (f.CategoryId = r.CategoryId
AND r.AvailableOnDvd = 'Y' AND f.DVDPrice = r.DVDPrice AND f.Rating < r.Rating)
WHERE f.AvailableOnDvd = 'Y' AND p.CategoryId IS NULL AND r.CategoryId IS NULL
ORDER BY f.CategoryId;
Output:
+-------------------------+--------+----------+------------+
| FilmName | Rating | DVDPrice | CategoryId |
+-------------------------+--------+----------+------------+
| The Maltese Poodle | 1 | 2.99 | 1 |
| Third | 7 | 10.00 | 2 |
| Nightmare on Oak Street | 2 | 9.99 | 3 |
| Planet of the Japes | 5 | 12.99 | 4 |
| Soylent Yellow | 5 | 12.99 | 5 |
| Sense and Insensitivity | 3 | 15.99 | 6 |
+-------------------------+--------+----------+------------+
This differs from your result in category 6, because Sense and Insensitivity in your sample data is the only film that is available on DVD. 15 Late Afternoon is not available, even though it has a non-null value for DVDPrice. If I change it to AvailableOnDvd='Y'
then 15 Late Afternoon is chosen instead of the other film.
Regarding your question about how did I solve this, it's a variation of a common question in SQL, which I have tagged the "greatest-n-per-group" question. You want the query to return every film f
such that no film exists with a lower DVDPrice
in the same category. I solve with an outer join to p
and if no matches are found in p
then f
must have the lowest price in that category. That's the common solution.
Your added twist in this problem is that you have another attribute to filter on. So given the film (or films in the case of ties) with the lowest price, you want the one with the highest rating. The technique is the same, to use an outer join to r
where the category and price is equal, and the rating is higher. When no such films are found with a higher rating, then f
must have the highest rating for a given category and price.
I'm going to add a tag to your question greatest-n-per-group
so you can follow it and view other SQL questions that are solved with the same technique.