views:

54

answers:

2

I have a table Books, where I store Book data (ISBN's, Titles, Authors, etc.). To tell which books are editions of each other I have a field Edition_Group_ISBN, which is an arbitrary ISBN from the group.

I'm having trouble getting this query, which is supposed to give the Book data and the number of other Editions based on the ISBN, to work:

   SELECT *, Editions_Count 
     FROM Books 
LEFT JOIN ((SELECT Edition_Group_ISBN, COUNT(*) AS Editions_Count 
              FROM Books 
             WHERE Edition_Group_ISBN IN (SELECT Edition_Group_ISBN 
                                            FROM Books)
             GROUP BY Edition_Group_ISBN) AS b 
       ) ON (Books.Edition_Group_ISBN = b.Edition_Group_ISBN 
           AND Books.Edition_Group_ISBN != NULL) 
    WHERE ISBN = 9780140447897

The query gives the book data for 9780140447897, but it gives the Editions_Count AS NULL, indicating that the LEFT JOIN isn't working.

+5  A: 

Try this much simpler query:

SELECT b.*, COUNT(*) AS Editions_count
FROM Books b JOIN Books g USING (Edition_Group_ISBN)
WHERE b.ISBN = 9780140447897
GROUP BY b.book_id;

I think you were making the problem much harder than you needed to. Your original query is full of confusing subqueries, at least one of which is totally superfluous.


Re your comments:

Yes, it works fine to compare ISBN to multiple values this way:

SELECT b.*, COUNT(*) AS Editions_count
FROM Books b JOIN Books g USING (Edition_Group_ISBN)
WHERE b.ISBN IN (9780140447897, 9781934356555)
GROUP BY b.book_id;

COUNT(*) supports only a plain wildcard * which means count all rows in the group.

Or you can use a specific expression like COUNT(g.Edition_Group_ISBN) which means count all rows in the group where that expression is non-null.

But you can't use g.* because it's ambiguous. Does it count all rows in the group? (if so, just use COUNT(*)) Does it count non-null rows in the group? What would that mean anyway -- count rows where all columns from g are non-null, or where any columns from g are non-null? For these reasons, COUNT(g.*) is simply not a legal construction in the SQL language.

Bill Karwin
The GROUP BY clause is only sufficient in MySQL; other DBMS would make you cite all the columns in the Books table (the 'b' alias). Since the question is tagged MySQL, that isn't an immediate problem, but for generality, it should be noted (or you could point them to a book called 'SQL Anti-Patterns').
Jonathan Leffler
Bill, is there a way to make this query work with multiple ISBNs being provided?
babonk
I tried this but it gets an error because of COUNT(g.*)SELECT b.*, COUNT(g.*) AS Editions_countFROM Books b JOIN Books g USING (Edition_Group_ISBN)WHERE b.ISBN IN(9780140447897)
babonk
great, thank you very much!
babonk
@Jonathan: Yep, thanks, MySQL is more permissive about GROUP BY than standard SQL and most other implementations. SQLite is also permissive in this way.
Bill Karwin
+1  A: 

The query seems to include a number of tautologies - statements that are always true.

I think you can simplify it to

 SELECT books.*, b.Editions_Count 
     FROM Books 
LEFT JOIN ((SELECT Edition_Group_ISBN, COUNT(*) AS Editions_Count 
              FROM Books 
              GROUP BY Edition_Group_ISBN) AS b 
       ) ON (Books.Edition_Group_ISBN = b.Edition_Group_ISBN) 
 WHERE ISBN = 9780140447897

The subselect (WHERE .. IN) and the JOIN condition ID!=NULL were redundant.

You can achieve the same thing by doing a self join on the Edition_Group_ISBN:

SELECT b.*, count(b2.*)
FROM Books b INNER JOIN Books b2 ON (b.Edition_Group_ISBN=b2.Edition_Group_ISBN)
GROUP BY b.ISBN
HAVING b.ISBN = ...;

EDIT: The fix is to remove the JOIN condition Edition_Group_ISBN!=NULL, since this expression will always be false. (A NULL operand to != returns a NLLL result.) This makes the entire join condition NULL (false) and so the left join fails.

mdma
Actually `ID != NULL` was simply incorrect, and would need to be `ID IS NOT NULL`
George Marian
Thanks for picking that up. That's probably why it is not working. The comparator operator != returns NULL if one of it's operands is NULL. I was too busy cleaning up to notice first time.
mdma
George, the query works thanks to that fix. Now I'm going to take some time to test out some of the better queries suggested here before choosing an answer.
babonk
Hmm, I'm actually curious as to why ID != NULL prevented the query from working..
babonk
Please see my edit. a NULL operand to != preduces a NULL result.
mdma
@babonk In the relational model, no two NULLs are alike. Comparing them returns an Unknown result. The simple answer is that a value may be NULL due to unknown data or unspecified data. Those two situations are not the same. IIRC, this has been expanded to 19 possible reasons for a NULL, though I can't seem to find a reference to that. This Wikipedia article does a good job of explaining many of the gory details: http://en.wikipedia.org/wiki/Null_%28SQL%29
George Marian