views:

173

answers:

2

Alright, this one (3a; sample problem with provided answer) has got me scratching my head:

bbc(name, region, area, population, gdp)
3a. Find the largest country in each region:

SELECT region, name, population
  FROM bbc x
 WHERE population >= ALL
    (SELECT population
       FROM bbc y
      WHERE y.region = x.region
        AND population > 0)

I understand the concept of 'WHERE y.region = x.region' when I think about it in terms of the db engine looping over the table entries and matching each x.region with the current y.region (in the nested SELECT)... but wtf does 'AND population > 0' do? It isn't a right answer without it, but I don't see how not...

+2  A: 

I am speculating it here.

What if population is null for all of the records of a specific region?

EDIT: It can be considered kind of a safety-net to ignore -ve values in population (not a real life scenario).

shahkalpesh
I don't recall it specifying that it was. And I haven't even started to cover the null stuff with that tutorial, yet.
Daddy Warbox
Negative values would not be a problem.
A. Rex
+1  A: 

That clause is there just because there is an entry in the Europe table (for the Vatican) which has NULL in the population column. The following works also and I believe is more understandable:

SELECT region, name, population
  FROM bbc x
 WHERE population >= ALL
    (SELECT population
       FROM bbc y
      WHERE y.region = x.region
        AND population IS NOT NULL)

In the MySQL documentation for ALL subqueries, there's a helpful comment (emphasis theirs):

In general, tables containing NULL values and empty tables are "edge cases." When writing subquery code, always consider whether you have taken those two possibilities into account.

A. Rex
Oh thank you! Jeez I thought I was being dumb or something. :P
Daddy Warbox
Maybe I still am. :P
Daddy Warbox