views:

58

answers:

5

I am new to Sql server, and my MySql background is only decent, nothing exceptional.

I have a table with 5 columns:

Neighbrhood
City
State
zip_code
zip_percent

The goal of the query: To provide a list of neighborhoods (unique) within a city and the zip_code that corresponds with the highest zip_percent. Thus returning 2 columns:

Neighborhood
zip_code

Where the zip_code is the "most representative" of the neighborhood, by virtue of being of the highest percent coverage or "zip_percent"

Issues: Since a neighborhood can overlap multiple zip_codes I am having an issue getting them distinctly. The distinct clause prevents me from ordering by an unselected column.

Somehow I think I need to take this subquery and join it back onto the original table using a MAX command on the zip_percent column. I just can't quite put all the pieces together. I would really appreciate any help.

git.github.com/469915

You can see an example of the issue here with Neighborhoods like 'Arbor Heights' and 'Rainer Beach'

+1  A: 
SELECT     Q1.Neighborhood, 
           Q1.Zip
FROM       TableName AS Q1
LEFT JOIN (SELECT Neighborhood, MAX(zip_percent) as Zip_Percent FROM TableName GROUP BY Neighborhood) AS Q2
ON         Q1.Zip_Percent = Q2.Zip_Percent 
AND        Q1.Neighborhood = Q2.Neighborhood

Keep in mind though that this could have unexpected results in Neighborhoods where the max zip percentage is shared by many zips. e.g.

+------------------------------------+
| NeighborHood | Zip   | Zip Percent |
+------------------------------------+
|  A           | 91345 | 50          |
+--------------+-------+-------------+
|  A           | 91346 | 50          |
+--------------+-------+-------------+
Goblyn27
+1 for LEFT JOIN
Chris Bednarski
A: 

You need to do a group by clause in order for distinct to work.

DISTINCT does not work on a column basis it works on a row.

So what you want to do is a sub query to get the max zippercent.

SELECT MAX(ZipperCent) AS TheMax FROM YourTable
GROUP BY ZipCode

This will give you the max zippercent based on a zip code, remember that a zip code can span multiple postal_names. For instance 98107 is seattle but so is 98116. So you do not want to group by the city name, group by the zip code.

Now that you have the max zippercents per zip code you simply use that as a sub query to grab the other information you need:

SELECT YourOtherField1, YourOtherField2, t.TheMax FROM YourTable INNER JOIN (SELECT MAX(ZipperCent) AS TheMax FROM YourTable GROUP BY ZipCode) t
ON t.YourPrimaryKey = YourTable.PrimaryKey
JonH
A: 

It would be best if you had an autoincrement ID field on the table as well to easily handle cases where there's more than one row with the same max(zip_percent) per nighborhood. Minus that point, this should be approximately correct:

with max_zp as
  select
    Neighborhood,
    max(zip_percent) zp
  from
    tbl
  group by
    Neighborhood
select
  t.NeighborHood,
  t.zipcode
from
  tbl t
  join max_zp mz on t.Neighborhood = mz.Neighborhood and t.zip_percent = mz.zp
Donnie
A: 
Select 
    Neighbrhood,
    zipcode

FROM
    geoTable g
INNER JOIN 

(   
SELECT

    zip_code,
    MAX(zip_percent) zip_percent
FROM
    geoTable
GROUP BY
        zipcode
) maxG
ON g.zip_code = maxG.zip_code
    and g.zip_percent = maxG.zip_percent
Conrad Frix
+1  A: 

Try using ROW_NUMBER:

SELECT Neighborhood, zip_code
FROM (
    SELECT 
        Neighborhood, zip_code,
        ROW_NUMBER() OVER (PARTITION BY Neighborhood
                           ORDER BY zippercent DESC) AS rn
    FROM neighbourhoods_zip
) AS T1
WHERE rn = 1
Neighborhood  zip_code
Adams         98107   
Alki          98116   
ArborHeights  98146   
Atlantic      98144   
Belltown      98121   
BitterLake    98133   
Briarcliff    98199   
etc...
Mark Byers