views:

95

answers:

1

I have the following query That gets the city name, city id, the region name, and a count of duplicate names for that record:

SELECT Country_CA.City AS currentCity, Country_CA.CityID, globe_region.region_name,
( SELECT count(Country_CA.City) FROM Country_CA WHERE City LIKE currentCity ) as counter 
    FROM Country_CA
    LEFT JOIN globe_region
      ON globe_region.region_id = Country_CA.RegionID
      AND globe_region.country_code = Country_CA.CountryCode
    ORDER BY City

This example is for Canada, and the cities will be displayed on a dropdown list.

There are a few towns in Canada, and in other countries, that have the same names. Therefore I want to know if there is more than one town with the same name region name will be appended to the town name. Region names are found in the globe_region table.

Country_CA and globe_region look similar to this (I have changed a few things for visualization purposes)

CREATE TABLE IF NOT EXISTS `Country_CA` (
  `City` varchar(75) NOT NULL DEFAULT '',
  `RegionID` varchar(10) NOT NULL DEFAULT '',
  `CountryCode` varchar(10) NOT NULL DEFAULT '',
  `CityID` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`City`,`RegionID`),
  KEY `CityID` (`CityID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

AND

CREATE TABLE IF NOT EXISTS `globe_region` (
  `country_code` char(2) COLLATE utf8_unicode_ci NOT NULL,
  `region_code` char(2) COLLATE utf8_unicode_ci NOT NULL,
  `region_name` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`country_code`,`region_code`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

The query on the top does exactly what I want it to do, but It takes way too long to generate a list for 5000 records. I would like to know if there is a way to optimize the sub-query in order to obtain the same results faster.

the results should look like this


City            CityID          region_name       counter

sheraton        2349269         British Columbia   1
sherbrooke      2349270         Quebec             2
sherbrooke      2349271         Nova Scotia        2
shere           2349273         British Columbia   1
sherridon       2349274         Manitoba           1
+2  A: 

Indexing on Country_CA.City might be enough to speed up the sub-select, as it won't need to consult the table. Changing from LIKE to = might make a difference. The documentation states that LIKE can only use an index if one of the arguments is a constant string that doesn't begin with a wildcard. Country_CA.City shouldn't begin with a wildcard. It's not strictly speaking a constant string, though it is constant for each subquery.

Alternatively, a join might be faster.

SELECT Country_CA.City AS currentCity, Country_CA.CityID, globe_region.region_name,
       count(homonyms.City) as counter 
    FROM Country_CA
    LEFT JOIN globe_region
      ON globe_region.region_id = Country_CA.RegionID
      AND globe_region.country_code = Country_CA.CountryCode
    JOIN Country_CA AS homonyms
      ON Country_CA.City = homonyms.City
    GROUP BY Country_CA.CityID
    ORDER BY City
outis