views:

73

answers:

2
SELECT DISTINCT group_id
       , supplier_id
       , supplier_name
       , site_division_id
       , site_division_name 
FROM view_supplier_site 
WHERE supplier_id IN (SELECT DISTINCT supplier_id 
                      FROM view_supplier 
                      WHERE YEAR IN (2008, 2009) 
                      AND received_quantity > 0 
                      AND COE_SUPPLIER NOT IN ('X - LG', 'Y - LG', 'Z - LG') )
+3  A: 

Non Subquery Factoring:

  SELECT vss.group_id, 
         vss.supplier_id, 
         vss.supplier_name, 
         vss.site_division_id, 
         vss.site_division_name 
    FROM view_supplier_site vss
    JOIN (SELECT vs.supplier_id
            FROM view_supplier vs
           WHERE vs.year IN (2008, 2009)  
             AND vs.received_quantity > 0 
             AND vs.coe_supplier NOT IN ('X - LG', 'Y - LG', 'Z - LG')
        GROUP BY vs.supplier_id) s ON s.supplier_id = vss.supplier_id
GROUP BY vss.group_id, vss.supplier_id, vss.supplier_name, vss.site_division_id, vss.site_division_name

Using subquery factoring:

WITH suppliers AS (
    SELECT vs.supplier_id
      FROM view_supplier vs
     WHERE vs.year IN (2008, 2009)  
       AND vs.received_quantity > 0 
       AND vs.coe_supplier NOT IN ('X - LG', 'Y - LG', 'Z - LG')
  GROUP BY vs.supplier_id)
  SELECT vss.group_id, 
         vss.supplier_id, 
         vss.supplier_name, 
         vss.site_division_id, 
         vss.site_division_name 
    FROM view_supplier_site vss
    JOIN suppliers s ON s.supplier_id = vss.supplier_id
GROUP BY vss.group_id, vss.supplier_id, vss.supplier_name, vss.site_division_id, vss.site_division_name

They're equivalent.

Far as I can see, there isn't a lot of optimization to be had. Next thing to look at would be indexes...

OMG Ponies
wow!! pat on your back, dude. Your 'Non Subquery' just took 3.5sec(avg) time to execute while mine was taking about 14sec.
HanuAthena
I expect that they would optimise to pretty much the same execution plan, other than the use of DISTINCT (which the CBO would probably apply to the subquery in the OP's quetion efven if it wasn't specified). It would be interesting to see the execution plans for the 3.5 sec and the 14 sec queries to understand where the difference is.
David Aldridge
+1  A: 

I'd suggest the following:

  • Instead of using a n IN, use an inner join (probably not a performance improvement, but the statement looks "nicer")
  • This would allow you to get rid of the DISTINCT on view_supplier (again, probably no performance difference)
  • is the DISTINCT on view_supplier needed? There are some ids that may be keys for supplier_site.
  • NOT IN may be a performance problem .. can you change this into something else like <'X - LG' or so?
  • if the views are more than just "aliases" for the base tables/columns, there may be ways of using the underlying tables.
  • Another thing to look at would be indexes.
  • Is YEAR a calculated column? If it is just YEAR(datevalue) it may be faster to use something like datevalue between <Jan1st> and <Dec31>

Most of these changes would be cosmetic, the areas on what to focus on would be determined by the problems you see with the statement.

IronGoofy