views:

110

answers:

4
    SELECT AVG(`col5`)
    FROM `table1`
    WHERE `id` NOT IN (
        SELECT `id` FROM `table2`
        WHERE `col4` = 5
    )
    group by `col2` having sum(`col3`) > 0
UNION
    SELECT MAX(`col5`)
    FROM `table1`
    WHERE `id` NOT IN (
        SELECT `id` FROM `table2`
        WHERE `col4` = 5
    )
    group by `col2` having sum(`col3`) = 0

For readability and performance reasons, I think this code could be refactored. But how?

EDITIONS

  1. removed the outer select

  2. made the first select to return a sum and the second one to return another value

  3. replaced the SUM by AVG

+4  A: 
SELECT * 
FROM table1 t1
left outer join table2 t2 on t1.id = t2.id and t2.col4 = 5
where t2.id is null
group by t1.col2 
having sum(col3) >= 0 

The outer select is missing the FROM clause and was not adding anything so I removed it. The NOT IN is inefficient compared to the LEFT OUTER JOIN method so I replaced that. The two UNIONs were easily combined into one by using >=.

Update: Note the use of UNION ALL rather than UNION. I don't think you want to remove duplicates, and it will perform faster this way.

SELECT AVG(t1.col5) 
FROM table1 t1 
left outer join table2 t2 on t1.id = t2.id and t2.col4 = 5 
where t2.id is null 
group by t1.col2  
having sum(t1.col3) > 0  
UNION ALL
SELECT MAX(t1.col5) 
FROM table1 t1 
left outer join table2 t2 on t1.id = t2.id and t2.col4 = 5 
where t2.id is null 
group by t1.col2  
having sum(t1.col3) = 0  
RedFilter
Thanks for your suggestions: I removed the outer select, I can use your `left outer join` suggestion, but I can't combine the `UNION` because the code is slightly different from what I wrote first. I rewrote the question to show why it is difficult to combine both statements.
Jader Dias
Ok, no problem. I'll post the solution in a moment.
RedFilter
You're right, SUM and MAX don't mix well. The real operations I use are AVG and MAX. I edited the question to reflect this.
Jader Dias
Note I used `UNION ALL`, not `UNION`, for better performance; updated query to use `AVG`.
RedFilter
+1  A: 
SELECT t1.* FROM table1 t1 LEFT OUTER JOIN table2 t2 ON t1.id = t2.id 
WHERE t2.col4 <> 5 AND SUM(t1.col3) > 0 GROUP BY t1.col2
Tom
the real code is slightly different from what I wrote at first. I rewrote the question to show why it is difficult to combine both statements
Jader Dias
A: 

I'm guessing that you want this:

SELECT * FROM `table`
WHERE col2 IN
(SELECT col2
    FROM `table1` 
    WHERE `id` NOT IN ( 
        SELECT `id` FROM `table2` 
        WHERE `col4` = 5 
    ) 
    group by `col2` having sum(`col3`) >= 0
)

When using GROUP BY, you should only return columns that are named in the GROUP BY clause or that include an aggregate function. Therefore, the inner SELECT here gets the col2 values where the sum is greater than or equal to zero, then the outer SELECT grabs the entire row for those values.

Marcus Adams
A: 
SELECT
    IF(SUM(`table1`.`col3`) > 0, AVG(`table1`.`col5`), MAX(`table1`.`col5`))
FROM `table1`
    LEFT JOIN `table2` ON `table2`.`id` = `table1`.`id` AND `table2`.`col4` = 5
WHERE `table2`.`id` IS NULL
GROUP BY `table1`.`col2`
HAVING SUM(`table1`.`col3`) >= 0

Also * is considered harmful. If you want to make your query forward-compatible with possible future changes to your DB model, specify columns by their names.

codeholic
the real code is slightly different from what I wrote at first. I rewrote the question to show why it is difficult to combine both statements
Jader Dias