Is there a simple way to keep NULLs from affecting AVG? They appear to count as 0, which is not what I want. I simply don't want them taken into account in the average. The catch is that I can't drop those rows from the result set, because they hold other data that I need.

Update:

example:

select avg(col1+col2), count(col3) from table1
where
group by SomeArbitraryCol
having avg(col1+col2) < 500 and count(col3) > 3
order by avg(col1+col2) asc;

This would work for me, but the averages are inaccurate because NULL values appear to be counted as 0, which throws off the whole average.

+3  A: 
AVG(number) 

is the best way I can think of. It should exclude the NULLs automatically.

Redburn
Is this the same for number1+number2? I am not seeing that; for me the NULLs appear to count as 0, which weighs the avg down too much.
Zombies
@Zombies No, it is not the same for a manual addition expression, which follows the normal NULL rules.
Cade Roux
+3  A: 
SELECT SUM(field) / COUNT(field)
FROM table
WHERE othercondition AND (field IS NOT NULL)

mmsmatt
Won't this drop the rows where field is NULL? I need those rows, though, which is what makes this tricky.
Zombies
AVG(col) is always equivalent to SUM(col) / COUNT(col), and NULLs are automatically excluded, so this query will always return the same results as SELECT AVG(field) FROM table WHERE othercondition.
Cade Roux
+1  A: 

Answer based on original question:

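SELECT
  -- average only over rows where both values are present
  (SELECT AVG(NumCol + NumCol2)
   FROM table
   WHERE (cond) AND (NumCol IS NOT NULL) AND (NumCol2 IS NOT NULL)) AS AvgSum,
  -- count over all rows matching cond; no cross join, so it is not inflated
  COUNT(NumCol)
FROM table
WHERE (cond)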

I think this still works in theory with the new restrictions, but it isn't the most efficient approach.

Fry
+1  A: 

Aggregate functions (SUM, AVG, COUNT, etc.) in SQL always automatically exclude NULL.

So SUM(col) / COUNT(col) = AVG(col) - this is great and consistent.

The special case of COUNT(*) counts every row.
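
A quick way to check the three statements above is to run them against a throwaway table. The following is only a sketch, assuming SQLite through Python's standard sqlite3 module; the table name t and the column col are made up for illustration, and other databases may format the results differently.

```python
import sqlite3

# Hypothetical in-memory table; "t" and "col" are illustrative names only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col INTEGER)")
conn.executemany("INSERT INTO t (col) VALUES (?)", [(10,), (20,), (None,)])

avg_col, sum_col, count_col, count_all = conn.execute(
    "SELECT AVG(col), SUM(col), COUNT(col), COUNT(*) FROM t"
).fetchone()

print(avg_col)              # average over the non-NULL values only
print(sum_col / count_col)  # same value, built from SUM and COUNT
print(count_col)            # non-NULL values in col
print(count_all)            # every row, including the NULL one
```

If the NULL row were treated as 0, the average here would be 10 rather than 15.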

If you make up an expression with NULLs: A + B where either A or B is NULL, then A + B will be NULL regardless of whether the other column is NULL.
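
The NULL rule for addition can be seen without any table at all. This is a minimal sketch, again assuming SQLite via Python's sqlite3 module; the literal values are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One operand NULL, both operands NULL, and a NULL replaced by 0 via COALESCE.
null_sum_row = conn.execute(
    "SELECT 5 + NULL, NULL + NULL, COALESCE(NULL, 0) + 5"
).fetchone()

print(null_sum_row)  # SQL NULL comes back to Python as None
```

Because the whole expression becomes NULL, an aggregate such as AVG(A + B) then skips that row entirely instead of counting it as 0.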

When there are NULLs, in general, AVG(A + B) <> AVG(A) + AVG(B), and they will likely have different denominators, too. You would have to wrap the columns: AVG(COALESCE(A, 0) + COALESCE(B, 0)) to solve that, but perhaps also exclude the rows where both A and B are NULL, since COALESCE(A, 0) + COALESCE(B, 0) would otherwise count them as 0.
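
To make the different denominators concrete, here is a small sketch comparing the four variants on the same data. It assumes SQLite via Python's sqlite3 module; the table ab, its columns a and b, and the sample values are all hypothetical.

```python
import sqlite3

# Hypothetical data: one complete row, one row with b NULL, two double-NULL rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ab (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO ab (a, b) VALUES (?, ?)",
    [(10, 20), (50, None), (None, None), (None, None)],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Only the complete row contributes, because a + b is NULL elsewhere.
avg_of_sum = scalar("SELECT AVG(a + b) FROM ab")
# Each average uses its own denominator (2 values for a, 1 value for b).
sum_of_avgs = scalar("SELECT AVG(a) + AVG(b) FROM ab")
# All four rows count; the double-NULL rows contribute 0.
avg_coalesced = scalar("SELECT AVG(COALESCE(a, 0) + COALESCE(b, 0)) FROM ab")
# Same expression, but rows where both columns are NULL are filtered out.
avg_filtered = scalar(
    "SELECT AVG(COALESCE(a, 0) + COALESCE(b, 0)) FROM ab "
    "WHERE COALESCE(a, b) IS NOT NULL"
)

print(avg_of_sum, sum_of_avgs, avg_coalesced, avg_filtered)
```

Which of the four is "correct" depends on what a missing value is supposed to mean for the data in question; the sketch only shows that they differ.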

Based on your code, I would suggest:

select avg(coalesce(col1, 0) + coalesce(col2, 0)), count(col3) from table1
where coalesce(col1, col2) is not null -- double nulls are eliminated
group by SomeArbitraryCol
having avg(coalesce(col1, 0) + coalesce(col2, 0)) < 500 and count(col3) > 3
order by avg(coalesce(col1, 0) + coalesce(col2, 0)) asc;
Cade Roux
+1  A: 

There's a good chance that you'll be able to get at the correct answer from what others have said here, but in case you have not:

Which values in your table MIGHT be NULL? And what do you want to happen if any of them are?

You never specify what results you want to see if col1 is NULL, or col2 is NULL, or if they both happen to be NULL, etc.

TehShrike