views:

131

answers:

3

Say I have a table like this:

Field1  Field2  Field3  Field4
fred    tom     fred    harry 
tom             tom
dick    harry
harry           

and I want to determine what proportion of it has been completed for each field.

I can execute:

SELECT COUNT (Field1) WHERE (Field1 <> '') AS Field1Count      
SELECT COUNT (Field2) WHERE (Field2 <> '') AS Field2Count
SELECT COUNT (Field3) WHERE (Field3 <> '') AS Field3Count      
SELECT COUNT (Field4) WHERE (Field4 <> '') AS Field4Count  

Is it possible to roll up these separate SQL statements into one that will return the 4 results in one hit? Is there any performance advantage to doing so (given that the number of columns and rows may be quite large in practice)?

+1  A: 

Here's how I would go about it using MySQL

select sum(CASE WHEN Field1 <>'' THEN 1 ELSE 0 END) as Field1Count 
     , sum(CASE WHEN Field2 <>'' THEN 1 ELSE 0 END) as Field2Count 
     , sum(CASE WHEN Field3 <>'' THEN 1 ELSE 0 END) as Field3Count 
     ...
     , sum(CASE WHEN FieldN <>'' THEN 1 ELSE 0 END as FieldNCount 
  from DataTable
lexu
+5  A: 

You can do like this:

select
  sum(case when Field1 <> '' then 1 else 0 end) as Field1Count,
  sum(case when Field2 <> '' then 1 else 0 end) as Field2Count,
  sum(case when Field3 <> '' then 1 else 0 end) as Field3Count,
  sum(case when Field4 <> '' then 1 else 0 end) as Field4Count
from TheTable
Guffa
This works fine, but I had to use IIF instead of CASE to get it to work in MS Access (I use MS Access to debug my SQLs before committing them to ASP).Thanks for your answer.
@user89691 - Access SQL is not real SQL.
Cape Cod Gunny
@Guffa - Nice simple solution I'm upping your answer.
Cape Cod Gunny
+2  A: 

If you set your unpopulated fields to be NULL instead of blanks, you could rely on the fact that count() will not include NULL fields. All solutions with per-row function (if, case, coalesce and so on) are fine for small databases but will not scale well to big databases. Keep in mind that small is a relative term, it might still be okay for your databases even if you think they're big - I work in a shop where millions of rows are the sizes of our configuration tables :-)

Then you can just use:

select
    count(field1) as count1,
    count(field2) as count2,
    count(field3) as count3,
    count(field4) as count4
from ...

(or count(distinct fieldX) for distinct values, of course).

If that's a plausible way to go, you can just get your table set up with:

update tbl set field1 = NULL where field1 = '';
update tbl set field2 = NULL where field2 = '';
update tbl set field3 = NULL where field3 = '';
update tbl set field4 = NULL where field4 = '';

But, as with all database performance questions, measure, don't guess. And measure in the target environment (or suitable copy). And measure often. Database tuning is not a set-and-forget operation.

paxdiablo
I didn't think you could use count(field) without a group by, but I tried this and it works! I guess without a group by, the aggregating group is the whole table.
mdma
Yes, `group by` is actually an expander of detail. `select count(x) from tbl` will count the non-null values of x across the whole table (low detail level). `select count(x) from tbl group by y` will count the non-null values of x across each distinct y (higher detail level).
paxdiablo