tags:

views:

71

answers:

5

I have around 30 columns in a MySQL table. I want to calculate how many column fields for a particular row are empty. This table is for storing user information. I want to find out how many profile fields (such as 'Name', 'Age', 'Location' - all of which are stored in a separate column) are empty/haven't been filled in by the user.

The columns I want to refine down are the final 20 columns (because the first 10 columns store stuff like User IDs, passwords, etc. - I only want the final 20 columns which store profile information). How do I find out the AMOUNT of empty/NULL columns in this table?

+1  A: 

Hello Jamie,

You can compute it like this:

SELECT SUM((`Name` = '') + (`Age` = 0) + (`Location` = '' OR `Location` IS NULL) + ...)

You didn't specify what types of columns you have so I used different comparing methods to illustrate the point. As a general idea you should compare to see if a specific column value is equal to the default value for that field (i.e. field not specified by the user). Use = '' for strings, = 0 for numbers, IS NULL for columns that have default NULL etc. You could also combine checks for NULL and empty value if you want. It all depends on what you want to find out.

Of course, as dnagirl pointed out in her answer, you should test for both empty and NULL values, but that really depends on how your columns are defined and also if you consider empty values to be filled or not. The point I was trying to make is that you can use SUM to add up boolean expression results.

EDIT: Didn't notice that you wanted for each row. Just remove the SUM and you'll get it per row.

Alin Purcaru
Is there any way to automate this? Otherwise I'll have to list over twenty column names to find NULL values
Jamie
@Jamie: if you're going to run this again and again, write a stored procedure. Yes, you'll still have to list all the columns, but you'll only have to type it once.
dnagirl
+3  A: 

If the idea is to get a result something like this:

col       emp    
------    ------    
FName        15  
LName         2  
Age          22 

..use:

SELECT 'FName' AS col, SUM(CASE FName IS NULL || FName='' THEN 1 ELSE 0 END) as emp FROM MyTable
UNION
SELECT 'LName' AS col, SUM(CASE LName IS NULL || LName='' THEN 1 ELSE 0 END) as emp FROM MyTable
UNION
SELECT 'Age' AS col, SUM(CASE Age IS NULL || Age='' THEN 1 ELSE 0 END) as emp FROM MyTable

...or:

SELECT SUM(CASE t.fname IS NULL OR t.fname = '' THEN 1 ELSE 0 END) AS fname_count,
       SUM(CASE t.lname IS NULL OR t.lname = '' THEN 1 ELSE 0 END) AS lname_count,
       SUM(CASE t.age IS NULL OR t.age = '' THEN 1 ELSE 0 END) AS age_count
  FROM MYTABLE t
dnagirl
+1: But you could use a single statement, just alias the column with the name of the column being checked.
OMG Ponies
Thats exactly how I would have handled similar conditional testing... Not just an empty value, but explicit test for NULL too
DRapp
"for a particular row" - This will show the totals for all rows.
geon
I want to find out the amount of columns with NULL or a value of '' for a particular row (i.e. for one user)
Jamie
@OMG Ponies: I know I'm going to kick myself I see what you mean, but I'm having a total brain blockage here. Could you post an example, please?
dnagirl
@dnagirl: See my edit to your answer
OMG Ponies
@OMG Ponies: here's me kicking myself. yurghh. Thanks!
dnagirl
You had the meat of the query, that's what really matters.
OMG Ponies
+1  A: 

In addition to what @dnagirl answered, I would consider doing an extra step... adding a column to the table for "anyNulls". And set it as a flag for ANY empty/null values, so you would be able to quickly query those that need to be completed for data requirements instead of just how many of each category. One record could have all fields missing vs one column in multiple rows that need to be fixed.

DRapp
A: 

The method Alin suggested is elegant but it suits the occasion only if the result set contains a single row. Another and a rather hackish option would be

SELECT

   (@c := 0) AS cnt0, 
   (@c := @c + IF(col1 IS NULL || col1 = '', 1, 0)) AS cnt1, 
   (@c := @c + IF(col2 IS NULL || col2 = '', 1, 0)) AS cnt2,
   (@c := @c + IF(col3 IS NULL || col3 = '', 1, 0)) AS cnt3

FROM

   table

So the last cnt column would contain the number of empty columns in each row, depending on how many columns you'll actually check.

Saul
Why would my method work for just one single row? SUM is an aggregate function.
Alin Purcaru
@Alin Purcaru: Without a GROUP BY clause your query returns the count for all rows. If I understood the question correctly, Jamie wanted the count to be per row.
Saul
Oh. Didn't notice that.
Alin Purcaru
A: 

As I understand it you have a table such as

id | FName | LName  | Age
1  | John  | ""     | NULL
2  | Mary  | Simons | NULL

And you want: row 1 has 2 empty/null fields, and row 2 has 1 empty/null fields

SELECT
    id,

    IF (FName IS NULL OR FName = '', 1, 0) + 
    IF (LName IS NULL OR LName = '', 1, 0) + 
    IF (Age IS NULL OR Age = '', 1, 0)
    as empty_field_count

Though now that I've written this I see Saul's response, I think that may have better performance that this solution.

Sid