views:

57

answers:

4

Is there a SQL way to find out which columns in my schema are completely full of null values? There are some fields in a couple of tables that I know are not used by the application and will be dropped, but I wanted to see if there was an automated way/script to find this out across the whole database to find candidates for code review/possible removal.

Running SQL Server 2005 on x86 if it matters.

Thanks in advance!

A: 

you can return the max(column) and check for nulls

Beth
A: 

You can look for zeroes in the results below.

 EXEC sp_MSforeachtable N'
DECLARE @collist NVARCHAR(MAX)
DECLARE @wherelist NVARCHAR(MAX)
DECLARE @dynsql NVARCHAR(MAX)
SELECT 
     @collist = ISNULL(@collist + '','','''' ) + ''COUNT('' + QUOTENAME(name) + '') AS '' + QUOTENAME(name),
     @wherelist = ISNULL(@wherelist + '' or '','''' ) + ''COUNT('' + QUOTENAME(name) + '') = 0 ''
FROM sys.columns WHERE is_nullable=1 and object_id=OBJECT_ID(''?'')
IF @collist IS NOT NULL
BEGIN
SET @dynsql = ''SELECT ''''?'''' As TableName, count(*) AS numrows, '' + @collist + '' FROM ? HAVING '' + @wherelist
PRINT @dynsql
EXEC (@dynsql)
END
 '
Martin Smith
+1  A: 
create table #SuspectColumns (
    TABLE_SCHEMA sysname,
    TABLE_NAME sysname,
    COLUMN_NAME sysname
)

declare csrColumns cursor fast_forward for
    select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
        from INFORMATION_SCHEMA.COLUMNS
        where IS_NULLABLE = 'YES'

declare @TABLE_SCHEMA sysname,
        @TABLE_NAME sysname,
        @COLUMN_NAME sysname,
        @sql nvarchar(max)  

open csrColumns

while (1=1) begin
    fetch next
        from csrColumns
        into @TABLE_SCHEMA, @TABLE_NAME, @COLUMN_NAME

    if @@FETCH_STATUS<>0 break

    set @sql = N'if not exists(select 1 from ' + QUOTENAME(@TABLE_SCHEMA) + N'.' + QUOTENAME(@TABLE_NAME) + N' where ' + QUOTENAME(@COLUMN_NAME) + N'is not null)
                     insert into #SuspectColumns values (''' + @TABLE_SCHEMA + N''',''' + @TABLE_NAME + N''',''' + @COLUMN_NAME + N''')'

    exec sp_executesql @sql
end /* while */

close csrColumns
deallocate csrColumns

select * from #SuspectColumns

drop table #SuspectColumns
Joe Stefanelli
Wow! This worked perfectly as written and was much more useful than I anticipated as an analysis tool of how we're using our schema (or not using it, as this case may be). Ran super fast in 1.145 seconds on first run, 0.08 seconds on second run across 55 tables with a total of 616,000 rows.
Mark A
A: 

Off the top of my head, I believe the following SQL should work. It will execute a query for each table/column combination and the query will return the table name and column name if that table/column combination either has no rows or all null rows.

DECLARE @table_columns TABLE
(
  table_name nvarchar(128),
  column_name nvarchar(128)
);
DECLARE @table_name nvarchar(128);
DECLARE @column_name nvarchar(128);

INSERT INTO @table_columns(table_name, column_name)
select TABLE_NAME, COLUMN_NAME from INFORMATION_SCHEMA.columns;

while (select count(*) from @table_columns) > 0
begin
    select top 1 @table_name = table_name, @column_name = column_name from @table_columns
    exec('SELECT ''' + @table_name + ''' as table_name, ''' + @column_name + ''' as column_name WHERE NOT EXISTS (SELECT TOP 1 * FROM ' + @table_name + ' WHERE ' + @column_name + ' IS NOT NULL)')
    delete from @table_columns where table_name = @table_name and column_name = @column_name 
end
NullPointerException
If you have tables/columns with non standard names (e.g. `fo]'o`) or in different schemas this script won't work.
Martin Smith