We have a database setup that consists of two parts: a static structure, and dynamic additions. For each database, the dynamic can be different, and sometimes we don't have data for all the dynamic fields. Rigt now, we check for empties by looking at the total count of records in the entire table, but we want to move to a more refined method of checking for empties if possible. Is it possible to quickly check through several hundred fields and see which ones are empty and which ones are populated?
Assuming that you are just checking for whether or not there are any non-NULL values in the column, using EXISTS should generally be faster than getting a COUNT(*). The COUNT needs to scan the whole table to come up with the correct number. EXISTS just needs to find one row that satisfies the condition before it stops looking.
If the whole column is NULL then the time will be about the same, but in all of those cases where you have values it could be substantially shorter.
From Search all columns in all the tables in a database for a specific value
first create this function
CREATE PROCEDURE FindMyData_String
@DataToFind NVARCHAR(4000),
@ExactMatch BIT = 0
AS
SET NOCOUNT ON
DECLARE @Temp TABLE(RowId INT IDENTITY(1,1), SchemaName sysname, TableName sysname, ColumnName SysName, DataType VARCHAR(100), DataFound BIT)
INSERT INTO @Temp(TableName,SchemaName, ColumnName, DataType)
SELECT C.Table_Name,C.TABLE_SCHEMA, C.Column_Name, C.Data_Type
FROM Information_Schema.Columns AS C
INNER Join Information_Schema.Tables AS T
ON C.Table_Name = T.Table_Name
AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE Table_Type = 'Base Table'
And Data_Type In ('ntext','text','nvarchar','nchar','varchar','char')
DECLARE @i INT
DECLARE @MAX INT
DECLARE @TableName sysname
DECLARE @ColumnName sysname
DECLARE @SchemaName sysname
DECLARE @SQL NVARCHAR(4000)
DECLARE @PARAMETERS NVARCHAR(4000)
DECLARE @DataExists BIT
DECLARE @SQLTemplate NVARCHAR(4000)
SELECT @SQLTemplate = CASE WHEN @ExactMatch = 1
THEN 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
= ''' + @DataToFind + '''
)
Set @DataExists = 1
Else
Set @DataExists = 0'
ELSE 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
Like ''%' + @DataToFind + '%''
)
Set @DataExists = 1
Else
Set @DataExists = 0'
END,
@PARAMETERS = '@DataExists Bit OUTPUT',
@i = 1
SELECT @i = 1, @MAX = MAX(RowId)
FROM @Temp
WHILE @i <= @MAX
BEGIN
SELECT @SQL = REPLACE(REPLACE(@SQLTemplate, 'ReplaceTableName', QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName)), 'ReplaceColumnName', ColumnName)
FROM @Temp
WHERE RowId = @i
PRINT @SQL
EXEC SP_EXECUTESQL @SQL, @PARAMETERS, @DataExists = @DataExists OUTPUT
IF @DataExists =1
UPDATE @Temp SET DataFound = 1 WHERE RowId = @i
SET @i = @i + 1
END
SELECT SchemaName,TableName, ColumnName
FROM @Temp
WHERE DataFound = 1
Now call it like this for rows with empty strings in any string type columns
exec FindMyData_String '',1
it will give you an output with column name, table name and schema name Just keep in mind that it will search all tables
For searching for any rows that have NULLS in any column you can do this, first create this proc which is based on the code here Search all columns in all the tables in a database for a specific value
CREATE PROCEDURE FindMyData_StringNull
@DataToFind NVARCHAR(4000),
@ExactMatch BIT = 0
AS
SET NOCOUNT ON
DECLARE @Temp TABLE(RowId INT IDENTITY(1,1), SchemaName sysname, TableName sysname, ColumnName SysName, DataType VARCHAR(100), DataFound BIT)
INSERT INTO @Temp(TableName,SchemaName, ColumnName, DataType)
SELECT C.Table_Name,C.TABLE_SCHEMA, C.Column_Name, C.Data_Type
FROM Information_Schema.Columns AS C
INNER Join Information_Schema.Tables AS T
ON C.Table_Name = T.Table_Name
AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE Table_Type = 'Base Table'
DECLARE @i INT
DECLARE @MAX INT
DECLARE @TableName sysname
DECLARE @ColumnName sysname
DECLARE @SchemaName sysname
DECLARE @SQL NVARCHAR(4000)
DECLARE @PARAMETERS NVARCHAR(4000)
DECLARE @DataExists BIT
DECLARE @SQLTemplate NVARCHAR(4000)
SELECT @SQLTemplate = 'If Exists(Select *
From ReplaceTableName
Where Convert(nVarChar(4000), [ReplaceColumnName])
IS NULL
)
Set @DataExists = 1
Else
Set @DataExists = 0'
,
@PARAMETERS = '@DataExists Bit OUTPUT',
@i = 1
SELECT @i = 1, @MAX = MAX(RowId)
FROM @Temp
WHILE @i <= @MAX
BEGIN
SELECT @SQL = REPLACE(REPLACE(@SQLTemplate, 'ReplaceTableName', QUOTENAME(SchemaName) + '.' + QUOTENAME(TableName)), 'ReplaceColumnName', ColumnName)
FROM @Temp
WHERE RowId = @i
PRINT @SQL
EXEC SP_EXECUTESQL @SQL, @PARAMETERS, @DataExists = @DataExists OUTPUT
IF @DataExists =1
UPDATE @Temp SET DataFound = 1 WHERE RowId = @i
SET @i = @i + 1
END
SELECT SchemaName,TableName, ColumnName
FROM @Temp
WHERE DataFound = 1
Call it like this
FindMyData_StringNull NULL,1
I would think the simplest solution is to use the CHECKSUM function. First you would want to determine the checksum on an empty row and then compare that to the other rows.
Select Checksum(*)
From Table
The catch with using *
here is that it will include the PK. You would likely have to specify the individual columns excluding the PK to get an accurate read. So something like:
Select Checksum(Col1, Col2, Col3)
From Table