David C
asked on
Find and list column names that are NULL or blank
Hi Experts, I am looking for ideas on a query that will allow me to find the names of columns that are blank or NULL and the id is any given number.
If i underdtand correctly, you want to know the columns name from a particular table where the column has NULL/Blank in all the rows. Am I right?
ASKER
No, in any row. For example if I have a student and their record does not have a phone number and email but has an address I want the query to return Phone Number and Email.
Try this.
CREATE TABLE STUDENT(SID INT, SNAME VARCHAR(30),EMAIL VARCHAR(50) NULL, PHONE VARCHAR(10) NULL)
INSERT INTO STUDENT SELECT 1,'PETER','peter@gmail.com','9167454542'
INSERT INTO STUDENT SELECT 2,'GEORGE','geo@gmail.com',NULL
INSERT INTO STUDENT SELECT 3,'LEO',NULL,'9167454542'
INSERT INTO STUDENT SELECT 4,'HARRY',NULL,''
SELECT 'StudentName' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(SNAME),'') = '' UNION ALL
SELECT 'Email' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(EMAIL),'') = '' UNION ALL
SELECT 'Phone' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(PHONE),'') = ''
ASKER
Thanks, this works great however the columns will not be hardcoded since I have quite a few. I need a query that does it dynamically. I had got this far
but the issue is I cannot declare a variable for SID and I cannot check if the column is <> '' because the apostrophes are breaking the synax.
The idea query is
but I cannot seem to get it working
declare @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'
OPEN getinfo
FETCH NEXT FROM getinfo into @col
declare @ypid varchar(max)
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @cmd = 'IF NOT EXISTS (SELECT * FROM STUDENT WHERE [' + @col + '] IS NOT NULL and SID=4) BEGIN print ''' + @col + ''' end'
EXEC(@cmd)
FETCH NEXT FROM getinfo into @col
END
CLOSE getinfo
DEALLOCATE getinfo
but the issue is I cannot declare a variable for SID and I cannot check if the column is <> '' because the apostrophes are breaking the synax.
The idea query is
SELECT @cmd = 'IF NOT EXISTS (SELECT * FROM STUDENT WHERE [' + @col + '] IS NOT NULL or [' + @col + '] <>'' ) and SID=@sid) BEGIN print ''' + @col + ''' end'
but I cannot seem to get it working
Okay. If without declaring a variable, then you will have to hard code the SID or it will traverse all the records in the table.
E.g.
DECLARE @select_statement NVARCHAR(MAX);
WITH user_tables_with_nullable_columns
AS ( SELECT T.object_id ,
QUOTENAME(S.name) + '.' + QUOTENAME(T.name) AS object_name ,
QUOTENAME(C.name) AS column_name
FROM sys.schemas S
INNER JOIN sys.tables T ON S.schema_id = T.schema_id
INNER JOIN sys.columns C ON T.object_id = C.object_id
WHERE T.type_desc = 'USER_TABLE'
AND C.is_nullable = 1
),
column_selects
AS ( SELECT object_id ,
object_name ,
'SELECT ''' + object_name + ''' AS object_name, ''' + column_name + ''' AS column_name FROM ' + object_name + ' WHERE ' + column_name
+ ' IS NULL' AS select_statement
FROM user_tables_with_nullable_columns
)
SELECT @select_statement = 'SELECT Q.object_name, Q.column_name, COUNT(*) as nulls_in_column FROM ( ' +
STUFF(( SELECT ' UNION ALL ' + select_statement
FROM column_selects
FOR XML PATH('')
), 1, 11, '')
+ ' ) Q GROUP BY Q.object_name, Q.column_name ;';
EXECUTE(@select_statement );
rather than run one query for every field, why not run one query with one case expression per field? This assume [SID] is the column that identifies a row in the final output (and that bit is hardcoded) all other fields are dynamically handled.
CREATE TABLE STUDENT(SID INT, SNAME VARCHAR(30),EMAIL VARCHAR(50) NULL, PHONE VARCHAR(10) NULL)
INSERT INTO STUDENT SELECT 1,'PETER','peter@gmail.com','9167454542'
INSERT INTO STUDENT SELECT 2,'GEORGE','geo@gmail.com',NULL
INSERT INTO STUDENT SELECT 3,'LEO',NULL,'9167454542'
INSERT INTO STUDENT SELECT 4,'HARRY',NULL,''
**Query 1**:
DECLARE @sql AS nvarchar(MAX)
DECLARE @flds AS nvarchar(MAX)
SET @sql = ''
+ ' SELECT'
+ ' ''STUDENT'' as tname'
+ ' , sid AS row_identity'
+ ' , COLUMN_NAME'
+ ' , ISNULL'
+ ' FROM STUDENT'
+ ' CROSS apply ('
+ ' VALUES'
SELECT
@flds = stuff((
SELECT ', (''' + c.name + ''', CASE WHEN ' + c.name + ' IS NULL THEN 1 ELSE 0 END)'
FROM sys.columns AS c
WHERE t.Object_ID = c.Object_ID
FOR XML PATH ('')
), 1, 1, '')
FROM sys.tables t
WHERE t.name = 'STUDENT'
SET @sql = @sql
+ @flds
+ ' ) ca1 (COLUMN_NAME, ISNULL)'
+ ' WHERE ca1.ISNULL = 1'
--SELECT @sql
EXEC(@sql)
**[Results][2]**:
| TNAME | ROW_IDENTITY | COLUMN_NAME | ISNULL |
|---------|--------------|-------------|--------|
| STUDENT | 2 | PHONE | 1 |
| STUDENT | 3 | EMAIL | 1 |
| STUDENT | 4 | EMAIL | 1 |
[1]: http://sqlfiddle.com/#!3/90136d/2
BTW: the generated query looks like the following. It uses a technique employing VALUES to transform one row into many rows - here one row per field
SELECT
'STUDENT' AS tname
, sid AS row_identity
, COLUMN_NAME
, ISNULL
FROM STUDENT
CROSS APPLY (
VALUES
( 'SID' , CASE WHEN SID IS NULL THEN 1 ELSE 0 END )
, ( 'SNAME' , CASE WHEN SNAME IS NULL THEN 1 ELSE 0 END )
, ( 'EMAIL' , CASE WHEN EMAIL IS NULL THEN 1 ELSE 0 END )
, ( 'PHONE' , CASE WHEN PHONE IS NULL THEN 1 ELSE 0 END )
) ca1(COLUMN_NAME, ISNULL)
WHERE ca1.ISNULL = 1
ASKER
Thank you all so much for your feedback. I have managed to get it working using,
Is there anything I can do to change SID=4 to use a parameter instead? This way I can put this as a stored procedure.
declare @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'
OPEN getinfo
FETCH NEXT FROM getinfo into @col
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @cmd = 'IF EXISTS (SELECT * FROM STUDENT WHERE ([' + @col + '] IS NULL or [' + @col + '] = '''' ) and SID=4) BEGIN print ''' + @col + ''' end'
--SELECT @cmd
EXEC(@cmd)
FETCH NEXT FROM getinfo into @col
END
CLOSE getinfo
DEALLOCATE getinfo
Is there anything I can do to change SID=4 to use a parameter instead? This way I can put this as a stored procedure.
I am sure there has to be a better way of writing that code, but to answer your question: Is there anything I can do to change SID=4 to use a parameter instead? Something like this should do it:
CREATE PROCEDURE usp_YourStoredProcedureNameGoesHere
@SID integer
AS
declare @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'
OPEN getinfo
FETCH NEXT FROM getinfo into @col
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @cmd = 'IF EXISTS (SELECT * FROM STUDENT WHERE ([' + @col + '] IS NULL or [' + @col + '] = '''' ) and SID=' + CAST(@SID AS varchar(10)) + ' BEGIN print ''' + @col + ''' end'
--SELECT @cmd
EXEC(@cmd)
FETCH NEXT FROM getinfo into @col
END
CLOSE getinfo
DEALLOCATE getinfo
ASKER
Thank Anthony, I am getting the error
I am happy to take suggestions as I want the best and most efficient code. Please let me know yours and I'll be happy to test it out
Incorrect syntax near the keyword 'BEGIN'.
I am happy to take suggestions as I want the best and most efficient code. Please let me know yours and I'll be happy to test it out
@takwirirar, I think you should use an entire dynamic approach and avoid the cursor. E.g. a shorter version of my first version:
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL';
DECLARE @result_statement NVARCHAR(MAX);
WITH user_tables_with_nullable_columns
AS ( SELECT QUOTENAME(C.name) AS column_name
FROM sys.schemas S
INNER JOIN sys.tables T ON S.schema_id = T.schema_id
INNER JOIN sys.columns C ON T.object_id = C.object_id
WHERE C.is_nullable = 1
AND S.name = 'dbo'
AND T.Name = 'STUDENT'
),
column_selects
AS ( SELECT REPLACE(@test_statement, '@column_name', column_name) AS select_statement
FROM user_tables_with_nullable_columns
)
SELECT @result_statement = STUFF(
(
SELECT ' UNION ALL ' + select_statement
FROM column_selects
FOR XML PATH('')
)
, 1, 11, '');
EXECUTE ( @result_statement );
ASKER
Hi ste5an I have changed the test statement to
This allows me to get columns that are NULL or blank.
The results I get show as below. How do I change the query to get only the columns that are greater than zero(these are the ones I want)
Also how can I incorporate a parameter to get rid of SID=4
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM @object_name WHERE (@column_name IS NULL or @column_name='''') and SID=4';
This allows me to get columns that are NULL or blank.
The results I get show as below. How do I change the query to get only the columns that are greater than zero(these are the ones I want)
Also how can I incorporate a parameter to get rid of SID=4
E.g.
By using a parameter??
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL HAVING COUNT(*) != 0 ';
By using a parameter??
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL AND SID = @SID HAVING COUNT(*) != 0 ';
-- the remainging code from above except the EXEC(variable).
EXECUTE sp_executesql @result_statement, N'@SID INT', @SID = 4;
ASKER
Thanks this is how I currently have it
When I run
It errors with
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[sp_executesql]
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM @object_name WHERE (@column_name IS NULL or @column_name='''') and SID=@SID HAVING COUNT(*) != 0';
DECLARE @result_statement NVARCHAR(MAX);
WITH user_tables_with_nullable_columns
AS ( SELECT T.object_id ,
QUOTENAME(S.name) + '.' + QUOTENAME(T.name) AS object_name ,
QUOTENAME(C.name) AS column_name
FROM sys.schemas S
INNER JOIN sys.tables T ON S.schema_id = T.schema_id
INNER JOIN sys.columns C ON T.object_id = C.object_id
WHERE T.type_desc = 'USER_TABLE'
AND C.is_nullable = 1
AND T.Name = 'STUDENT'
),
column_selects
AS ( SELECT REPLACE(REPLACE(@test_statement, '@object_name', object_name), '@column_name', column_name) AS select_statement
FROM user_tables_with_nullable_columns
)
SELECT @result_statement = STUFF(
(
SELECT ' UNION ALL ' + select_statement
FROM column_selects
FOR XML PATH('')
)
, 1, 11, '');
EXECUTE ( @result_statement );
END
When I run
EXECUTE sp_executesql @result_statement, N'@SID INT', @SID = 4;
It errors with
Msg 137, Level 15, State 2, Line 1
Must declare the scalar variable "@result_statement".
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks for your help and patience. I am now getting
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Thanks for your help and patience. I am now getting
Whose solution are you using? The last one posted by ste5an does not include a variable called @object_name
Whose solution are you using? The last one posted by ste5an does not include a variable called @object_name
ASKER
Thanks this works! Was getting a few other errors because of confusion with other stored procedures