Link to home
Start Free TrialLog in
Avatar of David C
David CFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Find and list column names that are NULL or blank

Hi Experts, I am looking for ideas on a query that will allow me to find the names of columns that are blank or NULL and the id is any given number.
Avatar of Ganapathi
Ganapathi
Flag of India image

If i underdtand correctly, you want to know the columns name from a particular table where the column has NULL/Blank in all the rows. Am I right?
Avatar of David C

ASKER

No, in any row. For example if I have a student and their record does not have a phone number and email but has an address I want the query to return Phone Number and Email.
Try this.

CREATE TABLE STUDENT(SID INT, SNAME VARCHAR(30),EMAIL VARCHAR(50) NULL, PHONE VARCHAR(10) NULL)

INSERT INTO STUDENT SELECT 1,'PETER','peter@gmail.com','9167454542'
INSERT INTO STUDENT SELECT 2,'GEORGE','geo@gmail.com',NULL
INSERT INTO STUDENT SELECT 3,'LEO',NULL,'9167454542'
INSERT INTO STUDENT SELECT 4,'HARRY',NULL,''

SELECT 'StudentName' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(SNAME),'') = '' UNION ALL
SELECT 'Email' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(EMAIL),'') = '' UNION ALL
SELECT 'Phone' FROM STUDENT WHERE SID = 4 AND ISNULL(RTRIM(PHONE),'') = ''

Open in new window

Avatar of David C

ASKER

Thanks, this works great however the columns will not be hardcoded since I have quite a few. I need a query that does it dynamically. I had got this far

declare @col varchar(255), @cmd varchar(max)

DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'

OPEN getinfo

FETCH NEXT FROM getinfo into @col

declare @ypid varchar(max)

WHILE @@FETCH_STATUS = 0
BEGIN

    SELECT @cmd = 'IF NOT EXISTS (SELECT * FROM STUDENT WHERE [' + @col + '] IS NOT NULL and SID=4) BEGIN print ''' + @col + ''' end'
	EXEC(@cmd)

    FETCH NEXT FROM getinfo into @col
END

CLOSE getinfo
DEALLOCATE getinfo

Open in new window


but the issue is I cannot declare a variable for SID and I cannot check if the column is <> '' because the apostrophes are breaking the synax.

The idea query is

SELECT @cmd = 'IF NOT EXISTS (SELECT * FROM STUDENT WHERE [' + @col + '] IS NOT NULL or [' + @col + '] <>'' ) and SID=@sid) BEGIN print ''' + @col + ''' end'

Open in new window


but I cannot seem to get it working
Okay. If without declaring a variable, then you will have to hard code the SID or it will traverse all the records in the table.
Avatar of ste5an
E.g.

DECLARE @select_statement NVARCHAR(MAX);

WITH    user_tables_with_nullable_columns
          AS ( SELECT   T.object_id ,
                        QUOTENAME(S.name) + '.' + QUOTENAME(T.name) AS object_name ,
                        QUOTENAME(C.name) AS column_name
               FROM     sys.schemas S
                        INNER JOIN sys.tables T ON S.schema_id = T.schema_id
                        INNER JOIN sys.columns C ON T.object_id = C.object_id
               WHERE    T.type_desc = 'USER_TABLE'
                        AND C.is_nullable = 1
             ),
        column_selects
          AS ( SELECT   object_id ,
                        object_name ,
                        'SELECT ''' + object_name + ''' AS object_name, ''' + column_name + ''' AS column_name FROM ' + object_name + ' WHERE ' + column_name
                        + ' IS NULL' AS select_statement
               FROM     user_tables_with_nullable_columns
             )
    SELECT  @select_statement = 'SELECT Q.object_name, Q.column_name, COUNT(*) as nulls_in_column FROM ( ' + 
                STUFF(( SELECT ' UNION ALL ' + select_statement
                FROM   column_selects
                FOR XML PATH('')
        ), 1, 11, '')
            + ' ) Q GROUP BY Q.object_name, Q.column_name ;';

EXECUTE(@select_statement );

Open in new window

rather than run one query for every field, why not run one query with one case expression per field? This assume [SID] is the column that identifies a row in the final output (and that bit is hardcoded) all other fields are dynamically handled.
    CREATE TABLE STUDENT(SID INT, SNAME VARCHAR(30),EMAIL VARCHAR(50) NULL, PHONE VARCHAR(10) NULL)
    
    INSERT INTO STUDENT SELECT 1,'PETER','peter@gmail.com','9167454542'
    INSERT INTO STUDENT SELECT 2,'GEORGE','geo@gmail.com',NULL
    INSERT INTO STUDENT SELECT 3,'LEO',NULL,'9167454542'
    INSERT INTO STUDENT SELECT 4,'HARRY',NULL,''

**Query 1**:

    DECLARE @sql AS nvarchar(MAX)
    DECLARE @flds AS nvarchar(MAX)
    
    
    SET @sql = ''
    
    
    + '  SELECT'
    + '         ''STUDENT'' as tname'
    + '       , sid AS row_identity'
    + '       , COLUMN_NAME'
    + '       , ISNULL'
    + '  FROM STUDENT'
    + '  CROSS apply ('
    + '              VALUES'
    
    SELECT
        @flds = stuff((
                        SELECT ', (''' + c.name + ''', CASE WHEN ' + c.name + ' IS NULL THEN 1 ELSE 0 END)'
                        FROM sys.columns AS c
                        WHERE t.Object_ID = c.Object_ID
                        FOR XML PATH ('')
                       ), 1, 1, '')
    FROM sys.tables t
    WHERE t.name = 'STUDENT'
    
    SET @sql = @sql
                + @flds
                + '              ) ca1 (COLUMN_NAME, ISNULL)'
                + '  WHERE ca1.ISNULL = 1'
    
    
    
    --SELECT @sql
    
    EXEC(@sql)
    	
    	
    

**[Results][2]**:
    
    |   TNAME | ROW_IDENTITY | COLUMN_NAME | ISNULL |
    |---------|--------------|-------------|--------|
    | STUDENT |            2 |       PHONE |      1 |
    | STUDENT |            3 |       EMAIL |      1 |
    | STUDENT |            4 |       EMAIL |      1 |



  [1]: http://sqlfiddle.com/#!3/90136d/2

Open in new window

BTW: the generated query looks like the following. It uses a technique employing VALUES to transform one row into many rows - here one row per field
SELECT
     'STUDENT' AS tname
    , sid AS row_identity
    , COLUMN_NAME
    , ISNULL
FROM STUDENT
CROSS APPLY (
    VALUES
          ( 'SID'    , CASE WHEN SID IS NULL THEN 1 ELSE 0 END    )
        , ( 'SNAME'  , CASE WHEN SNAME IS NULL THEN 1 ELSE 0 END  )
        , ( 'EMAIL'  , CASE WHEN EMAIL IS NULL THEN 1 ELSE 0 END  )
        , ( 'PHONE' , CASE WHEN PHONE IS NULL THEN 1 ELSE 0 END   )
    ) ca1(COLUMN_NAME, ISNULL)
WHERE ca1.ISNULL = 1

Open in new window

Avatar of David C

ASKER

Thank you all so much for your feedback. I have managed to get it working using,

declare @col varchar(255), @cmd varchar(max)

DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'

OPEN getinfo

FETCH NEXT FROM getinfo into @col

WHILE @@FETCH_STATUS = 0

BEGIN
    SELECT @cmd = 'IF EXISTS (SELECT * FROM STUDENT WHERE ([' + @col + '] IS NULL or [' + @col + '] = '''' ) and SID=4) BEGIN print ''' + @col + ''' end'
	--SELECT @cmd
	EXEC(@cmd)

    FETCH NEXT FROM getinfo into @col
END

CLOSE getinfo
DEALLOCATE getinfo

Open in new window


Is there anything I can do to change SID=4 to use a parameter instead? This way I can put this as a stored procedure.
I am sure there has to be a better way of writing that code, but to answer your question: Is there anything I can do to change SID=4 to use a parameter instead? Something like this should do it:
CREATE PROCEDURE usp_YourStoredProcedureNameGoesHere
		@SID integer

AS

declare @col varchar(255), @cmd varchar(max)

DECLARE getinfo cursor for
SELECT c.name FROM sys.tables t JOIN sys.columns c ON t.Object_ID = c.Object_ID
WHERE t.Name = 'STUDENT'

OPEN getinfo

FETCH NEXT FROM getinfo into @col

WHILE @@FETCH_STATUS = 0

BEGIN
    SELECT @cmd = 'IF EXISTS (SELECT * FROM STUDENT WHERE ([' + @col + '] IS NULL or [' + @col + '] = '''' ) and SID=' + CAST(@SID AS varchar(10)) + ' BEGIN print ''' + @col + ''' end'
	--SELECT @cmd
	EXEC(@cmd)

    FETCH NEXT FROM getinfo into @col
END

CLOSE getinfo
DEALLOCATE getinfo

Open in new window

Avatar of David C

ASKER

Thank Anthony, I am getting the error
Incorrect syntax near the keyword 'BEGIN'.

Open in new window


I am happy to take suggestions as I want the best and most efficient code. Please let me know yours and I'll be happy to test it out
@takwirirar, I think you should use an entire dynamic approach and avoid the cursor. E.g. a shorter version of my first version:

DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL';
DECLARE @result_statement NVARCHAR(MAX);

WITH    user_tables_with_nullable_columns
          AS ( SELECT   QUOTENAME(C.name) AS column_name
               FROM     sys.schemas S
                        INNER JOIN sys.tables T ON S.schema_id = T.schema_id
                        INNER JOIN sys.columns C ON T.object_id = C.object_id
               WHERE    C.is_nullable = 1
			AND S.name = 'dbo'
                        AND T.Name = 'STUDENT'
             ),
        column_selects
          AS ( SELECT   REPLACE(@test_statement, '@column_name', column_name) AS select_statement
               FROM     user_tables_with_nullable_columns
             )
    SELECT  @result_statement = STUFF(
					( 
						SELECT  ' UNION ALL ' + select_statement
						FROM    column_selects
						FOR XML PATH('')
					)
					, 1, 11, '');             

EXECUTE  ( @result_statement );

Open in new window

Avatar of David C

ASKER

Hi ste5an I have changed the test statement to

DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM @object_name WHERE (@column_name IS NULL or @column_name='''') and SID=4';

Open in new window


This allows me to get columns that are NULL or blank.

The results I get show as below. How do I change the query to get only the columns that are greater than zero(these are the ones I want)
User generated imageAlso how can I incorporate a parameter to get rid of SID=4
E.g.
DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL HAVING COUNT(*) != 0 ';

Open in new window


By using a parameter??

DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM [dbo].[STUDENT] WHERE @column_name IS NULL AND SID = @SID HAVING COUNT(*) != 0 ';

-- the remainging code from above except the EXEC(variable).

EXECUTE sp_executesql @result_statement, N'@SID INT', @SID = 4;

Open in new window

Avatar of David C

ASKER

Thanks this is how I currently have it
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[sp_executesql] 
	
AS
BEGIN
	-- SET NOCOUNT ON added to prevent extra result sets from
	-- interfering with SELECT statements.
	SET NOCOUNT ON;

    -- Insert statements for procedure here
	DECLARE @test_statement NVARCHAR(MAX) = N'SELECT ''@column_name'' AS column_name, COUNT(*) FROM @object_name WHERE (@column_name IS NULL or @column_name='''') and SID=@SID HAVING COUNT(*) != 0';
DECLARE @result_statement NVARCHAR(MAX);

WITH    user_tables_with_nullable_columns
          AS ( SELECT   T.object_id ,
                        QUOTENAME(S.name) + '.' + QUOTENAME(T.name) AS object_name ,
                        QUOTENAME(C.name) AS column_name
               FROM     sys.schemas S
                        INNER JOIN sys.tables T ON S.schema_id = T.schema_id
                        INNER JOIN sys.columns C ON T.object_id = C.object_id
               WHERE    T.type_desc = 'USER_TABLE'
                        AND C.is_nullable = 1
                        AND T.Name = 'STUDENT'
             ),
        column_selects
          AS ( SELECT   REPLACE(REPLACE(@test_statement, '@object_name', object_name), '@column_name', column_name) AS select_statement
               FROM     user_tables_with_nullable_columns
             )
    SELECT  @result_statement = STUFF(
										( 
											SELECT  ' UNION ALL ' + select_statement
											FROM    column_selects
											FOR XML PATH('')
										)
										, 1, 11, '');             

EXECUTE  ( @result_statement );
END

Open in new window


When I run
EXECUTE sp_executesql @result_statement, N'@SID INT', @SID = 4;

Open in new window


It errors with

Msg 137, Level 15, State 2, Line 1
Must declare the scalar variable "@result_statement".

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of ste5an
ste5an
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of David C

ASKER

Thanks for your help and patience. I am now getting

Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".
Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@object_name".

Open in new window

Thanks for your help and patience. I am now getting
Whose solution are you using?  The last one posted by ste5an does not include a variable called @object_name
Avatar of David C

ASKER

Thanks this works! Was getting a few other errors because of confusion with other stored procedures