Single row for Count(column is NULL), Count(all rows)

Hi all

Does anyone have some quick and dirty (copyright jimpen) T-SQL to return a single-row set that contains a count of the rows where a specific column (say column_name) is NULL, and a count of all rows in the table (say table_name)?

I've put together the CTE below, which works fine, but it seems like there's a more elegant way to do this that I'm not grasping.

Thanks in advance.
Jim

;
WITH m as (
	SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
	FROM table_name
	WHERE column_name IS NULL)
, a as (	
	SELECT 'Account' as label, COUNT(id) as row_count_all
	FROM table_name) 
SELECT 
	m.label, 
	m.row_count_column_name_missing, 
	a.row_count_all,
	CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2)) as pct_missing
FROM m
	JOIN a ON m.label = a.label

Jim Horn (Microsoft SQL Server Developer, Architect, and Author) asked:
 
Scott Pletcher (Senior DBA) commented:
>> Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach. <<

That's odd, because the CTE version should scan the table twice and the single query only once.  Perhaps the other query was run first, and the CTE, run second, benefited from pages that were already in the buffers.

You can simplify the query to:
select count(*), count(column_name)
since count(column_name) ignores NULLs anyway, so the NULL count is count(*) - count(column_name).  It's possible that an old optimizer would scan the table twice for the "case(...)" version.
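
As a rough sketch of that idea: COUNT(column_name) skips NULLs, so subtracting it from COUNT(*) gives the number of NULL rows in a single pass. The table variable @t and its sample rows are made up for illustration; only column_name comes from the question.

```sql
-- Hypothetical demo data; @t and its rows are illustrative only.
DECLARE @t TABLE (id int NOT NULL, column_name varchar(10) NULL);
INSERT INTO @t (id, column_name)
VALUES (1, 'a'), (2, NULL), (3, 'c'), (4, NULL);

SELECT
    COUNT(*) - COUNT(column_name) AS row_count_column_name_missing, -- 2
    COUNT(*)                      AS row_count_all,                 -- 4
    -- 100.0 forces decimal arithmetic; NULLIF guards against an empty table
    CAST((COUNT(*) - COUNT(column_name)) * 100.0 / NULLIF(COUNT(*), 0)
         AS numeric(5,2))         AS pct_missing                    -- 50.00
FROM @t;
```

With ANSI warnings on, SQL Server may print "Null value is eliminated by an aggregate", which is informational here.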

A better measure than elapsed time is the query plan and/or the logical I/O counts.  Run
SET STATISTICS IO ON
before running the queries.

Be sure to ignore the first run no matter which method is used.  That allows the rows to get into buffers.

Then you can compare I/O and even times, although elapsed time is affected by many things and thus doesn't necessarily directly indicate the overhead in a given query.
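
A minimal sketch of that comparison, assuming a throwaway temp table (#demo and its rows are hypothetical stand-ins for the real table). The warm-up query is run first and ignored, then both candidates run with STATISTICS IO on so their "logical reads" lines can be compared in the Messages output.

```sql
-- Hypothetical stand-in for table_name; names and rows are illustrative only.
CREATE TABLE #demo (id int IDENTITY(1,1) PRIMARY KEY, column_name varchar(10) NULL);
INSERT INTO #demo (column_name) VALUES ('a'), (NULL), ('c'), (NULL);

-- Warm-up run: brings the pages into the buffers; ignore its numbers.
SELECT COUNT(*) FROM #demo;

SET STATISTICS IO ON;

-- Candidate 1: one pass with SUM(CASE ...)
SELECT SUM(CASE WHEN column_name IS NULL THEN 1 ELSE 0 END) AS null_count,
       COUNT(*) AS total_rows
FROM #demo;

-- Candidate 2: two CTEs, each reading the table once
WITH m AS (SELECT COUNT(id) AS null_count FROM #demo WHERE column_name IS NULL),
     a AS (SELECT COUNT(id) AS total_rows FROM #demo)
SELECT m.null_count, a.total_rows
FROM m CROSS JOIN a;  -- both CTEs return exactly one row

SET STATISTICS IO OFF;
DROP TABLE #demo;
```

On a table this small the numbers are trivial; the point is only the shape of the test, which should carry over to the real table.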
 
Ray (Data Analyst) commented:
Not sure I'd call this elegant, but simpler for sure.

I didn't have time to test this, so the case statement may need a slight syntax adjustment.

select  sum(case when Col_Name NULL then 1 else 0 end ) ,  count(*)
from  Table_Name
 
Scott Pletcher (Senior DBA) commented:
Yep, with a very slight syntax change (no pts for me please):

select  sum(case when Col_Name IS NULL then 1 else 0 end) AS col_name_null_count,  count(*) AS total_rows_in_table
from  Table_Name
 
Ray (Data Analyst) commented:
Too many irons in the fire this morning. Thanks for tidying it up, Scott! :-)
 
Jim Horn (Microsoft SQL Server Developer, Architect, and Author; question author) commented:
Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach.

Tinkering, tinkering..
 
Ray (Data Analyst) commented:
Sorry Jim, I wasn't 'reading into' the question.
 
awking00 commented:
select 'account' as label, count(*) as row_count_all, count(column_name) as row_count_column_name_missing,
-- multiply by 100.0 before dividing; count()/count() alone is integer division
cast(count(column_name) * 100.0 / count(*) as numeric(5,2)) as pct_missing
from yourtable;
 
Vitor Montalvão (MSSQL Senior Engineer) commented:
>> Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach. <<
Strange. Are you sure that isn't a cache issue?
 
awking00 commented:
Actually, I guess row_count_column_name_missing should be count(*) - count(column_name).
 
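Jim Horn (Microsoft SQL Server Developer, Architect, and Author; question author) commented:
Cleared the plan cache (DBCC FREEPROCCACHE), set STATISTICS IO and TIME on, and reran both my original code and the new code proposed here.
The new code takes about half the elapsed time (256 ms vs. 577 ms) with half the logical reads (318,119 vs. 636,238), and it's simpler, so I'll go with that.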

Code #1 - My original
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON                     
;
WITH m as (
	SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
	FROM SF_Account_1
	WHERE Address1_BioIQ__c IS NULL)
, a as (	
	SELECT 'Account' as label, COUNT(id) as row_count_all
	FROM SF_Account_1) 
SELECT 
	m.label, 
	m.row_count_column_name_missing, 
	a.row_count_all,
	-- CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2)) as pct_missing
	-- 100 minus the percent missing, i.e. the percent of rows where the column is populated
	100 - CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2)) as pct_populated
FROM m
	JOIN a ON m.label = a.label

SET STATISTICS IO OFF
SET STATISTICS TIME OFF

Results #1
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1 row(s) affected)
Table 'SF_Account_1'. Scan count 34, logical reads 636238, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 2651 ms,  elapsed time = 577 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.


Code #2 - Proposed here, with minor modifications
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON  
 
SELECT 
	a.col_name_null_count, 
	a.total_rows_in_table, 
	a.col_name_null_count / CAST(a.total_rows_in_table as numeric(19,4)) as pct_missing  -- a fraction between 0 and 1 (not multiplied by 100)
FROM (
	select 
		SUM(CASE WHEN Address1_BioIQ__c IS NULL then 1 else 0 end ) AS col_name_null_count,  
		COUNT(*) as total_rows_in_table
	FROM SF_Account_1) a
                           
SET STATISTICS IO OFF
SET STATISTICS TIME OFF


Results #2
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1 row(s) affected)
Table 'SF_Account_1'. Scan count 17, logical reads 318119, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 2701 ms,  elapsed time = 256 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
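
Both result sets above report physical reads 0, which suggests the data pages were already in memory for both runs. DBCC FREEPROCCACHE clears cached plans, not data pages. If a cold-cache comparison is wanted, a sketch along these lines might work, assuming a test server where clearing instance-wide caches is acceptable; table_name and column_name are the placeholders from the question, not real objects.

```sql
-- Assumption: test server only; these commands affect the whole instance.
CHECKPOINT;             -- write dirty pages to disk so they count as clean
DBCC DROPCLEANBUFFERS;  -- drop clean pages from the buffer pool (cold data cache)
DBCC FREEPROCCACHE;     -- clear cached plans, as in the runs above

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- table_name / column_name are placeholders from the question.
SELECT
    SUM(CASE WHEN column_name IS NULL THEN 1 ELSE 0 END) AS col_name_null_count,
    COUNT(*) AS total_rows_in_table
FROM table_name;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Repeating the same three clearing commands before each candidate query keeps the starting conditions comparable; with a cold cache, physical or read-ahead reads should no longer be 0.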
