Solved

Single row for Count(column is NULL), Count(all rows)

Posted on 2014-12-22
Last Modified: 2015-01-05
Hi all

Does anyone have some quick and dirty (copyright jimpen) T-SQL to return a set with a single row that contains a count of all rows where a specific value is NULL (say column_name), and a count of the rows in the entire table (say table_name)?

I've put together the below CTE, which works fine, but it seems like there's a more elegant way to do this that I'm not grasping.

Thanks in advance.
Jim

;
WITH m as (
	SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
	FROM table_name
	WHERE column_name IS NULL)
, a as (	
	SELECT 'Account' as label, COUNT(id) as row_count_all
	FROM table_name) 
SELECT 
	m.label, 
	m.row_count_column_name_missing, 
	a.row_count_all,
	CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2)) as pct_missing
FROM m
	JOIN a ON m.label = a.label

Question by:Jim Horn
10 Comments
 
LVL 10

Assisted Solution

by:Ray
Ray earned 150 total points
ID: 40513198
Not sure I'd call this elegant, but simpler for sure.

I didn't have time to test this, so the case statement may need a slight syntax adjustment.

select  sum(case when Col_Name NULL then 1 else 0 end ) ,  count(*)
from  Table_Name
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 275 total points
ID: 40513211
Yep, with a very slight syntax change (no pts for me please):

select  sum(case when Col_Name IS NULL then 1 else 0 end ) AS col_name_null_count,  count(*) as total_rows_in_table
from  Table_Name
 
LVL 10

Expert Comment

by:Ray
ID: 40513214
Too many irons in the fire this morning. Thanks for tidying it up, Scott! :-)
 
LVL 65

Author Comment

by:Jim Horn
ID: 40513235
Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach.

Tinkering, tinkering..
 
LVL 10

Expert Comment

by:Ray
ID: 40513311
Sorry Jim, I wasn't 'reading into' the question.
 
LVL 32

Expert Comment

by:awking00
ID: 40513482
select 'account' as label, count(*) as row_count_all, count(column_name) as row_count_column_name_missing,
cast((count(column_name)/count(*)) * 100 as numeric(5,2)) as pct_missing
from yourtable;
 
LVL 69

Accepted Solution

by:Scott Pletcher
Scott Pletcher earned 275 total points
ID: 40514051
>> Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach. <<

That's odd, because the CTEs should scan the table twice and the single query only once.  Perhaps the other query was run first, and the CTE, run second, used what was already in the buffers.

You can simplify the query to:
select count(*), count(column_name)
since count(column_name) ignores NULLs anyway (the NULL count is then count(*) - count(column_name)).  Possibly an old optimizer might scan the table twice for the "case(...)" version.

Better than time is to look at the query plan and/or compare logical I/O counts:
SET STATISTICS IO ON
before running the queries.

Be sure to ignore the first run no matter which method is used.  That allows the rows to get into buffers.

Then you can compare I/O and even times, although elapsed time is affected by many things and thus doesn't necessarily directly indicate the overhead in a given query.
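
As a rough sketch of that comparison (using the placeholder names table_name and column_name from the question, so adjust to the real table), both candidates can be run in one batch with I/O and time statistics switched on. Run the batch twice and read only the second run's numbers:

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate A: COUNT(column_name) skips NULLs, so this returns the non-NULL count
SELECT COUNT(*)           AS total_rows_in_table,
       COUNT(column_name) AS col_name_not_null_count
FROM   table_name;

-- Candidate B: the CASE version, which counts the NULL rows directly
SELECT SUM(CASE WHEN column_name IS NULL THEN 1 ELSE 0 END) AS col_name_null_count,
       COUNT(*)                                             AS total_rows_in_table
FROM   table_name;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The "logical reads" figure on the Messages tab is the number to compare, since it does not depend on what else the server is doing at the time.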
 
LVL 50

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 75 total points
ID: 40514619
>> Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach. <<
Strange. Are you sure that isn't a cache issue?
 
LVL 32

Expert Comment

by:awking00
ID: 40514784
Actually, I guess row_count_column_name_missing should be count(*) - count(column_name).
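
Putting that correction into the earlier query gives something like the sketch below. It keeps the placeholder names (yourtable, column_name) from that post. Note one further change that is an assumption about the intent: the earlier version divides one integer count by another, which truncates to a whole number in T-SQL, so this version multiplies by 100.0 first to force decimal arithmetic:

```sql
SELECT 'account'                      AS label,
       COUNT(*)                       AS row_count_all,
       -- rows where column_name IS NULL = all rows minus non-NULL rows
       COUNT(*) - COUNT(column_name)  AS row_count_column_name_missing,
       -- 100.0 forces decimal division; NULLIF avoids divide-by-zero on an empty table
       CAST(100.0 * (COUNT(*) - COUNT(column_name)) / NULLIF(COUNT(*), 0)
            AS numeric(5,2))          AS pct_missing
FROM   yourtable;
```

On an empty table pct_missing comes back NULL rather than raising an error.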
 
LVL 65

Author Comment

by:Jim Horn
ID: 40531578
Cleared the cache, set statistics io and time on, and reran both my proposed and the new code here.
New code is half the elapsed time and simpler, so I'll go with that.

Code #1 - My original
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON                     
;
WITH m as (
	SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
	FROM SF_Account_1
	WHERE Address1_BioIQ__c IS NULL)
, a as (	
	SELECT 'Account' as label, COUNT(id) as row_count_all
	FROM SF_Account_1) 
SELECT 
	m.label, 
	m.row_count_column_name_missing, 
	a.row_count_all,
	-- CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4))) * 100 as numeric(5,2)) as pct_missing
	100 - CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2))
FROM m
	JOIN a ON m.label = a.label

SET STATISTICS IO OFF
SET STATISTICS TIME OFF


Results #1
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1 row(s) affected)
Table 'SF_Account_1'. Scan count 34, logical reads 636238, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 2651 ms,  elapsed time = 577 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.



Code #2 - Proposed here, with minor modifications
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON  
 
SELECT 
	a.col_name_null_count, 
	a.total_rows_in_table, 
	a.col_name_null_count / CAST(a.total_rows_in_table as numeric(19,4)) as pct_missing
FROM (
	select 
		SUM(CASE WHEN Address1_BioIQ__c IS NULL then 1 else 0 end ) AS col_name_null_count,  
		COUNT(*) as total_rows_in_table
	FROM SF_Account_1) a
                           
SET STATISTICS IO OFF
SET STATISTICS TIME OFF



Results #2
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(1 row(s) affected)
Table 'SF_Account_1'. Scan count 17, logical reads 318119, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 2701 ms,  elapsed time = 256 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
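
A note on the cache question raised earlier: both result sets show physical reads 0, so the table pages were already in memory for each run. DBCC FREEPROCCACHE clears cached query plans, not data pages. For a cold-cache comparison, a sketch along these lines could precede each test; it assumes a test server where flushing memory is acceptable:

```sql
-- Test server only: the two DBCC commands below affect the whole instance.
CHECKPOINT;              -- write the current database's dirty pages to disk so they become clean
DBCC DROPCLEANBUFFERS;   -- remove clean pages from the buffer pool (the data cache)
DBCC FREEPROCCACHE;      -- clear cached query plans, as in the runs above
```

With the buffers emptied, the first run of each query should report non-zero physical reads, which makes the two methods comparable from the same starting point.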
