Jim Horn
asked on
Single row for Count(column is NULL), Count(all rows)
Hi all
Does anyone have some quick and dirty (copyright jimpen) T-SQL to return a set with a single row that contains a count of all rows where a specific value is NULL (say column_name), and a count of the rows in the entire table (say table_name)?
I've put together the below CTE, which works fine, but it seems like there's a more elegant way to do this that I'm not grasping.
Thanks in advance.
Jim
Does anyone have some quick and dirty (copyright jimpen) T-SQL to return a set with a single row that contains a count of all rows where a specific value is NULL (say column_name), and a count of the rows in the entire table (say table_name)?
I've put together the below CTE, which works fine, but it seems like there's a more elegant way to do this that I'm not grasping.
Thanks in advance.
Jim
;
WITH m as (
SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
FROM table_name
WHERE column_name IS NULL)
, a as (
SELECT 'Account' as label, COUNT(id) as row_count_all
FROM table_name)
SELECT
m.label,
m.row_count_column_name_missing,
a.row_count_all,
CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2)) as pct_missing
FROM m
JOIN a ON m.label = a.label
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Too many irons in the fire this morning, Thanks for tidying it up Scott! :-)
ASKER
Simpler yes, but on a table with 1m rows this takes 7 seconds vs. 1 second for the CTE approach.
Tinkering, tinkering..
Tinkering, tinkering..
Sorry JIm, I wasn't 'reading into' the question.
select 'account' as label, count(*) as row_count_all, count(column_name) as row_count_column_name_miss ing,
cast((count(column_name)/c ount(*)) * 100 as numeric(5,2)) as pct_missing
from yourtable;
cast((count(column_name)/c
from yourtable;
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Actually, I guess row_count_column_name_miss ing should be count(*) - count(column_name).
ASKER
Cleared the cache, set statistics io and time on, and reran both my proposed and the new code here.
New code is half the elapsed time and simpler, so I'll go with that.
Code #1 - My original
Code #2 - Proposed here, with minor modificaitons
Results #2
New code is half the elapsed time and simpler, so I'll go with that.
Code #1 - My original
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON
;
WITH m as (
SELECT 'Account' as label, COUNT(id) as row_count_column_name_missing
FROM SF_Account_1
WHERE Address1_BioIQ__c IS NULL)
, a as (
SELECT 'Account' as label, COUNT(id) as row_count_all
FROM SF_Account_1)
SELECT
m.label,
m.row_count_column_name_missing,
a.row_count_all,
-- CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4))) * 100 as numeric(5,2)) as pct_missing
100 - CAST(m.row_count_column_name_missing / CAST(row_count_all as numeric(19,4)) * 100 as numeric(5,2))
FROM m
JOIN a ON m.label = a.label
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
Results #1
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(1 row(s) affected)
Table 'SF_Account_1'. Scan count 34, logical reads 636238, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2651 ms, elapsed time = 577 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Code #2 - Proposed here, with minor modificaitons
DBCC FREEPROCCACHE
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT
a.col_name_null_count,
a.total_rows_in_table,
a.col_name_null_count / CAST(a.total_rows_in_table as numeric(19,4)) as pct_missing
FROM (
select
SUM(CASE WHEN Address1_BioIQ__c IS NULL then 1 else 0 end ) AS col_name_null_count,
COUNT(*) as total_rows_in_table
FROM SF_Account_1) a
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
Results #2
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(1 row(s) affected)
Table 'SF_Account_1'. Scan count 17, logical reads 318119, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2701 ms, elapsed time = 256 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.