Link to home
Start Free TrialLog in
Avatar of Wayne Burr
Wayne BurrFlag for United States of America

asked on

Best method to identify nulls in columns in all columns

I have a table with about 30+ columns.  Of all those columns I want to check every column to see if all the rows have nulls in them.
I want to display all the column names and if they were all nulls or not.

I started with a select statement with Case's but was not getting what I was hoping for so I figured there was a better way.

Any advice would be great!

Thanks in advance.

--Sample table
select AKAFirst,
AKAMiddle,
AKASurname,
AltAssess,
BirthCountry,
BirthPlace_City,
Birthplace_StateProvince,
CELDT_TestPurpose,
CMA_CSTTest,
CMA_ELA
from S_CA_STU_X;

--Display something like this:  (comments would not show ect)
AKAFirst    0               --this would be that all rows were null
AKAMiddle    8          --this would be 8 rows have data
AKASurname   15     --this wold be 15 rows have data
AltAssess   0              --this would be that all rows were null
...
Avatar of awking00
awking00
Flag of United States of America image

You might use a procedure like the following:
set serveroutput on
declare
v_sql varchar2(255);
v_cnt number;
begin
for c in
(select table_name, column_name
 from user_tab_columns
 where table_name = 'YOURTABLE'
 order by 1, 2)
loop
v_sql := 'select count('||c.column_name||') from '||c.table_name;
execute immediate v_sql into v_cnt;
dbms_output.put_line(c.column_name||'    '||v_cnt);
end loop;
end;
/
Avatar of Wayne Burr

ASKER

Thanks awking!

I dont think I want to go that route with a procedure if I can help it.  I am doing some testing on different machines and multiple databases. I have to port my query to each sid quickly, test, get results and get out of there ect.

This is what I came up with but worry that it might be much with 30+ columns:
select case when (select count("BIRTHCOUNTRY") from "PS"."S_CA_STU_X" group by "BIRTHCOUNTRY") > 1 then 1 else 0 end as Birthcountry
  from "PS"."S_CA_STU_X"
  group by Birthcountry;
So my other attempt did not work like I was hoping to either when I had values in the columns that were not null.   There was an error as I'm sure most of you would see why, so I changed it to this:  (But now I get multiple rows)
select case when (select count(akafirst) from "PS"."S_CA_STU_X" where akafirst is not null) > 1 then 1 else 0 end as AKAFirst,
  case when (select count(AKAMIDDLE) from "PS"."S_CA_STU_X" where akafirst is not null) > 1 then 1 else 0 end as AKAMIDDLE
  from "PS"."S_CA_STU_X"
  group by AKAFirst, AKAMIDDLE;

Data looks like this:
akafirst         akamiddle
1                     1
1                      1
1                      1
...
--for 300+ rows

I only want 1 row per column list to count the number of non nulls
ASKER CERTIFIED SOLUTION
Avatar of johnsone
johnsone
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you everybody!

Johnsone, Thank you  as this is exactly what I was looking for!

Laid it out and this is what it looks like:
select
sum(case when AKAFirst is Not null then 1 else 0 end) as AKAFirst,
 sum(case when AKAMIDDLE is Not null then 1 else 0 end) as AKAMIDDLE,
 sum(case when AKASURNAME is Not null then 1 else 0 end) as AKASURNAME,
 sum(case when BIRTHCOUNTRY is Not null then 1 else 0 end) as BIRTHCOUNTRY
 from "PS"."S_CA_STU_X" ;
The problem with trying to use a slelect is that your results will be horizontal like -
col1     col2      col3    .... col30+      
value   value   value   .... value
You might try using the procedure to load a table rather than printing out to the console.
Create table newtable(colname varchar2(30), notnulcount number);
Just change this
dbms_output.put_line(c.column_name||'    '||v_cnt);
to this
insert into newtable values(c.column_name,v_cnt);
Then you can query the table -
select * from newtable.
While I think the procedure way would work well and would get the results, I see two things.

A good thing, it is dynamic and will get the column names and figure everything out.

A bad thing, it passes the table once for each column.

If you wanted to go that route, I would have the procedure build a SQL statement for the entire table and then execute it.  Getting all the counts in one pass of the table.

The single SQL statement requires quite a bit of editing to make if you need multiple tables, but is a quick way to pass the table only once to get the results.