jose11au
asked on
SQL - Data quality records that are missing addresses
Hi All,
Could I please ask if this query is correct?
Basically I am looking for data quality issues where a record is missing address details. If either address_1, address_2 or address_3 is populated then the record is okay.
Could I please ask if this query is correct?
Basically I am looking for data quality issues where a record is missing address details. If either address_1, address_2 or address_3 is populated then the record is okay.
select *
from Sales
where address_1 Is Null or address_2 Is Null or address_3 Is Null
AND
First_name is not null or Last_name is not null
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
it's part of the arithmetic operation, just like:
(1 + 1) x 2 is different from 1 + (1 x 2)
(1 + 1) x 2 is different from 1 + (1 x 2)
ASKER
Thank you so much.
Not sure your logic is correct.
>>where (address_1 Is Null or address_2 Is Null or address_3 Is Null)
If all addresses are null, that returns true. So, it will return the row.
I might also suggest COALESCE. It takes multiple columns and returns the first non null value.
Something like this:
>>where (address_1 Is Null or address_2 Is Null or address_3 Is Null)
If all addresses are null, that returns true. So, it will return the row.
I might also suggest COALESCE. It takes multiple columns and returns the first non null value.
Something like this:
where coalesce(address_1, address_2, address_3) Is not Null
AND
coalesce(First_name, Last_name) is not null
I think you need something more like this:
SELECT *, STUFF(
CASE WHEN (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL) THEN ', Address is missing' ELSE '' END +
CASE WHEN (First_name IS NULL AND Last_name IS NULL) THEN ', Name is missing' ELSE '' END, 1, 2, '') AS what_data_is_bad
FROM dbo.Sales
WHERE (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL)
OR (First_name IS NULL AND Last_name IS NULL)
For example:
;WITH Sales AS (
SELECT * FROM (VALUES
(1, 'address_1', NULL, NULL, 'first_name', NULL),
(2, NULL, NULL, NULL, NULL, 'last_name'),
(3, NULL, NULL, 'address_3', NULL, NULL),
(4, NULL, NULL, NULL, NULL, NULL),
(5, NULL, NULL, 'address_3', NULL, 'last_name')
) AS sales(id, address_1, address_2, address_3, first_name, last_name)
)
SELECT *, STUFF(
CASE WHEN (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL) THEN ', Address is missing' ELSE '' END +
CASE WHEN (First_name IS NULL AND Last_name IS NULL) THEN ', Name is missing' ELSE '' END, 1, 2, '') AS what_data_is_bad
FROM Sales
WHERE (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL)
OR (First_name IS NULL AND Last_name IS NULL)
SELECT *, STUFF(
CASE WHEN (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL) THEN ', Address is missing' ELSE '' END +
CASE WHEN (First_name IS NULL AND Last_name IS NULL) THEN ', Name is missing' ELSE '' END, 1, 2, '') AS what_data_is_bad
FROM dbo.Sales
WHERE (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL)
OR (First_name IS NULL AND Last_name IS NULL)
For example:
;WITH Sales AS (
SELECT * FROM (VALUES
(1, 'address_1', NULL, NULL, 'first_name', NULL),
(2, NULL, NULL, NULL, NULL, 'last_name'),
(3, NULL, NULL, 'address_3', NULL, NULL),
(4, NULL, NULL, NULL, NULL, NULL),
(5, NULL, NULL, 'address_3', NULL, 'last_name')
) AS sales(id, address_1, address_2, address_3, first_name, last_name)
)
SELECT *, STUFF(
CASE WHEN (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL) THEN ', Address is missing' ELSE '' END +
CASE WHEN (First_name IS NULL AND Last_name IS NULL) THEN ', Name is missing' ELSE '' END, 1, 2, '') AS what_data_is_bad
FROM Sales
WHERE (address_1 IS NULL AND address_2 IS NULL AND address_3 IS NULL)
OR (First_name IS NULL AND Last_name IS NULL)
ASKER
I am learning SQL