SQL Scriprt find non matching records

SQLSearcher
SQLSearcher used Ask the Experts™
on
Hello Experts Exchange
I have a SQL query to find non matching records between two tables.  The tables I have is TableA which has employee's and their booked hours, TableB which is employee's clocked hours.

I am finding that in TableA there is more booked hours than clocked hours, but I am unable to find the over booked hours using a query.  This is because there is no record in TableB.

How I match records is by the First Name=First Name, Surname=Surname, and Date=Date.

Here is a query I wrote to find non matching records. But it does not find all non matching records.

Select a.[First Name],a.[Surname],
		a.[Clock Number],a.[Cost Centre], b.Location,
		a.[shift],
		Convert(varchar,a.[Date],103) as [Date],
		a.[TimeCal] as [Losses Attendance Hours],b.[Hours] as [PrimetimeHours], 
		a.[TimeCal] - b.[Hours] as [Difference]
from [dbo].[T&A_Temp] a
	left join [dbo].[T&A_Temp_Primetime] b 
		on a.[First Name]= b.[First Name] 
		and a.surname = b.surname
		and a.[date] = b.[date]
Where a.[TimeCal] > 0
and b.[Hours] is null
order by a.[Cost Centre],[shift]

Open in new window


How can I change this query to get me the over booked hours?

Regards

SQLSearcher
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
Try this:
Select a.[First Name],a.[Surname],
		a.[Clock Number],a.[Cost Centre], b.Location,
		a.[shift],
		Convert(varchar,a.[Date],103) as [Date],
		a.[TimeCal] as [Losses Attendance Hours],
                b.[Hours] as [PrimetimeHours], 
		a.[TimeCal] - coalesce(b.[Hours],0) as [Difference]
from [dbo].[T&A_Temp] a
	left join [dbo].[T&A_Temp_Primetime] b 
		on a.[First Name]= b.[First Name] 
		and a.surname = b.surname
		and a.[date] = b.[date]
Where a.[TimeCal] > 0
and (b.[Hours] is null or b.[hours]<>a.[timecall])
order by a.[Cost Centre],[shift]

Open in new window

Vitor MontalvãoMSSQL Senior Engineer
Distinguished Expert 2017

Commented:
I don't know why are you trying to select field from TableB since they should be NULL (no exist, right?).
Try this query and check if is what you need:
Select a.[First Name],a.[Surname], a.[Clock Number],a.[Cost Centre], a.[shift],
		Convert(varchar,a.[Date],103) as [Date], a.[TimeCal] as [Losses Attendance Hours]
from [dbo].[T&A_Temp] a
Where a.[TimeCal] > 0 and 
	Not Exists (SELECT 1 
				FROM [dbo].[T&A_Temp_Primetime] b 
				WHERE a.[First Name]= b.[First Name] and a.surname = b.surname and a.[date] = b.[date]
					and b.[Hours] is null)
order by a.[Cost Centre],a.[shift]

Open in new window

Author

Commented:
Hello Vitor
Yes you are right I do not need columns from TableB as they will be null.

Your script does not work it brings back matching records, it returned 2345 records, and I'm expecting to see less than 100 records.

Regards

SQLSearcher
Should you be charging more for IT Services?

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Vitor MontalvãoMSSQL Senior Engineer
Distinguished Expert 2017

Commented:
Can you please post a sample of data so we can make tests?
Just a subset of the data and then you need to mark which one matches and which one not.

Commented:
Is it the case that non of the non-matched records exist in table B, or will some exist there but have a different value for hours?

Author

Commented:
Hello Vitor
Please find Sample Data attached in Excel file, I have labelled the sheets TableA and TableB.

Some Records are missing for the employee Rafal Dudek.

Regards

SQLSearcher
Sample-Data.xls

Author

Commented:
Hello SimonAdept
The data I am after is where a record does not exist in TableB.  I have already got a query to find where hours do not match up.

Regards

SQLSearcher
Vitor MontalvãoMSSQL Senior Engineer
Distinguished Expert 2017

Commented:
Which are the records that should be returned by the query?
TABLE A:
Rafal      Dudek      0      0      0      01.11.2014      0      Htr6 (586)      NULL      Red      0      2131
Rafal      Dudek      0      0      0      02.11.2014      0      Htr6 (586)      NULL      Red      0      2132
Rafal      Dudek      0      0      7.5      03.11.2014      7.5      Htr6 (586)      NULL      Red      0      2133
Rafal      Dudek      0      0      7.5      04.11.2014      7.5      Htr6 (586)      NULL      Red      0      2134
Rafal      Dudek      0      0      7.5      05.11.2014      7.5      Htr6 (586)      NULL      Red      0      2135
Rafal      Dudek      0      0      7.5      06.11.2014      7.5      Htr6 (586)      NULL      Red      0      2136
Rafal      Dudek      0      0      7.5      07.11.2014      7.5      Htr6 (586)      NULL      Red      0      2137
Rafal      Dudek      0      0      0      08.11.2014      0      Htr6 (586)      NULL      Red      0      2138
Rafal      Dudek      5      0      0      09.11.2014      5      Htr6 (586)      NULL      Red      0      2139
Rafal      Dudek      0      0      0      10.11.2014      0      Htr6 (586)      NULL      Red      0      2140
Rafal      Dudek      0      0      7.5      11.11.2014      7.5      Htr6 (586)      NULL      Red      0      2141
Rafal      Dudek      0      0      7.5      12.11.2014      7.5      Htr6 (586)      NULL      Red      0      2142
Rafal      Dudek      0      0      7.5      13.11.2014      7.5      Htr6 (586)      NULL      Red      0      2143
Rafal      Dudek      0      0      7.5      14.11.2014      7.5      Htr6 (586)      NULL      Red      0      2144

or TABLE B:
586 - Heater 6      Morning      Rafal Dudek      02.11.2014      0      Rafal      Dudek      NULL      506
586 - Heater 6      Morning      Rafal Dudek      01.11.2014      0      Rafal      Dudek      NULL      505
586 - Heater 6      Afternoon      Rafal Dudek      03.11.2014      7.5      Rafal      Dudek      NULL      2621

Author

Commented:
Hi Vitor
These should be the ones that return.
TableA
Rafal      Dudek      0      0      7.5      04.11.2014      7.5      Htr6 (586)      NULL      Red      0      2134
 Rafal      Dudek      0      0      7.5      05.11.2014      7.5      Htr6 (586)      NULL      Red      0      2135
 Rafal      Dudek      0      0      7.5      06.11.2014      7.5      Htr6 (586)      NULL      Red      0      2136
 Rafal      Dudek      0      0      7.5      07.11.2014      7.5      Htr6 (586)      NULL      Red      0      2137
 Rafal      Dudek      0      0      0      08.11.2014      0      Htr6 (586)      NULL      Red      0      2138
 Rafal      Dudek      5      0      0      09.11.2014      5      Htr6 (586)      NULL      Red      0      2139
 Rafal      Dudek      0      0      0      10.11.2014      0      Htr6 (586)      NULL      Red      0      2140
 Rafal      Dudek      0      0      7.5      11.11.2014      7.5      Htr6 (586)      NULL      Red      0      2141
 Rafal      Dudek      0      0      7.5      12.11.2014      7.5      Htr6 (586)      NULL      Red      0      2142
 Rafal      Dudek      0      0      7.5      13.11.2014      7.5      Htr6 (586)      NULL      Red      0      2143
 Rafal      Dudek      0      0      7.5      14.11.2014      7.5      Htr6 (586)      NULL      Red      0      2144

Regards

SQLSearcher
Commented:
SELECT 
TableA.MergeID, TableA.Date
, TableA.FirstName
, TableA.Surname
, TableA.Attendance
, TableB.Hours
FROM TableA LEFT JOIN TableB ON (TableA.FirstName = TableB.[First Name]) AND (TableA.Surname = TableB.Surname) AND (TableA.[Date] = TableB.[Date])
WHERE (((TableB.Date) Is Null));

Open in new window


Returns 39 records. 8 of those have more than zero hours in the attendance column

This query gets all cols from TableA where no record in TableB and > 0 in attendance col
SELECT TableA.*, TableA.Attendance
FROM TableA LEFT JOIN TableB ON (TableA.FirstName = TableB.[First Name]) AND (TableA.Surname = TableB.Surname) AND (TableA.[Date] = TableB.[Date])
WHERE (((TableB.Date) Is Null) AND ((TableA.Attendance)>0));

Open in new window

Vitor MontalvãoMSSQL Senior Engineer
Distinguished Expert 2017
Commented:
But in your main post you have the following criteria:
a.[TimeCal] > 0 and b.[Hours] is null
Since you don't have null values in b.Hours that's why is returning more that expected. If you don't want to have those filter you can't remove them, like this:
Select a.[First Name],a.[Surname], a.[Clock Number],a.[Cost Centre], a.[shift],
		Convert(varchar,a.[Date],103) as [Date], a.[TimeCal] as [Losses Attendance Hours]
from [dbo].[T&A_Temp] a
Where Not Exists (SELECT 1 
		FROM [dbo].[T&A_Temp_Primetime] b 
		WHERE a.[First Name]= b.[First Name] and a.surname = b.surname and a.[date] = b.[date])
order by a.[Cost Centre],a.[shift]

Open in new window

Hello Guys
Thank you for the queries, after trying them I realised that it is duplicate records that are my problem. I have now found a solution, so will give you both a share of the points.

Regards

SQLSearcher

Author

Commented:
The question process helped me to find the issue.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial