Solved

SQL Scriprt find non matching records

Posted on 2014-12-03
13
132 Views
Last Modified: 2014-12-08
Hello Experts Exchange
I have a SQL query to find non matching records between two tables.  The tables I have is TableA which has employee's and their booked hours, TableB which is employee's clocked hours.

I am finding that in TableA there is more booked hours than clocked hours, but I am unable to find the over booked hours using a query.  This is because there is no record in TableB.

How I match records is by the First Name=First Name, Surname=Surname, and Date=Date.

Here is a query I wrote to find non matching records. But it does not find all non matching records.

Select a.[First Name],a.[Surname],
		a.[Clock Number],a.[Cost Centre], b.Location,
		a.[shift],
		Convert(varchar,a.[Date],103) as [Date],
		a.[TimeCal] as [Losses Attendance Hours],b.[Hours] as [PrimetimeHours], 
		a.[TimeCal] - b.[Hours] as [Difference]
from [dbo].[T&A_Temp] a
	left join [dbo].[T&A_Temp_Primetime] b 
		on a.[First Name]= b.[First Name] 
		and a.surname = b.surname
		and a.[date] = b.[date]
Where a.[TimeCal] > 0
and b.[Hours] is null
order by a.[Cost Centre],[shift]

Open in new window


How can I change this query to get me the over booked hours?

Regards

SQLSearcher
0
Comment
Question by:SQLSearcher
  • 6
  • 4
  • 3
13 Comments
 
LVL 18

Expert Comment

by:SimonAdept
ID: 40478173
Try this:
Select a.[First Name],a.[Surname],
		a.[Clock Number],a.[Cost Centre], b.Location,
		a.[shift],
		Convert(varchar,a.[Date],103) as [Date],
		a.[TimeCal] as [Losses Attendance Hours],
                b.[Hours] as [PrimetimeHours], 
		a.[TimeCal] - coalesce(b.[Hours],0) as [Difference]
from [dbo].[T&A_Temp] a
	left join [dbo].[T&A_Temp_Primetime] b 
		on a.[First Name]= b.[First Name] 
		and a.surname = b.surname
		and a.[date] = b.[date]
Where a.[TimeCal] > 0
and (b.[Hours] is null or b.[hours]<>a.[timecall])
order by a.[Cost Centre],[shift]

Open in new window

0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40478287
I don't know why are you trying to select field from TableB since they should be NULL (no exist, right?).
Try this query and check if is what you need:
Select a.[First Name],a.[Surname], a.[Clock Number],a.[Cost Centre], a.[shift],
		Convert(varchar,a.[Date],103) as [Date], a.[TimeCal] as [Losses Attendance Hours]
from [dbo].[T&A_Temp] a
Where a.[TimeCal] > 0 and 
	Not Exists (SELECT 1 
				FROM [dbo].[T&A_Temp_Primetime] b 
				WHERE a.[First Name]= b.[First Name] and a.surname = b.surname and a.[date] = b.[date]
					and b.[Hours] is null)
order by a.[Cost Centre],a.[shift]

Open in new window

0
 

Author Comment

by:SQLSearcher
ID: 40478305
Hello Vitor
Yes you are right I do not need columns from TableB as they will be null.

Your script does not work it brings back matching records, it returned 2345 records, and I'm expecting to see less than 100 records.

Regards

SQLSearcher
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40478312
Can you please post a sample of data so we can make tests?
Just a subset of the data and then you need to mark which one matches and which one not.
0
 
LVL 18

Expert Comment

by:SimonAdept
ID: 40478352
Is it the case that non of the non-matched records exist in table B, or will some exist there but have a different value for hours?
0
 

Author Comment

by:SQLSearcher
ID: 40478360
Hello Vitor
Please find Sample Data attached in Excel file, I have labelled the sheets TableA and TableB.

Some Records are missing for the employee Rafal Dudek.

Regards

SQLSearcher
Sample-Data.xls
0
Complete Microsoft Windows PC® & Mac Backup

Backup and recovery solutions to protect all your PCs & Mac– on-premises or in remote locations. Acronis backs up entire PC or Mac with patented reliable disk imaging technology and you will be able to restore workstations to a new, dissimilar hardware in minutes.

 

Author Comment

by:SQLSearcher
ID: 40478368
Hello SimonAdept
The data I am after is where a record does not exist in TableB.  I have already got a query to find where hours do not match up.

Regards

SQLSearcher
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40478388
Which are the records that should be returned by the query?
TABLE A:
Rafal      Dudek      0      0      0      01.11.2014      0      Htr6 (586)      NULL      Red      0      2131
Rafal      Dudek      0      0      0      02.11.2014      0      Htr6 (586)      NULL      Red      0      2132
Rafal      Dudek      0      0      7.5      03.11.2014      7.5      Htr6 (586)      NULL      Red      0      2133
Rafal      Dudek      0      0      7.5      04.11.2014      7.5      Htr6 (586)      NULL      Red      0      2134
Rafal      Dudek      0      0      7.5      05.11.2014      7.5      Htr6 (586)      NULL      Red      0      2135
Rafal      Dudek      0      0      7.5      06.11.2014      7.5      Htr6 (586)      NULL      Red      0      2136
Rafal      Dudek      0      0      7.5      07.11.2014      7.5      Htr6 (586)      NULL      Red      0      2137
Rafal      Dudek      0      0      0      08.11.2014      0      Htr6 (586)      NULL      Red      0      2138
Rafal      Dudek      5      0      0      09.11.2014      5      Htr6 (586)      NULL      Red      0      2139
Rafal      Dudek      0      0      0      10.11.2014      0      Htr6 (586)      NULL      Red      0      2140
Rafal      Dudek      0      0      7.5      11.11.2014      7.5      Htr6 (586)      NULL      Red      0      2141
Rafal      Dudek      0      0      7.5      12.11.2014      7.5      Htr6 (586)      NULL      Red      0      2142
Rafal      Dudek      0      0      7.5      13.11.2014      7.5      Htr6 (586)      NULL      Red      0      2143
Rafal      Dudek      0      0      7.5      14.11.2014      7.5      Htr6 (586)      NULL      Red      0      2144

or TABLE B:
586 - Heater 6      Morning      Rafal Dudek      02.11.2014      0      Rafal      Dudek      NULL      506
586 - Heater 6      Morning      Rafal Dudek      01.11.2014      0      Rafal      Dudek      NULL      505
586 - Heater 6      Afternoon      Rafal Dudek      03.11.2014      7.5      Rafal      Dudek      NULL      2621
0
 

Author Comment

by:SQLSearcher
ID: 40478397
Hi Vitor
These should be the ones that return.
TableA
Rafal      Dudek      0      0      7.5      04.11.2014      7.5      Htr6 (586)      NULL      Red      0      2134
 Rafal      Dudek      0      0      7.5      05.11.2014      7.5      Htr6 (586)      NULL      Red      0      2135
 Rafal      Dudek      0      0      7.5      06.11.2014      7.5      Htr6 (586)      NULL      Red      0      2136
 Rafal      Dudek      0      0      7.5      07.11.2014      7.5      Htr6 (586)      NULL      Red      0      2137
 Rafal      Dudek      0      0      0      08.11.2014      0      Htr6 (586)      NULL      Red      0      2138
 Rafal      Dudek      5      0      0      09.11.2014      5      Htr6 (586)      NULL      Red      0      2139
 Rafal      Dudek      0      0      0      10.11.2014      0      Htr6 (586)      NULL      Red      0      2140
 Rafal      Dudek      0      0      7.5      11.11.2014      7.5      Htr6 (586)      NULL      Red      0      2141
 Rafal      Dudek      0      0      7.5      12.11.2014      7.5      Htr6 (586)      NULL      Red      0      2142
 Rafal      Dudek      0      0      7.5      13.11.2014      7.5      Htr6 (586)      NULL      Red      0      2143
 Rafal      Dudek      0      0      7.5      14.11.2014      7.5      Htr6 (586)      NULL      Red      0      2144

Regards

SQLSearcher
0
 
LVL 18

Assisted Solution

by:SimonAdept
SimonAdept earned 250 total points
ID: 40478405
SELECT 
TableA.MergeID, TableA.Date
, TableA.FirstName
, TableA.Surname
, TableA.Attendance
, TableB.Hours
FROM TableA LEFT JOIN TableB ON (TableA.FirstName = TableB.[First Name]) AND (TableA.Surname = TableB.Surname) AND (TableA.[Date] = TableB.[Date])
WHERE (((TableB.Date) Is Null));

Open in new window


Returns 39 records. 8 of those have more than zero hours in the attendance column

This query gets all cols from TableA where no record in TableB and > 0 in attendance col
SELECT TableA.*, TableA.Attendance
FROM TableA LEFT JOIN TableB ON (TableA.FirstName = TableB.[First Name]) AND (TableA.Surname = TableB.Surname) AND (TableA.[Date] = TableB.[Date])
WHERE (((TableB.Date) Is Null) AND ((TableA.Attendance)>0));

Open in new window

0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 250 total points
ID: 40478420
But in your main post you have the following criteria:
a.[TimeCal] > 0 and b.[Hours] is null
Since you don't have null values in b.Hours that's why is returning more that expected. If you don't want to have those filter you can't remove them, like this:
Select a.[First Name],a.[Surname], a.[Clock Number],a.[Cost Centre], a.[shift],
		Convert(varchar,a.[Date],103) as [Date], a.[TimeCal] as [Losses Attendance Hours]
from [dbo].[T&A_Temp] a
Where Not Exists (SELECT 1 
		FROM [dbo].[T&A_Temp_Primetime] b 
		WHERE a.[First Name]= b.[First Name] and a.surname = b.surname and a.[date] = b.[date])
order by a.[Cost Centre],a.[shift]

Open in new window

0
 

Accepted Solution

by:
SQLSearcher earned 0 total points
ID: 40478642
Hello Guys
Thank you for the queries, after trying them I realised that it is duplicate records that are my problem. I have now found a solution, so will give you both a share of the points.

Regards

SQLSearcher
0
 

Author Closing Comment

by:SQLSearcher
ID: 40486385
The question process helped me to find the issue.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
MS SQL 2016 from Database to Datawarehouse 6 37
separate column 24 21
Mssql SQL query 14 28
Error in query 2 0
This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now