bfuchs
asked on
Query comparing two tables with different format needed.
Hi Experts,
I need to compare two files containing same data with different format.
They have following columns
Patients Name
Caregiver name
Visit Date
Visit time from
Visit Time to
would need to get a list of all unmatched records.
its only considered a match if all 5 fields are the same.
Attached sample of those files, where record in both files is considered a match.
Here are the differences, one of the files has the following
1- first and last name concatenated into one field.
2- has data in parenthesis after the names which should be ignored.
3- format of time is 1459-2304
Visit-Report.xls
V_TestHHASchedule.xls
I need to compare two files containing same data with different format.
They have following columns
Patients Name
Caregiver name
Visit Date
Visit time from
Visit Time to
would need to get a list of all unmatched records.
its only considered a match if all 5 fields are the same.
Attached sample of those files, where record in both files is considered a match.
Here are the differences, one of the files has the following
1- first and last name concatenated into one field.
2- has data in parenthesis after the names which should be ignored.
3- format of time is 1459-2304
Visit-Report.xls
V_TestHHASchedule.xls
ASKER
hi,
currently data is coming from 3rd party in Excel format, can import anywhere...
would prefer Access if possible...
1459-2304 is HHMM-HHMM
Thanks,
Ben
currently data is coming from 3rd party in Excel format, can import anywhere...
would prefer Access if possible...
1459-2304 is HHMM-HHMM
Thanks,
Ben
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
@PortletPaul,
Wonder why I did not get an email notifying your post, neither did it showed up under notifications, perhaps something is malfunctioning in EE.
Currently I use SQL 2008, what entails to convert it to that version?
Thanks,
Ben
Wonder why I did not get an email notifying your post, neither did it showed up under notifications, perhaps something is malfunctioning in EE.
Along these lines in SQL Server 2012 or later.So you mean its not compatible with earlier versions of SQL?
Currently I use SQL 2008, what entails to convert it to that version?
Thanks,
Ben
the format() function was introduced in SQL2012 (which is 6 years ago already)
Instead of FORMAT() you have to use CONVERT() instead which isn't that simple to do I'm afraid, but its because your db version is old.
see: https://www.experts-exchange.com/articles/12315/SQL-Server-Date-Styles-formats-using-CONVERT.html?headerLink=workspace_article
Instead of FORMAT() you have to use CONVERT() instead which isn't that simple to do I'm afraid, but its because your db version is old.
see: https://www.experts-exchange.com/articles/12315/SQL-Server-Date-Styles-formats-using-CONVERT.html?headerLink=workspace_article
For the string columns, I would use CHARINDEX() in SQL or INSTR() in Access and check if those columns (from V_TestHHASchedule) EXISTS in the VISITS-REPORT WHERE charindex(Patients_Firstna me,Patient name) > 0 and charindex(Patients_Lastnam e,Patientn ame) > 0 (etc).
Access has INSTR but (more logically) the search string is first and the item being searched for is second ie INSTR(patientname,patients _firstname ) > 0 and INSTR(patientname,patients _lastname) > 0 (etc)
Same again for caregiver (checking if it is contained within Aidename) and any other string.
Now for VisitDate, I would use datediff() e.g. in SQL Server : datediff(d,Schedule_VisitD ate,visitd ate) = 0
In Access : datediff("d",Schedule_Visi tDate,visi tdate) = 0
So that leaves us with the time....
And going to have to convert to the VisitTime format
In Access : Format(Schedule_VisitStart Time,"hhnn ")+"-"+For mat(Schedu le_VisitEn dTime,"hhn n") = VisitTime
and while we can use Format in SQL Server - it only happened from SQL 2012 and is basically the same as Access (use single quotes)
In SQL2008 : replace(convert(varchar(5) ,Schedule_ VisitStart Time,108)+ '-'+conver t(varchar( 5),Schedul e_VisitEnd Time,108), ':','')
So, the basic approach in either SQL Server or Access will be :
Select *
from V_TestHHASchedule V
WHERE EXISTS (SELECT NULL FROM VISITS-REPORT R WHERE <the above conditions hold true>)
and that will show rows from V_TestHHASchedule that are also in Visits_Report
to show non-matching, then use : NOT EXISTS
Does that make sense ?
Access has INSTR but (more logically) the search string is first and the item being searched for is second ie INSTR(patientname,patients
Same again for caregiver (checking if it is contained within Aidename) and any other string.
Now for VisitDate, I would use datediff() e.g. in SQL Server : datediff(d,Schedule_VisitD
In Access : datediff("d",Schedule_Visi
So that leaves us with the time....
And going to have to convert to the VisitTime format
In Access : Format(Schedule_VisitStart
and while we can use Format in SQL Server - it only happened from SQL 2012 and is basically the same as Access (use single quotes)
In SQL2008 : replace(convert(varchar(5)
So, the basic approach in either SQL Server or Access will be :
Select *
from V_TestHHASchedule V
WHERE EXISTS (SELECT NULL FROM VISITS-REPORT R WHERE <the above conditions hold true>)
and that will show rows from V_TestHHASchedule that are also in Visits_Report
to show non-matching, then use : NOT EXISTS
Does that make sense ?
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
@Mark,
Just testing your SQL version, and no records comes up.
when changing it to not exists then all records come up.
also we must use the not exists as I'm looking for the unmatched ones...
Thanks,
Ben
Just testing your SQL version, and no records comes up.
when changing it to not exists then all records come up.
also we must use the not exists as I'm looking for the unmatched ones...
Thanks,
Ben
ASKER
@Mark,
The time conversion is definitely a not functioning properly.
see below
11/17/2018 7:15:00 AM 11/17/2018 7:00:00 PM 11/17-11/17
11/2/2018 7:00:00 AM 11/2/2018 9:00:00 AM 11/2/-11/2/
11/2/2018 2:00:00 PM 11/2/2018 7:00:00 PM 11/2/-11/2/
11/9/2018 7:15:00 AM 11/9/2018 9:00:00 AM 11/9/-11/9/
11/9/2018 1:45:00 PM 11/9/2018 7:15:00 PM 11/9/-11/9/
11/9/2018 9:00:00 AM 11/9/2018 1:45:00 PM 11/9/-11/9/
11/3/2018 7:00:00 AM 11/3/2018 7:00:00 PM 11/3/-11/3/
11/8/2018 7:00:00 AM 11/8/2018 2:00:00 PM 11/8/-11/8/
11/8/2018 2:00:00 PM 11/8/2018 7:00:00 PM 11/8/-11/8/
11/2/2018 9:00:00 AM 11/2/2018 2:00:00 PM 11/2/-11/2/
11/16/2018 7:15:00 AM 11/16/2018 9:00:00 AM 11/16-11/16
Thanks,
Ben
The time conversion is definitely a not functioning properly.
see below
SELECT V.Schedule_VisitStartTime,V.Schedule_VisitEndTime,replace(convert(varchar(5),V.Schedule_VisitStartTime,108)+'-'+convert(varchar(5),V.Schedule_VisitEndTime,108),':','')
FROM V_TestHHASchedule AS V
11/10/2018 7:00:00 AM 11/10/2018 7:00:00 PM 11/10-11/1011/17/2018 7:15:00 AM 11/17/2018 7:00:00 PM 11/17-11/17
11/2/2018 7:00:00 AM 11/2/2018 9:00:00 AM 11/2/-11/2/
11/2/2018 2:00:00 PM 11/2/2018 7:00:00 PM 11/2/-11/2/
11/9/2018 7:15:00 AM 11/9/2018 9:00:00 AM 11/9/-11/9/
11/9/2018 1:45:00 PM 11/9/2018 7:15:00 PM 11/9/-11/9/
11/9/2018 9:00:00 AM 11/9/2018 1:45:00 PM 11/9/-11/9/
11/3/2018 7:00:00 AM 11/3/2018 7:00:00 PM 11/3/-11/3/
11/8/2018 7:00:00 AM 11/8/2018 2:00:00 PM 11/8/-11/8/
11/8/2018 2:00:00 PM 11/8/2018 7:00:00 PM 11/8/-11/8/
11/2/2018 9:00:00 AM 11/2/2018 2:00:00 PM 11/2/-11/2/
11/16/2018 7:15:00 AM 11/16/2018 9:00:00 AM 11/16-11/16
Thanks,
Ben
ASKER
Thanks experts,
I ended up using a combination of both.
Thank you!
I ended up using a combination of both.
SELECT V.*
FROM V_TestHHASchedule AS V
WHERE not Exists (SELECT NULL FROM [VisitReport] R
--WHERE charindex(V.patients_firstname,R.pat ientname) > 0
where RTRIM( LEFT( r.PatientName, CHARINDEX( ' (', r.PatientName + ' (' ) ) ) = RTRIM( LEFT( v.patients_lastname, CHARINDEX( ' (', v.patients_lastname + ' (' ) ) ) + ' ' + RTRIM( LEFT( v.patients_firstname, CHARINDEX( ' (', v.patients_firstname + ' (' ) ) )
and RTRIM( LEFT( r.aideName, CHARINDEX( ' (', r.aideName + ' (' ) ) ) = RTRIM( LEFT( v.caregivers_lastname, CHARINDEX( ' (', v.caregivers_lastname + ' (' ) ) ) + ' ' + RTRIM( LEFT( v.caregivers_firstname, CHARINDEX( ' (', v.caregivers_firstname + ' (' ) ) )
and datediff(d,V.Schedule_VisitDate,R.vi sitdate) = 0
and replace(convert(varchar(5),V.Schedul e_VisitSta rtTime,108 )+'-'+conv ert(varcha r(5),V.Sch edule_Visi tEndTime,1 08),':','' ) = R.scheduled
Thank you!
ASKER
Hi Experts,
One fix if you dont mind...
The file V_TestHHASchedule when its midnight it comes with only a date portion without the times.
How can that be converted to 00:00 so it will match the other file?
Thanks,
Ben
One fix if you dont mind...
The file V_TestHHASchedule when its midnight it comes with only a date portion without the times.
How can that be converted to 00:00 so it will match the other file?
Thanks,
Ben
It all depends on datatypes.
I too found that the time portion was mucking up after importing the spreadsheet.
Found that it was coming in as varchar. A simple change to the structure of the table (keeping data in place) :
It does depend a bit on how that table is structured and populated.
This is how I imported :
Hope that helps....
I too found that the time portion was mucking up after importing the spreadsheet.
Found that it was coming in as varchar. A simple change to the structure of the table (keeping data in place) :
alter table V_TestHHASchedule alter column schedule_visitdate datetime
alter table V_TestHHASchedule alter column schedule_visitstarttime datetime
alter table V_TestHHASchedule alter column schedule_visitendtime datetime
And everthing worked as expected....It does depend a bit on how that table is structured and populated.
This is how I imported :
set dateformat MDY
select * into V_TestHHASchedule
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0', 'Excel 12.0 Xml;Database=c:\mrwtest\ee\V_TestHHASchedule.xls;HDR=yes', 'SELECT * FROM [V_TestHHASchedule$]') as a
go
select * from INFORMATION_SCHEMA.columns where table_name = 'V_TestHHASchedule'
go
alter table V_TestHHASchedule alter column schedule_visitdate datetime
alter table V_TestHHASchedule alter column schedule_visitstarttime datetime
alter table V_TestHHASchedule alter column schedule_visitendtime datetime
go
By way of comparison, look at what happens when convert works with a string vs a datetime :
select replace(convert(varchar(5),'2018-12-12 07:08:09.123',108)+'-'+convert(varchar(5),'2018-12-12 08:09:10.123',108),':','')
select replace(convert(varchar(5),cast('2018-12-12 07:08:09.123' as datetime),108)+'-'+convert(varchar(5),cast('2018-12-12 08:09:10.123' as datetime),108),':','')
In the string date, the STYLE CODE of 108 basically tells Convert how to read the string. When it is a datetime, the style code of 108 tells converrt how to format the datetime. Hope that helps....
Life is simpler when the format() function becomes available.
(Hint Hint)
(Hint Hint)
In another thread....
Ben has said they have to comply with the lowest common denominator - SQL 2008
It was strongly suggested to plan upgrades to SQL 2012 or more recent.
There is so much more to offer in SQL2012+
And by the way, if it was a varchar column, then the format(myvarchar,'date/tim e specifiers') would fail in SQL Server.
Ben has said they have to comply with the lowest common denominator - SQL 2008
It was strongly suggested to plan upgrades to SQL 2012 or more recent.
There is so much more to offer in SQL2012+
And by the way, if it was a varchar column, then the format(myvarchar,'date/tim
ASKER
Hi Experts,
See following
https://www.experts-exchange.com/questions/29129307/Tips-to-figure-out-query-results.html#questionAdd
Thanks,
Ben
See following
https://www.experts-exchange.com/questions/29129307/Tips-to-figure-out-query-results.html#questionAdd
Thanks,
Ben
i.e. which database syntax do you want for the comparison?
what date/time is "1459-2304" in yyyy-mm-dd hh:mm format ? (i.e. how is this compared to a normal date/time?)