troubleshooting Question

What are the best practices for placement of SQL when checking for duplicates? WHERE clause JOINS, beginning or end?

Avatar of RUA Volunteer2?
RUA Volunteer2?Flag for United States of America asked on
Microsoft AccessMicrosoft SQL ServerMicrosoft Server OSSSRSSQL
8 Comments4 Solutions43 ViewsLast Modified:
Where do you put a checking for and removing duplicates query when you need to remove all extras. The goal is to check name, date of birth, address of person.  Then to make sure that I am getting only the last person from their most recent visit or date time stamp occurrence. (We'll call it their most recent date of service) or in this case "s._serviced_date"
I think it will look something like this
Select Name, Address, Date of Birth From Personnel
Where Max(Service_Date) ----Do I need any date formatting here or just the field?
Group By Name, Address, Date of Birth
Having Count(*) >1

Please feel free to check that. Especially If MAX() has to have special date format. May not need any?
THE BIG QUESTION HERE IS WHERE DO I PLACE IT AND WHY. WHAT ARE THE BEST PRACTICES OR WHAT IS GOOD POLICY TO DO IT ONE WAY OVER ANOTHER? the following is a mockup of the query. 

declare @from_dob date = dateadd(year,-45,'20210101')
declare @to_dob date = dateadd(year,-40,'20211231')
declare @research_date date = '20150101'
declare @region varchar(100)
        select @region = dbo.z_bi_get_region(db_name())
 
select 
@region as region
, personnel_id
, last_name
, first_name
, gender_code
, birth_date
,datediff(year,birth_date,cast(getdate() as date)) as age_at_run_date
, mobile_phone_number
, home_phone_number
, primary_email_address
, primary_address_line1
, primary_address_line2
, primary_address_city
, primary_address_state
, primary_address_zipcode
, death_flag 
 ,isnull(s1.disease_found,'N') as disease_found
 ,@from_dob as from_DOB, @to_dob as to_DOB, @research_date as research_date
 
from
   c_personnel pers
   outer apply
   (
     select top 1 'Y' as disease_found from c_research s
       inner join c_research_item si on s.research_key=si.research_key and si.active_flag='Y'
       inner join l_procedure_code pc on si.procedure_code=pc.procedure_code 
       inner join l_disease_type dt on pc.disease_type_code=mt.disease_type_code and dt.disease_type_code in ('Cancer','Heart','Kidney',’Lung’,'Accident',’Neurological’)
       inner join l_location st on s._location_code=st.location_code
     where 
       pers.personnel_key=s.personnel_key
       and s._serviced_date  >= @research_date
       AND st.location_code NOT IN ('HP','CLNK') AND st.practice_code <> 'DocNABox'
   ) as s1
where 
   pers.birth_date >= @from_dob 
   and pers.birth_date < dateadd(d,1,@to_dob)
   and pers.master_personnel_key is null
   AND pers.issuer_of_personnel_id = 'system' 
   and pers.gender_code='M'
   and isnull(pers.death_flag,'N')='N'
   and isnull(s1.disease_found,'N')='N'
  order by datediff(year,birth_date,cast(getdate() as date))
THE BIG QUESTION HERE IS WHERE DO I PLACE IT AND WHY. WHAT ARE THE BEST PRACTICES OR WHAT IS GOOD POLICY TO DO IT ONE WAY OVER ANOTHER? the following is a mockup of the query. 


ASKER CERTIFIED SOLUTION
Join our community to see this answer!
Unlock 4 Answers and 8 Comments.
Start Free Trial
Learn from the best

Network and collaborate with thousands of CTOs, CISOs, and IT Pros rooting for you and your success.

Andrew Hancock - VMware vExpert
See if this solution works for you by signing up for a 7 day free trial.
Unlock 4 Answers and 8 Comments.
Try for 7 days

”The time we save is the biggest benefit of E-E to our team. What could take multiple guys 2 hours or more each to find is accessed in around 15 minutes on Experts Exchange.

-Mike Kapnisakis, Warner Bros