• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 353
  • Last Modified:

how to change sql query not to pull duplicates

I have a view that is checking to see if patients are on certain meds but I only want to show each patient once. If they are on multiple meds I only want to include the one that has the most recent pm.start_date.

ALTER view [dbo].[meds]
as

SELECT p.last_name, p.first_name, p.person_id, pe.enc_id, pe.enc_timestamp,pm.start_date, pm.date_stopped, pm.date_last_refilled,
fm.brand_name, fm.generic_name, fm.dose    
FROM patient_medication pm with (nolock)
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))
0
biotec
Asked:
biotec
  • 13
  • 12
  • 5
  • +2
2 Solutions
 
ralmadaCommented:
You could try like this:
 

;with CTE as (
SELECT 	p.last_name, 
	p.first_name, 
	p.person_id, 
	pe.enc_id, 
	pe.enc_timestamp,
	pm.start_date, 
	pm.date_stopped, 
	pm.date_last_refilled, 
	fm.brand_name, 
	fm.generic_name, 
	fm.dose    
FROM patient_medication pm with (nolock) 
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE 
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN')) 
)
select a.* from CTE a
where a.start_date = (select max(start_date) from CTE where person_id = a.person_id)

Open in new window

0
 
HainKurtSr. System AnalystCommented:
select * from (
SELECT p.last_name, p.first_name, p.person_id, pe.enc_id, pe.enc_timestamp,pm.start_date, pm.date_stopped, pm.date_last_refilled,
fm.brand_name, fm.generic_name, fm.dose,
row_number() OVER (PARTITION BY  person_id order by pm.start_date desc) as rn
FROM patient_medication pm with (nolock)
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))
) x where rn=1


more infor for row_number function: http://msdn.microsoft.com/en-us/library/ms186734.aspx
0
 
biotecAuthor Commented:
Interesting. Thanks for the quick response. It doesn't like something near the end with the x where rn=1
0
Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

 
ralmadaCommented:
Have you tried mine?
0
 
js-profiCommented:
add distinct keyword for pm.person_id and add

  and pm.start_date = (select max(start_date) from patient_medication)

to where clause as already suggested by ralmada.

the distinct prevents from getting the same patient twice if changing med same day.
   
0
 
biotecAuthor Commented:
I'm sorry ralmada, somehow I went right past that. I'm going to try it right now.
0
 
ralmadaCommented:
By the way you have a couple of brackets issue in your original query, here is corrected one:
;with CTE as ( 
SELECT  p.last_name,  
        p.first_name,  
        p.person_id,  
        pe.enc_id,  
        pe.enc_timestamp, 
        pm.start_date,  
        pm.date_stopped,  
        pm.date_last_refilled,  
        fm.brand_name,  
        fm.generic_name,  
        fm.dose     
FROM patient_medication pm with (nolock)  
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id 
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id 
inner join person p on pe.person_id = p.person_id 
WHERE  

 (pm.date_stopped = '' OR pm.date_stopped is null) or 
 (pm.date_stopped > getdate()) and 
 (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))  
) 
select a.* from CTE a 
where a.start_date = (select max(start_date) from CTE where person_id = a.person_id)

--- Using row_number

select * from (
SELECT p.last_name, p.first_name, p.person_id, pe.enc_id, pe.enc_timestamp,pm.start_date, pm.date_stopped, pm.date_last_refilled, 
fm.brand_name, fm.generic_name, fm.dose,
row_number() OVER (PARTITION BY  person_id order by pm.start_date desc) as rn
FROM patient_medication pm with (nolock) 
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE 
(pm.date_stopped = '' OR pm.date_stopped is null) or 
(pm.date_stopped > getdate()) and 
(fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN')) 

) x where rn=1

Open in new window

0
 
biotecAuthor Commented:
Something got cut off. It starts like this:

;with CTE as (
0
 
ralmadaCommented:
Well since you already have a view created you can just query it like this:

select * from meds a
where a.start_date = (select max(start_date) from meds where person_id = a.person_id)

Open in new window

0
 
ralmadaCommented:
If you want to alter your view then drop the ";" in my example
alter view dbo.meds as
with CTE as (  
SELECT  p.last_name,   
        p.first_name,   
        p.person_id,   
        pe.enc_id,   
        pe.enc_timestamp,  
        pm.start_date,   
        pm.date_stopped,   
        pm.date_last_refilled,   
        fm.brand_name,   
        fm.generic_name,   
        fm.dose      
FROM patient_medication pm with (nolock)   
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id  
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id  
inner join person p on pe.person_id = p.person_id  
WHERE   
 
 (pm.date_stopped = '' OR pm.date_stopped is null) or  
 (pm.date_stopped > getdate()) and  
 (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))   
)  
select a.* from CTE a  
where a.start_date = (select max(start_date) from CTE where person_id = a.person_id)

Open in new window

0
 
HainKurtSr. System AnalystCommented:
whats the issue here?
SELECT *
FROM   (SELECT p.last_name,
               p.first_name,
               p.person_id,
               pe.enc_id,
               pe.enc_timestamp,
               pm.start_date,
               pm.date_stopped,
               pm.date_last_refilled,
               fm.brand_name,
               fm.generic_name,
               fm.dose,
               row_number() OVER (PARTITION BY person_id ORDER BY pm.start_date DESC) AS rn
        FROM   patient_medication AS pm WITH (NOLOCK) INNER JOIN
               fdb_medication AS fm WITH (NOLOCK)
               ON pm.ndc_id = fm.ndc_id INNER JOIN
               patient_encounter AS pe WITH (NOLOCK)
               ON pm.enc_id = pe.enc_id INNER JOIN
               person AS p
               ON pe.person_id = p.person_id
        WHERE  (pm.date_stopped = ''
                OR pm.date_stopped IS NULL)
               OR (pm.date_stopped > getdate())
                  AND (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))) AS x
WHERE  rn = 1;

Open in new window

0
 
HainKurtSr. System AnalystCommented:
ralmada, the one you posted @ 26192507 should be very slow compared to the one @ 26192552. In your solution for every record it needs to run a max query, say it returns 1000 records, your solution will run 1000 sub q + 1 main = 1001 query...
0
 
ralmadaCommented:
@Hainkurt, There were some problems with the brackets as I stated in my comment  26192393. But I would suggest, since the asker already has a view, just to use it like I've indicated in my comment 26192490.
Btw, using row_number instead of derived queries as I suggested is slower.
0
 
ralmadaCommented:
@Hainkurt, in response to your comment 26192582
using row_number for this cases is way more slower! I would suggest you to read this article:
 http://sqlservernation.com/blogs/brandongalderisi/archive/2009/09/20/using-row-number-vs-a-correlated-sub-select.aspx
0
 
biotecAuthor Commented:
Thanks. When I ran my original I got about 134 rows of which about 11 persons showed up in two places. When I do a select on this I get almost 2,000 rows.

I'm very confused on the as
with CTE as at the beginning as well as how you added the select a.* portion at the end. I don't know a lot about sql but this is definitely not something I've seen before.
0
 
ralmadaCommented:
Like I said in comment http:#a26192490, query you original view like this:
 select * from meds a  where a.start_date = (select max(start_date) from meds where person_id = a.person_id)
See your original view code corrected attached below:

ALTER view [dbo].[meds]
as 

SELECT p.last_name, p.first_name, p.person_id, pe.enc_id, pe.enc_timestamp,pm.start_date, pm.date_stopped, pm.date_last_refilled, 
fm.brand_name, fm.generic_name, fm.dose    
FROM patient_medication pm with (nolock) 
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE 
(pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate())
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))

Open in new window

0
 
HainKurtSr. System AnalystCommented:
whats happening on this query and how many you get?
SELECT * 
FROM   (SELECT p.last_name, 
               p.first_name, 
               p.person_id, 
               pe.enc_id, 
               pe.enc_timestamp, 
               pm.start_date, 
               pm.date_stopped, 
               pm.date_last_refilled, 
               fm.brand_name, 
               fm.generic_name, 
               fm.dose, 
               row_number() OVER (PARTITION BY person_id ORDER BY pm.start_date DESC) AS rn 
        FROM   patient_medication AS pm WITH (NOLOCK) INNER JOIN 
               fdb_medication AS fm WITH (NOLOCK) 
               ON pm.ndc_id = fm.ndc_id INNER JOIN 
               patient_encounter AS pe WITH (NOLOCK) 
               ON pm.enc_id = pe.enc_id INNER JOIN 
               person AS p 
               ON pe.person_id = p.person_id 
        WHERE  (pm.date_stopped = '' OR pm.date_stopped IS NULL) 
               OR (pm.date_stopped > getdate()) 
               AND (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))
       ) AS x 
WHERE  rn = 1;

Open in new window

0
 
js-profiCommented:
biotec, did you ever check your statement posted? it has not enough closed parantheses at end of where clause.
SELECT p.last_name, p.first_name, distinct pm.person_id, pe.enc_id, pe.enc_timestamp,pm.start_date, pm.date_stopped, pm.date_last_refilled, 
fm.brand_name, fm.generic_name, fm.dose    
FROM patient_medication pm with (nolock) 
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE 
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))) 
and pm.start_date = (select max(pm2.start_date) from patient_medication pm2 where pm2.person_id = pm.person_id)

Open in new window

0
 
biotecAuthor Commented:
I'm confused. That select query against my original still returns over 1,800 rows. Your post 26192507
doesn't run either so I'm not sure. I'm hoping to pull a view for Crystal without having to run a query against the view.

HainKurt when I run yours I get over 700 rows so that it not working either. Thanks for your help everyone I think we're getting closer.
0
 
biotecAuthor Commented:
js-profi when I run yours I get:

Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'distinct'.
Msg 319, Level 15, State 1, Line 3
Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon.
Msg 319, Level 15, State 1, Line 4
Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon.
Msg 319, Level 15, State 1, Line 5
Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon.
0
 
ralmadaCommented:
What kind of data do you have in start_date? only dates or date and time?
0
 
ralmadaCommented:
If you have dates only, then you will have to end up with row_number despite the performance issue I've mentioned above.
But I would add and alias to the column being partitioned like this:
... row_number() OVER (PARTITION BY p.person_id ORDER BY pm.start_date DESC) AS rn  .....
 

SELECT *  
FROM   (SELECT p.last_name,  
               p.first_name,  
               p.person_id,  
               pe.enc_id,  
               pe.enc_timestamp,  
               pm.start_date,  
               pm.date_stopped,  
               pm.date_last_refilled,  
               fm.brand_name,  
               fm.generic_name,  
               fm.dose,  
               row_number() OVER (PARTITION BY p.person_id ORDER BY pm.start_date DESC) AS rn  
        FROM   patient_medication AS pm WITH (NOLOCK) INNER JOIN  
               fdb_medication AS fm WITH (NOLOCK)  
               ON pm.ndc_id = fm.ndc_id INNER JOIN  
               patient_encounter AS pe WITH (NOLOCK)  
               ON pm.enc_id = pe.enc_id INNER JOIN  
               person AS p  
               ON pe.person_id = p.person_id  
        WHERE  (pm.date_stopped = '' OR pm.date_stopped IS NULL)  
               OR (pm.date_stopped > getdate())  
               AND (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN')) 
       ) AS x  
WHERE  rn = 1;

Open in new window

0
 
js-profiCommented:
i used an oracle dbms for the distinct. don't know correct syntax for access sql. distinct pm.person_id would suppress second row with same person_id. You also could try distinct p.person_id.
the 'with' is from your select posted. don't know what it's purpose. i hate those propertiary syntax rules.
0
 
biotecAuthor Commented:
Somewhere in all of this my parens where changed on the AND (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN') or something and now it is pulling all generics. In other words what I had at the beginning worked and now it seems that the most recent query may have worked except the 700 plus rows returned included records that don't belong.
0
 
biotecAuthor Commented:
Thanks js-profi. Yes I don't think you can do distinct like that. There were some other issues as well.
0
 
ralmadaCommented:
have you read my last two comments? 26193034 and 26193070?
0
 
SharathData EngineerCommented:
HainKurt and Ralmada directing you in right way. if you still not getting your required result, you are missing something. Post some sample data and expected result.
0
 
biotecAuthor Commented:
When running the most recent version of the query and putting back in my original where statement the dups go away and it does return what I think it should. The question is what you found in my original statement that you thought was wrong. Just because the numbers look ok, doesn't mean I didn't have a mistake in my query. Thanks

SELECT *  
FROM   (SELECT p.last_name,  
               p.first_name,  
               p.person_id,  
               pe.enc_id,  
               pe.enc_timestamp,  
               pm.start_date,  
               pm.date_stopped,  
               pm.date_last_refilled,  
               fm.brand_name,  
               fm.generic_name,  
               fm.dose,  
               row_number() OVER (PARTITION BY p.person_id ORDER BY pm.start_date DESC) AS rn  
        FROM patient_medication pm with (nolock)
INNER JOIN fdb_medication fm with (nolock)  ON pm.ndc_id = fm.ndc_id
INNER JOIN patient_encounter pe with (nolock)  on pm.enc_id = pe.enc_id
inner join person p on pe.person_id = p.person_id
WHERE --pm.person_id = p.person_id
----and (pe.enc_timestamp BETWEEN @b and @e)
--( pm.start_date <= @eNG and
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate()))--  or ltrim(rtrim(isnull(pm.date_stopped,'')))  = '')
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))
       ) AS x  
WHERE  rn = 1;
0
 
HainKurtSr. System AnalystCommented:
"If they are on multiple meds I only want to include the one that has the most recent pm.start_date"

can you please explain this a bit more...

maybe you should modify this line

row_number() OVER (PARTITION BY p.person_id ORDER BY pm.start_date DESC) AS rn
-->
row_number() OVER (PARTITION BY p.person_id, fm.generic_name ORDER BY pm.start_date DESC) AS rn

how many records you get from your original query/view and how many you should get?
0
 
biotecAuthor Commented:
Another question on the query where is uses AS x....why did you have to use the AS x if you don't call it anywhere else? I'll split points on this.
0
 
ralmadaCommented:
Your original where clause was like this:
((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))  
which will give you syntax error. We updated it like this:

(pm.date_stopped = '' OR pm.date_stopped IS NULL)
OR (pm.date_stopped > getdate())
AND (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))  
As you can se I've removed the first bracket and added some missing brackets after getdate..
0
 
ralmadaCommented:
>>Another question on the query where is uses AS x....why did you have to use the AS x if you don't call it anywhere else? I'll split points on this. <<
You need the alias there in order for the subquery to work properly. It's the way SQL syntax is constructed
select * from
(
......
) a -- Need an alias here in order for it to work.
0
 
biotecAuthor Commented:
Ahh. Yes, you are correct, when I copied it I missed the parens after getdate, however they are there. Something happened though when I started copying what people changed. So for now I think if you think the below looks correct then I should be fine.
WHERE ((pm.date_stopped = '' OR pm.date_stopped is null)or (pm.date_stopped > getdate()))'
and (fm.generic_name IN ('HYDROCODONE BIT/ACETAMINOPHEN', 'OXYCODONE HCL/ACETAMINOPHEN'))
0
 
biotecAuthor Commented:
Thanks for quick and thorough solutions.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 13
  • 12
  • 5
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now