• Status: Solved
• Priority: Medium
• Security: Public
• Views: 295

# Issue on Join Query - SQL

SQL 2008 / 2005

On executing  Table : TransactionReport -> Total # of rows affected is 671

After Joining with Plntrak PT : - Through Column : Rxnbr

I am getting 713 Rows.

Expected output is only 671. Where does this duplication occurs ? How to find these duplicate rows.
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
Left Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
``````
0
chokka
• 4
• 3
• 2
• +3
5 Solutions

Commented:
The duplication is due to there being more than one row in Plntrak which matches a row on TransactionReport
e.g.
TR.RxNbr            PT.RxNbr
0                         0                   unique match
1                                              no match
2                         2
3                         2                   non-unique matches
2
3                   unique match

To find the duplicates you need to join the tables and do a count where the match is greater than 1

``````SELECT TR.RxNbr
FROM [TransactionReport] TR
Left Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
GROUP BY TR.RxNbr
HAVING COUNT(*) >1
``````
0

Commented:
Left join will get you all the records in the Report table

``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
inner Join Plntrak PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
``````
0

Commented:
Left join will get you all the records in the Report table<<
I am afraid that is not correct based on the definition of a LEFT JOIN.  Let me know if you want me to elaborate.
0

Commented:
Actually I misread what you stated.  But while it is true that it will return all the rows in the TransactionReport table, that is not the problem.  We know that there are 617 rows in that table.  The problem is that as suggested before (http:#a35047455) there is "more than one row in Plntrak which matches a row on TransactionReport"
0

Author Commented:
ewangoya:- I am confused on your suggestion.

For the Table : TransactionReport we need " PT.Rebill " Value for all the RxNbr which is there in Transaction Report.  Suppose if Plntrak PT  doesnt have the matching RxNbr, we just need atleast NULL Value for the Column PT.Rebill.

So, i decided to go for Left Join.

If i use Inner Join - I believe only matching RxNbr rows will be fetched.

0

Data EngineerCommented:
As kraiven asked, do you have duplicates in Plntrak  table? If so, how do you want one record for RxNbr? Do you have any date field/ unique id to identify one record?
0

Commented:
chokka,

You asked how to find the duplicates originally - well the grouping select I posted will give you a list of all the rows in PlnTrak which have more than one row for a given RxNbr. In my solution I used you WHERE clause with its LEFT JOIN but an INNER JOIN would have sufficed.

``````SELECT TR.RxNbr
FROM [TransactionReport] TR
INNER Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
GROUP BY TR.RxNbr
HAVING COUNT(*) >1
``````
0

Commented:

@acperkins
You are right

My apologies, I misunderstood the question.

Now it all depends on what the author wants to do in case of multiple records in the Plntrak table
You can select a distinct record or top most record

Here is a suggestion to try
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill
from [TransactionReport] TR
left Join (select distinct RxNbr
from Plntrak
where  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0

Commented:
ewangoya,

I'm not sure your SQL above would even execute. The outer SELECT requires PT.Rebill in it's return data but your PT has only RxNbr and you reference the "PT" alias, in the inner WHERE clause, before it is even defined.
Besides, I think chokka just wants to know why there are more than 671 rows returned, and how to find out what the extra rows are.

Perhaps some input from chokka would help?
0

Commented:
can you try this:

select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
inner Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
where PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
0

Commented:
You cant use inner join
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees],
PT.Rebill
from [TransactionReport] TR
left Join (select MAX(RxNbr) RxNbr, Max(Filldate) Filldate
from Plntrak
where  Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
group by RxNbr, Filldate) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0

Commented:
My mistake, you need to add Rebill to the query

(And I'm assuming you want to show the most recent values)

``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees],
PT.Rebill
from [TransactionReport] TR
left Join (select MAX(RxNbr) RxNbr, Max(Filldate) Filldate, Max(Rebill) Rebill
from Plntrak
where  Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
group by RxNbr, Filldate) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.