Solved

# Issue on Join Query - SQL

Posted on 2011-03-06
255 Views
SQL 2008 / 2005

On executing  Table : TransactionReport -> Total # of rows affected is 671

After Joining with Plntrak PT : - Through Column : Rxnbr

I am getting 713 Rows.

Expected output is only 671. Where does this duplication occurs ? How to find these duplicate rows.
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
Left Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
``````
0
Question by:chokka
• 4
• 3
• 2
• +3

LVL 3

Expert Comment

The duplication is due to there being more than one row in Plntrak which matches a row on TransactionReport
e.g.
TR.RxNbr            PT.RxNbr
0                         0                   unique match
1                                              no match
2                         2
3                         2                   non-unique matches
2
3                   unique match

To find the duplicates you need to join the tables and do a count where the match is greater than 1

``````SELECT TR.RxNbr
FROM [TransactionReport] TR
Left Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
GROUP BY TR.RxNbr
HAVING COUNT(*) >1
``````
0

LVL 32

Expert Comment

Left join will get you all the records in the Report table

``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
inner Join Plntrak PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
``````
0

LVL 75

Expert Comment

Left join will get you all the records in the Report table<<
I am afraid that is not correct based on the definition of a LEFT JOIN.  Let me know if you want me to elaborate.
0

LVL 75

Accepted Solution

Anthony Perkins earned 100 total points
Actually I misread what you stated.  But while it is true that it will return all the rows in the TransactionReport table, that is not the problem.  We know that there are 617 rows in that table.  The problem is that as suggested before (http:#a35047455) there is "more than one row in Plntrak which matches a row on TransactionReport"
0

Author Comment

ewangoya:- I am confused on your suggestion.

For the Table : TransactionReport we need " PT.Rebill " Value for all the RxNbr which is there in Transaction Report.  Suppose if Plntrak PT  doesnt have the matching RxNbr, we just need atleast NULL Value for the Column PT.Rebill.

So, i decided to go for Left Join.

If i use Inner Join - I believe only matching RxNbr rows will be fetched.

0

LVL 40

Assisted Solution

Sharath earned 100 total points
As kraiven asked, do you have duplicates in Plntrak  table? If so, how do you want one record for RxNbr? Do you have any date field/ unique id to identify one record?
0

LVL 3

Assisted Solution

kraiven earned 100 total points
chokka,

You asked how to find the duplicates originally - well the grouping select I posted will give you a list of all the rows in PlnTrak which have more than one row for a given RxNbr. In my solution I used you WHERE clause with its LEFT JOIN but an INNER JOIN would have sufficed.

``````SELECT TR.RxNbr
FROM [TransactionReport] TR
INNER Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
and  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
GROUP BY TR.RxNbr
HAVING COUNT(*) >1
``````
0

LVL 32

Assisted Solution

ewangoya earned 200 total points

@acperkins
You are right

My apologies, I misunderstood the question.

Now it all depends on what the author wants to do in case of multiple records in the Plntrak table
You can select a distinct record or top most record

Here is a suggestion to try
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill
from [TransactionReport] TR
left Join (select distinct RxNbr
from Plntrak
where  PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0

LVL 3

Expert Comment

ewangoya,

I'm not sure your SQL above would even execute. The outer SELECT requires PT.Rebill in it's return data but your PT has only RxNbr and you reference the "PT" alias, in the inner WHERE clause, before it is even defined.
Besides, I think chokka just wants to know why there are more than 671 rows returned, and how to find out what the extra rows are.

Perhaps some input from chokka would help?
0

LVL 4

Expert Comment

can you try this:

select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees]
,PT.Rebill

from [TransactionReport] TR
inner Join Plntrak PT
on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
where PT.Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND   YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
0

LVL 32

Expert Comment

You cant use inner join
``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees],
PT.Rebill
from [TransactionReport] TR
left Join (select MAX(RxNbr) RxNbr, Max(Filldate) Filldate
from Plntrak
where  Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
group by RxNbr, Filldate) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0

LVL 32

Assisted Solution

ewangoya earned 200 total points
My mistake, you need to add Rebill to the query

(And I'm assuming you want to show the most recent values)

``````select
TR.[Last Name],
TR.[First Name],
TR.[Last 4 Digit of SSN],
TR.[ZipCode],
TR.[County],
TR.[340B],
TR.[DOB],
TR.[Client ID],
TR.[Insurance Status],
TR.[PlanId],
TR.[Generic Code],
TR.[Drug Name],
TR.[NDC],
TR.[RxNbr],
TR.[Fill Date],
TR.[Dispensed Qty],
TR.[Refills],
TR.[Days Supplied],
TR.[Plan Allows],
TR.[Copay],
TR.[Net due From Insurance],
TR.[340B Price],
TR.[340B Packet Size],
TR.[340B Cost for this RX],
TR.[Cardinal Price],
TR.[Packet Size],
TR.[Dispensing Fees],
PT.Rebill
from [TransactionReport] TR
left Join (select MAX(RxNbr) RxNbr, Max(Filldate) Filldate, Max(Rebill) Rebill
from Plntrak
where  Filldate BETWEEN YEAR('01/01/2011')*1000 + DATEPART(dy, '01/01/2011')
AND YEAR('01/31/2011')*1000 + DATEPART(dy, '01/31/2011')
group by RxNbr, Filldate) PT on  PT.RxNbr  = CONVERT(Int, TR.RxNbr)
``````
0

## Featured Post

### Suggested Solutions

If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
In this article I will describe the Copy Database Wizard method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties