?
Solved

Remove duplicate results from MS SQL Query when using Joins.

Posted on 2013-05-16
18
Medium Priority
?
432 Views
Last Modified: 2013-05-30
What I need to do is take the following information and only select distinct values for oe_line.customer_part_number.

My Query in MS SQL is as follows. I've also included a "WHERE" clause for diagnostic purposes:

SELECT oe_line.customer_part_number AS 'Customer Part Number [1]', 
	invoice_hdr.po_no AS 'PO Number [2]', 
	oe_line.qty_allocated AS 'Qty Allocated [3]', 
	oe_line.line_no AS 'PO Line Number [4]', 
	invoice_line.item_id AS "Item ID / Serial # [5]", 
	invoice_line.item_desc AS "Description [6]"
	
	FROM [P21].[dbo].[oe_line]
	RIGHT OUTER JOIN [P21].[dbo].[invoice_line]
	ON oe_line.order_no = invoice_line.order_no
	RIGHT OUTER JOIN [P21].[dbo].[invoice_hdr]
	ON oe_line.order_no = invoice_hdr.order_no
	WHERE oe_line.order_no = '1003227'

Open in new window


The results obtained are as follows, truncated to give a basic idea of what is happening:

157828-1,LLSS07454,0.000000000,1,DC-750-35,Quick Coupler Check Valve Assys.
2008153,LLSS07454,2.000000000,2,DC-750-35,Quick Coupler Check Valve Assys.
B511BDB48C,LLSS07454,12.000000000,3,DC-750-35,Quick Coupler Check Valve Assys.
171902-1,LLSS07454,2.000000000,4,DC-750-35,Quick Coupler Check Valve Assys.
253443-1,LLSS07454,14.000000000,5,DC-750-35,Quick Coupler Check Valve Assys.
253444-1,LLSS07454,2.000000000,6,DC-750-35,Quick Coupler Check Valve Assys.
BM-38,LLSS07454,4.000000000,7,DC-750-35,Quick Coupler Check Valve Assys.
219P-4,LLSS07454,14.000000000,8,DC-750-35,Quick Coupler Check Valve Assys.
219P-6,LLSS07454,6.000000000,9,DC-750-35,Quick Coupler Check Valve Assys.
HARSCO AIR VALVE ASSYS.,LLSS07454,18.000000000,10,DC-750-35,Quick Coupler Check Valve Assys.
2013625,LLSS07454,0.000000000,11,DC-750-35,Quick Coupler Check Valve Assys.
250975-1,LLSS07454,0.000000000,12,DC-750-35,Quick Coupler Check Valve Assys.
157828-1,LLSS07454,0.000000000,1,2013625,P1M DUPLEX CYL.
2008153,LLSS07454,2.000000000,2,2013625,P1M DUPLEX CYL.
B511BDB48C,LLSS07454,12.000000000,3,2013625,P1M DUPLEX CYL.
171902-1,LLSS07454,2.000000000,4,2013625,P1M DUPLEX CYL.
253443-1,LLSS07454,14.000000000,5,2013625,P1M DUPLEX CYL.
253444-1,LLSS07454,2.000000000,6,2013625,P1M DUPLEX CYL.
BM-38,LLSS07454,4.000000000,7,2013625,P1M DUPLEX CYL.
219P-4,LLSS07454,14.000000000,8,2013625,P1M DUPLEX CYL.
219P-6,LLSS07454,6.000000000,9,2013625,P1M DUPLEX CYL.
HARSCO AIR VALVE ASSYS.,LLSS07454,18.000000000,10,2013625,P1M DUPLEX CYL.
2013625,LLSS07454,0.000000000,11,2013625,P1M DUPLEX CYL.
250975-1,LLSS07454,0.000000000,12,2013625,P1M DUPLEX CYL.

Open in new window



I tried grouping these results as I would if I were using MySQL and apparently MS SQL forces you to aggregate columns so this is not a good solution.
0
Comment
Question by:advantagec
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 6
  • 2
  • +2
18 Comments
 
LVL 8

Expert Comment

by:didnthaveaname
ID: 39172141
I'm a little confused by what you want to accomplish.  Can you maybe modify the output to show what you're looking for?  You only have to use aggregate functions for what is not specified in the group by clause (the set of attributes that you use to define your grouping set)
0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172159
I am not sure what you are after, but if you need only distinct customer part number then the below query will help

SELECT distinct oe_line.customer_part_number AS 'Customer Part Number [1]', 
	FROM [P21].[dbo].[oe_line]
	RIGHT OUTER JOIN [P21].[dbo].[invoice_line]
	ON oe_line.order_no = invoice_line.order_no
	RIGHT OUTER JOIN [P21].[dbo].[invoice_hdr]
	ON oe_line.order_no = invoice_hdr.order_no
	WHERE oe_line.order_no = '1003227'

Open in new window

0
 

Author Comment

by:advantagec
ID: 39172183
Neo_jarvis, I need all the other data that goes along with that Distinct customer part number.

Here is the output to your query :
157828-1
171902-1
2008153
2013625
219P-4
219P-6
250975-1
253443-1
253444-1
B511BDB48C
BM-38
HARSCO AIR VALVE ASSYS.

Open in new window

0
Get real performance insights from real users

Key features:
- Total Pages Views and Load times
- Top Pages Viewed and Load Times
- Real Time Site Page Build Performance
- Users’ Browser and Platform Performance
- Geographic User Breakdown
- And more

 

Author Comment

by:advantagec
ID: 39172185
I need 12 total results with those distinct part numbers.
0
 
LVL 8

Expert Comment

by:didnthaveaname
ID: 39172208
I understand what you're wanting now, but I'm not sure why the aggregate functions were causing an issue.  With the additional information, all of those results are not distinct, so the distinct clause won't filter them.  You can further define the granularity of your grouping clause to get what you want, though.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39172213
It appears that all of your selected columns have distinct values except for the item_description. If you want to only have one customer_part_number per line, then you need to establish the criteria to determine (e.g. min or max) which item_description you want.
0
 

Author Comment

by:advantagec
ID: 39172234
The expected output is:

Distinct:
[dbo].[invoice_line].[line_no] = PO Line Number [4]

Distinct, based on line_no:
[dbo].[invoice_line].[customer_part_number] = Customer Part Number [1]
[dbo].[invoice_line].[qty_shipped] = Qty Shipped / Allocated [3]
[dbo].[invoice_line].[item_id] = Item ID / Serial Number [5]
[dbo].[invoice_line].[item_desc] = Description [6]

Same across results:
[dbo].[invoice_hdr].[po_no] = PO Number [2]
0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172240
hello there,

I understood your problem, but still I need some more clarity.

what the result should be for the part number 253443-1 ???
253443-1,LLSS07454,14.000000000,5,2013625,P1M DUPLEX CYL.
253444-1,LLSS07454,2.000000000,6,2013625,P1M DUPLEX CYL.

Looking at your code, I guess this should be as below, the sum of the quantity,
253443-1,LLSS07454,16.000000000,5,2013625,P1M DUPLEX CYL.
if that is the case then you can use the below query

SELECT oe_line.customer_part_number AS 'Customer Part Number [1]', 
	invoice_hdr.po_no AS 'PO Number [2]', 
	sum(oe_line.qty_allocated) AS 'Qty Allocated [3]', 
	oe_line.line_no AS 'PO Line Number [4]', 
	invoice_line.item_id AS "Item ID / Serial # [5]", 
	invoice_line.item_desc AS "Description [6]"
	FROM [P21].[dbo].[oe_line]
	RIGHT OUTER JOIN [P21].[dbo].[invoice_line]
	ON oe_line.order_no = invoice_line.order_no
	RIGHT OUTER JOIN [P21].[dbo].[invoice_hdr]
	ON oe_line.order_no = invoice_hdr.order_no
	WHERE oe_line.order_no = '1003227'
group by oe_line.customer_part_number,invoice_hdr.po_no,oe_line.line_no,invoice_line.item_id,invoice_line.item_desc 

Open in new window

0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172254
it would be nice if you can also post your expected results than just explaining it through code.
0
 

Author Comment

by:advantagec
ID: 39172260
I'm working on that right now, it takes a while. I've got to parse 12 different queries and string together all the results. Stand by.
0
 

Author Comment

by:advantagec
ID: 39172347
Actual data, expected results:
customer_part_number, po_no, qty_allocated, line_no, item_id, item_desc

157828-1,LLSS07454,0.000000000,1,DC-750-35,Quick Coupler Check Valve Assys.
2008153,LLSS07454,2.000000000,2,2008153,7 Station B5 Air Valve Assembly
B511BDB48C,LLSS07454,12.000000000,3,NULL,NULL
171902-1,LLSS07454,2.000000000,4,NULL,NULL
253443-1,LLSS07454,14.000000000,5,NULL,NULL
253444-1,LLSS07454,2.000000000,6,NULL,NULL
BM-38,LLSS07454,4.000000000,7,NULL,NULL
219P-4,LLSS07454,14.000000000,8,NULL,NULL
219P-6,LLSS07454,6.000000000,9,NULL,NULL
HARSCO AIR VALVE ASSYS.,LLSS07454,18.000000000,10,NULL,NULL
2013625,LLSS07454,0.000000000,11,2013625,P1M DUPLEX CYL
250975-1,LLSS07454,0.000000000,12,932638Q,31P5QHELEMASSYVITON

Open in new window

0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172378
I hope this helps

;with cte as 
(
SELECT row_number() over (partition by oe_line.customer_part_number order by (select 1) ) as rn oe_line.customer_part_number AS 'Customer Part Number [1]', 
	invoice_hdr.po_no AS 'PO Number [2]', 
	oe_line.qty_allocated AS 'Qty Allocated [3]', 
	oe_line.line_no AS 'PO Line Number [4]', 
	invoice_line.item_id AS "Item ID / Serial # [5]", 
	invoice_line.item_desc AS "Description [6]"
	
	FROM [P21].[dbo].[oe_line]
	RIGHT OUTER JOIN [P21].[dbo].[invoice_line]
	ON oe_line.order_no = invoice_line.order_no
	RIGHT OUTER JOIN [P21].[dbo].[invoice_hdr]
	ON oe_line.order_no = invoice_hdr.order_no
	WHERE oe_line.order_no = '1003227'
)
select * from cte where rn = 1
)

Open in new window

0
 

Author Comment

by:advantagec
ID: 39172396
I'm getting

Msg 102, Level 15, State 1, Line 3
Incorrect syntax near 'oe_line'.
Msg 102, Level 15, State 1, Line 18
Incorrect syntax near ')'.
0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172421
Sorry, missed a comma in the above statement, corrected below
;with cte as 
(
SELECT row_number() over (partition by oe_line.customer_part_number order by (select 1) ) as rn, oe_line.customer_part_number AS 'Customer Part Number [1]', 
	invoice_hdr.po_no AS 'PO Number [2]', 
	oe_line.qty_allocated AS 'Qty Allocated [3]', 
	oe_line.line_no AS 'PO Line Number [4]', 
	invoice_line.item_id AS "Item ID / Serial # [5]", 
	invoice_line.item_desc AS "Description [6]"
	
	FROM [P21].[dbo].[oe_line]
	RIGHT OUTER JOIN [P21].[dbo].[invoice_line]
	ON oe_line.order_no = invoice_line.order_no
	RIGHT OUTER JOIN [P21].[dbo].[invoice_hdr]
	ON oe_line.order_no = invoice_hdr.order_no
	WHERE oe_line.order_no = '1003227'
)
select * from cte where rn = 1

Open in new window

0
 

Author Comment

by:advantagec
ID: 39172460
It's almost perfect, except for some reason the Item ID and Description just repeat over and over...

rn,Customer Part Number [1],PO Number [2],Qty Allocated [3],PO Line Number [4],Item ID / Serial # [5],Description [6]
1,157828-1,LLSS07454,0.000000000,1,DC-750-35,Quick Coupler Check Valve Assys.
1,171902-1,LLSS07454,2.000000000,4,932638Q,31P5QHELEMASSYVITON
1,2008153,LLSS07454,2.000000000,2,DC-750-35,Quick Coupler Check Valve Assys.
1,2013625,LLSS07454,0.000000000,11,932638Q,31P5QHELEMASSYVITON
1,219P-4,LLSS07454,14.000000000,8,DC-750-35,Quick Coupler Check Valve Assys.
1,219P-6,LLSS07454,6.000000000,9,932638Q,31P5QHELEMASSYVITON
1,250975-1,LLSS07454,0.000000000,12,DC-750-35,Quick Coupler Check Valve Assys.
1,253443-1,LLSS07454,14.000000000,5,932638Q,31P5QHELEMASSYVITON
1,253444-1,LLSS07454,2.000000000,6,DC-750-35,Quick Coupler Check Valve Assys.
1,B511BDB48C,LLSS07454,12.000000000,3,932638Q,31P5QHELEMASSYVITON
1,BM-38,LLSS07454,4.000000000,7,DC-750-35,Quick Coupler Check Valve Assys.
1,HARSCO AIR VALVE ASSYS.,LLSS07454,18.000000000,10,932638Q,31P5QHELEMASSYVITON

Open in new window

0
 
LVL 16

Expert Comment

by:Surendra Nath
ID: 39172470
I didn't understand what you meant, can you please elaborate, I cannot see that happening.
0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 39173745
under revision, sorry, will post a new comment later
0
 
LVL 49

Accepted Solution

by:
PortletPaul earned 1000 total points
ID: 39173797
>>It's almost perfect, except for some reason the Item ID and Description just repeat over and over...
I started a comment - then looked back over the information only to realize that I think your joins are potentially wrong, and that is why your descriptions are repeating I think.

You are dealing with order/line-entry information, yet your joins only specify order numbers (order_no = order_no)

FROM [P21].[dbo].[oe_line]
RIGHT JOIN [P21].[dbo].[invoice_line]
      ON oe_line.order_no = invoice_line.order_no
RIGHT JOIN [P21].[dbo].[invoice_hdr]
      ON oe_line.order_no = invoice_hdr.order_no

/* there is something missing I think, like this */
            AND oe_line.line_no = invoice_line.line_no


WHERE oe_line.order_no = '1003227'

Additionally, it should not be possible to have a line entry without a line and no line should exist without a header. I would suggest you turn your joins not only into inner joins but perhaps reverse the direction, from header to line, and from line to line entry.
e.g.


FROM [P21].[dbo].[invoice_hdr]
INNER JOIN [P21].[dbo].[invoice_line]  ON invoice_hdr.order_no = oe_line.order_no
INNER JOIN [P21].[dbo].[oe_line]       ON invoice_line.order_no = oe_line.order_no

/* there is something missing I think, like this */
            AND oe_line.line_no = invoice_line.line_no


WHERE oe_line.order_no = '1003227'

You really need to sort this out, then the issue of unwanted repetition may be resolved without resorting to group by or row_number().

Please also remember, the SQL reserved word DISTINCT had one meaning, that "the whole row will be unique" i.e. DISTINCT considers ALL columns.
0

Featured Post

Get real performance insights from real users

Key features:
- Total Pages Views and Load times
- Top Pages Viewed and Load Times
- Real Time Site Page Build Performance
- Users’ Browser and Platform Performance
- Geographic User Breakdown
- And more

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
In the first part of this tutorial we will cover the prerequisites for installing SQL Server vNext on Linux.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Suggested Courses

801 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question