Solved

SQL DISTINCT WITH JOIN

Posted on 2014-04-14
13
318 Views
Last Modified: 2014-04-15
Hi guys,

I have the below query which sort of works but I only want to return unique records based on email_1 and full_name columns from the p_contacts table?

SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         dbo.p_contacts AS e ON f.id = e.org_id
WHERE       (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window


Many thanks for your help on this matter
0
Comment
Question by:databarracks
  • 7
  • 3
  • 2
  • +1
13 Comments
 
LVL 32

Expert Comment

by:awking00
ID: 39999067
Does the following return non-unique records?
select org_id, f_name, lname, email_1
from dbo.p_contacts;
0
 

Author Comment

by:databarracks
ID: 39999080
Hi there,

No it will not return unique records I am trying out a row over partition method at the moment which would identify duplicate records based on the fields I would declare as uniqe

row_number() OVER(PARTITION BY e.email_1,e.full_name ORDER BY email_1) rn

Open in new window


This works but I don't know where to put it in my original query, this is what I have done now but don't know where to put my where clause to filter rn = 1


SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary,row_number() OVER(PARTITION BY e.email_1,e.full_name ORDER BY email_1) rn
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         dbo.p_contacts AS e ON f.id = e.org_id
WHERE       (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window

0
 

Author Comment

by:databarracks
ID: 39999121
Ok I think I figured it out this is what I did, please let me know if you have a better way, however this is working

SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary,e.rn
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         (SELECT *,row_number() OVER(PARTITION BY email_1,full_name ORDER BY email_1) rn FROM dbo.p_contacts) AS e ON f.id = e.org_id
WHERE      (rn =1) AND (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window

0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39999229
please read up this article that explains the "issue" in general, and shows the path to success:
http://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html
0
 
LVL 26

Expert Comment

by:skullnobrains
ID: 39999543
why don't you use a group by clause ?
0
 

Author Comment

by:databarracks
ID: 39999552
Hi there, could you show me how to do that because my group by clause still bring duplicate values. BEcause my full_name and email_1 field names need to be distinct within the org_id column?

No idea, how to do that?
0
[Webinar] Disaster Recovery and Cloud Management

Learn from Unigma and CloudBerry industry veterans which providers are best for certain use cases and how to lower cloud costs, how to grow your Managed Services practice in IaaS clouds, and how to utilize public cloud for Disaster Recovery

 

Author Comment

by:databarracks
ID: 39999561
Because I need to return all the rows from the p_contacts table so that I can insert them into my tblContacts table. I have tried sub queries the lot but I can't figure this out. GROUP BY won't enable me to return all fields from p_contacts table would it or.....?????
0
 
LVL 26

Assisted Solution

by:skullnobrains
skullnobrains earned 250 total points
ID: 39999579
yes, as far as i know you cannot use a group by inside a subquery (but you should try to make sure)

in that case, obviously an sqlt script would allow what you want but a simple workaround could be to build a view on p_contacts with the required group by clause and use the view in your query
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40000849
you can use group by in a subquery, but most often it's not the right thing.
my article shows exactly how to do this, by adding a "ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ...) rn  " into the subquery, and adding a RN = 1 into the joining condition.
0
 

Author Comment

by:databarracks
ID: 40000890
I have used the row_number over partition as in my earlier code ID: 39999121 but the problem with that is it won't work with my dataset in visual studio as it says the over function is not supported even though I am using sql server 2012.
0
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 250 total points
ID: 40000899
you can always "hide" the sql code complexity from your visual studio code by putting it into a view or a stored procedure (which is something I recommend, anyhow) ...
0
 

Author Comment

by:databarracks
ID: 40001029
Hi there i solved the problem using a view instead. Many thanks for all of your help guys.
0
 

Author Closing Comment

by:databarracks
ID: 40001031
Very good help
0

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Entering time in Microsoft Access can be difficult. An input mask often bothers users more than helping them and won't catch all typing errors. This article shows how to create a textbox for 24-hour time input with full validation politely catching …
How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
This is Part 3 in a 3-part series on Experts Exchange to discuss error handling in VBA code written for Excel. Part 1 of this series discussed basic error handling code using VBA. http://www.experts-exchange.com/videos/1478/Excel-Error-Handlin…
Learn how to create flexible layouts using relative units in CSS.  New relative units added in CSS3 include vw(viewports width), vh(viewports height), vmin(minimum of viewports height and width), and vmax (maximum of viewports height and width).

895 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now