Solved

SQL DISTINCT WITH JOIN

Posted on 2014-04-14
13
326 Views
Last Modified: 2014-04-15
Hi guys,

I have the below query which sort of works but I only want to return unique records based on email_1 and full_name columns from the p_contacts table?

SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         dbo.p_contacts AS e ON f.id = e.org_id
WHERE       (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window


Many thanks for your help on this matter
0
Comment
Question by:databarracks
  • 7
  • 3
  • 2
  • +1
13 Comments
 
LVL 32

Expert Comment

by:awking00
ID: 39999067
Does the following return non-unique records?
select org_id, f_name, lname, email_1
from dbo.p_contacts;
0
 

Author Comment

by:databarracks
ID: 39999080
Hi there,

No it will not return unique records I am trying out a row over partition method at the moment which would identify duplicate records based on the fields I would declare as uniqe

row_number() OVER(PARTITION BY e.email_1,e.full_name ORDER BY email_1) rn

Open in new window


This works but I don't know where to put it in my original query, this is what I have done now but don't know where to put my where clause to filter rn = 1


SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary,row_number() OVER(PARTITION BY e.email_1,e.full_name ORDER BY email_1) rn
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         dbo.p_contacts AS e ON f.id = e.org_id
WHERE       (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window

0
 

Author Comment

by:databarracks
ID: 39999121
Ok I think I figured it out this is what I did, please let me know if you have a better way, however this is working

SELECT        f.customer_ID, e.f_name, e.l_name, e.f_name + ' ' + e.l_name AS full_name_p, NULL AS JobTitle, e.tel_1, e.tel_2, e.mobile_1, e.email_1, e.email_2, 
                         5 AS TypeID, 'Auto added Test' AS Note, GETDATE() AS CreatedDate, 0 AS IsInvoice, NULL AS Photo, 0 AS IsAcc, 0 AS IsPrimary,e.rn
FROM            dbo.p_companies AS f RIGHT OUTER JOIN
                         (SELECT *,row_number() OVER(PARTITION BY email_1,full_name ORDER BY email_1) rn FROM dbo.p_contacts) AS e ON f.id = e.org_id
WHERE      (rn =1) AND (f.customer_ID =36) AND (e.f_name NOT LIKE 'Customer') AND (e.email_1 IS NOT NULL) AND (e.email_1 <> '') AND (e.f_name IS NOT NULL) AND (NOT EXISTS
                             (SELECT        contact_ID, customer_ID, contact_first_name, contact_last_name, contact_full_name, contact_job_title, contact_tel1, contact_tel2, contact_mobile, 
                                                         contact_mail, contact_alt_mail, contact_type_ID, contact_note, contact_created_date, contact_is_invoice, contact_photo, contact_is_primary, 
                                                         contact_is_acc
                               FROM            dbo.tblContacts
                               WHERE        (contact_mail = e.email_1) AND (contact_full_name = e.f_name + ' ' + e.l_name)));

Open in new window

0
Guide to Performance: Optimization & Monitoring

Nowadays, monitoring is a mixture of tools, systems, and codes—making it a very complex process. And with this complexity, comes variables for failure. Get DZone’s new Guide to Performance to learn how to proactively find these variables and solve them before a disruption occurs.

 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39999229
please read up this article that explains the "issue" in general, and shows the path to success:
http://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html
0
 
LVL 27

Expert Comment

by:skullnobrains
ID: 39999543
why don't you use a group by clause ?
0
 

Author Comment

by:databarracks
ID: 39999552
Hi there, could you show me how to do that because my group by clause still bring duplicate values. BEcause my full_name and email_1 field names need to be distinct within the org_id column?

No idea, how to do that?
0
 

Author Comment

by:databarracks
ID: 39999561
Because I need to return all the rows from the p_contacts table so that I can insert them into my tblContacts table. I have tried sub queries the lot but I can't figure this out. GROUP BY won't enable me to return all fields from p_contacts table would it or.....?????
0
 
LVL 27

Assisted Solution

by:skullnobrains
skullnobrains earned 250 total points
ID: 39999579
yes, as far as i know you cannot use a group by inside a subquery (but you should try to make sure)

in that case, obviously an sqlt script would allow what you want but a simple workaround could be to build a view on p_contacts with the required group by clause and use the view in your query
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 40000849
you can use group by in a subquery, but most often it's not the right thing.
my article shows exactly how to do this, by adding a "ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ...) rn  " into the subquery, and adding a RN = 1 into the joining condition.
0
 

Author Comment

by:databarracks
ID: 40000890
I have used the row_number over partition as in my earlier code ID: 39999121 but the problem with that is it won't work with my dataset in visual studio as it says the over function is not supported even though I am using sql server 2012.
0
 
LVL 143

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 250 total points
ID: 40000899
you can always "hide" the sql code complexity from your visual studio code by putting it into a view or a stored procedure (which is something I recommend, anyhow) ...
0
 

Author Comment

by:databarracks
ID: 40001029
Hi there i solved the problem using a view instead. Many thanks for all of your help guys.
0
 

Author Closing Comment

by:databarracks
ID: 40001031
Very good help
0

Featured Post

NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

If you find yourself in this situation “I have used SELECT DISTINCT but I’m getting duplicates” then I'm sorry to say you are using the wrong SQL technique as it only does one thing which is: produces whole rows that are unique. If the results you a…
Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
This is Part 3 in a 3-part series on Experts Exchange to discuss error handling in VBA code written for Excel. Part 1 of this series discussed basic error handling code using VBA. http://www.experts-exchange.com/videos/1478/Excel-Error-Handlin…
Exchange organizations may use the Journaling Agent of the Transport Service to archive messages going through Exchange. However, if the Transport Service is integrated with some email content management application (such as an antispam), the admini…

685 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question