Solved

SQL Query find and replace duplicates with a "D" added to primary key

Posted on 2014-04-11
11
257 Views
Last Modified: 2014-04-11
I wanted to know a way to find duplicates in a query and replace them with a case statement that adds a "D" or anything to note that there are 2 instances of the primary key.

Basically I want to show all records from table CTE, but I want it to replace any records that have 2 instances of the same name and birth date with a "D" in the contactID field. Where would I add a case when statement?

;with cte as (
      select      C.Contact_ID, C.First_Name, C.Last_Name, C.Date_Of_Birth, /* C.other_fields... , */
                  O.Group_Name, /* O.other_fields... , */
                  P.End_Date
      from      pm.Contacts C
      inner      join pm.Patients P
      on            P.Patient_ID = C.Contact_ID
      inner      join pm.Policies O
      on            O.Contact_ID = C.Contact_ID
      where      P.End_Date is null
)
select      *
from      cte T1
where      exists (
                  -- check for duplicates i.e. a record with
                  -- different ID and same name and birth date
                  select      *
                  from      cte T2
                  where      T2.First_Name = T1.First_Name
                  and            T2.Last_Name = T1.Last_Name
                  and            T2.Date_Of_Birth = T1.Date_Of_Birth
                  and            T2.Contact_ID != T1.Contact_ID
            )
0
Comment
Question by:db_hopewell
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +2
11 Comments
 
LVL 11

Accepted Solution

by:
John_Vidmar earned 500 total points
ID: 39994764
Looks like you are using T-SQL (MS SQL Server), which contains row_number function, allows you to number the duplicates.  Partitioning tells row_number to restart the numbering at 1 when there is a grouping change:
with cte as
(	SELECT	C.Contact_ID
	,	C.First_Name
	,	C.Last_Name
	,	C.Date_Of_Birth
	,	/* other fields you need */
	,	rn = row_number() OVER (PARTITION BY First_Name, Last_Name, Date_Of_Birth ORDER BY Contact_ID)
	FROM	pm.Contacts	C
	JOIN	pm.Patients	P	on	P.Patient_ID = C.Contact_ID
	JOIN	pm.Policies	O	on	O.Contact_ID = C.Contact_ID
	WHERE	P.End_Date IS NULL
)
SELECT	Contact_ID = CAST(Contact_ID as varchar) + CASE WHEN rn = 1 THEN '' else 'D' END
,	/* other fields you need */
FROM      cte

Open in new window

0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39994774
Which database?

I assume the Primary key hasn't been created yet or there would not be duplicates?

What if there are 3 duplicates?  Adding a 'D' would still create a duplicate.

That said look at the following example.  Col1 will be the 'new' PK once the dupes are removed.

Assuming Oracle
drop table tab1 purge;
create table tab1 (col1 varchar2(5), Col2 char(1));

insert into tab1 values('1','A');
insert into tab1 values('2','B');
insert into tab1 values('1','C');
insert into tab1 values('3','D');
insert into tab1 values('3','E');
commit;


update tab1 set col1 = col1 || 'D' where rowid in
(
	select max(rowid) from tab1 group by col1 having count(*) > 1
);

Open in new window

0
 
LVL 34

Expert Comment

by:ste5an
ID: 39994777
that there are 2 instances of the primary key.
This is not possible. Or it is not a primary key..
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 

Author Comment

by:db_hopewell
ID: 39994824
ok let me rephrase. I wrote a complex query and want to find the duplicates and mark them and add a "D" on the end in the ID field.  Is there a different way than  John_Vidmar's row number and partitioning within sql above? Something like a case statement?  The system that I am sending the file to, would create 2 records of the same person with different ID's. One with the 'id' and one with the 'id' + 'D'
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39994827
row_number is probably how I would check.  It is a common function in many database products these days.

You can easily take the row_number output and add it to a case statement to add the 'D'.  You should then be able to make that an update statement.

Knowing the database you are targeting will help us provide working examples.
0
 

Author Comment

by:db_hopewell
ID: 39994832
Microsoft SQL Server and I am going to run this in integration services to produce an output txt file for another proprietary system to read in nightly.
0
 
LVL 34

Expert Comment

by:ste5an
ID: 39994848
Why not using DISTINCT to eliminate them?
0
 
LVL 11

Expert Comment

by:John_Vidmar
ID: 39994853
If you don't have row_number function then you could use a correlated-subquery:
with cte as
(	SELECT	C.Contact_ID
	,	/* other fields you need */
	,	rn =	(	SELECT	count(*)
				FROM	Contacts
				WHERE	First_Name = C.First_Name
				AND	Last_Name = C.Last_Name
				AND	Date_Of_Birth = C.Date_Of_Birth
				AND	Contact_ID < = C.Contact_ID
			)
	FROM	pm.Contacts	C
	JOIN	pm.Patients	P	on	P.Patient_ID = C.Contact_ID
	JOIN	pm.Policies	O	on	O.Contact_ID = C.Contact_ID
	WHERE	P.End_Date IS NULL
)
SELECT	Contact_ID = CAST(Contact_ID as varchar) + CASE WHEN rn = 1 THEN '' else 'D' END
,	/* other fields you need */
FROM      cte
                                                                             

Open in new window

0
 

Author Closing Comment

by:db_hopewell
ID: 39994869
Thanks for being patient with me. That will work for now.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39994916
Assuming that the Contact_ID is the primary key (and unique) and that you want to concatenate a 'D' at the end of it if there are matching names and birthdates in other records, the following should work:
with cte as
(select first_name fname, last_name lname, date_of_birth dob, count(*) cnt
 from pm.contacts
 group by first_name, last_name, date_of_birth)
select  case when cte.cnt = 1 then C.Contact_ID
             else c.Contact_ID||'D'
        end as Contact_ID,
 C.First_Name, C.Last_Name, C.Date_Of_Birth, /* C.other_fields... , */
                  O.Group_Name, /* O.other_fields... , */
                  P.End_Date
      from      pm.Contacts C
      inner      join pm.Patients P
      on            P.Patient_ID = C.Contact_ID
      inner      join pm.Policies O
      on            O.Contact_ID = C.Contact_ID
      inner      join cte on
                    c.first_name = cte.fname and
                    c.last_name = cte.lname and
                    c.date_of_birth = cte.dob
      where      P.End_Date is null
The above is for Oracle but changing the concatenation operator from "||" to "+" should work for SQL Server.
0

Featured Post

Enroll in May's Course of the Month

May’s Course of the Month is now available! Experts Exchange’s Premium Members and Team Accounts have access to a complimentary course each month as part of their membership—an extra way to increase training and boost professional development.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Can we attach PDF to table 2 41
ms sql and asp dates 5 39
EF5 How do I stop pre-compiled views? 8 46
Creating Scalar Function 3 14
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how the fundamental information of how to create a table.

740 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question