Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Find and Delete Duplicate Records

Posted on 2014-02-16
12
Medium Priority
?
443 Views
Last Modified: 2014-02-16
I have a table called group_members:

group_membership_id int PK
contact_id int
group_id int

I have a problem where there has been duplication in the data where I have the following

group_membership_id            contact_id              group_id
1                                               100                         1000
2                                               100                         1001
3                                               101                         1000
4                                               101                         1001
5                                               100                         1000
6                                               101                         1000

Notice how rows 5 and 6 are dupes of 1 and 2.

I need to write a query to find all of these dupe and delete them.
0
Comment
Question by:ccleebelt
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 5
12 Comments
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863391
Maybe I am missing the obvious but how exactly are lines 5&6 duplicates of 1&2?

With this little bit of detail, you are going to have a difficult time finding unique ways to express what a duplicate is.
0
 

Author Comment

by:ccleebelt
ID: 39863394
line 1 and 2 have the same contact_id, group_id pairs as 5 and 6.

Ideally, we find a way to delete 5 and 6 based on them being the same as 1 and 2.
0
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863403
Ahhh... 1&3 not 1&2
So is it correct to say that the combination of the two columns should be unique? In other words, there will never be more than one entry with contact_id=100 AND group_id=1000 ?

How are you connecting to the SQL Server? Do you have MS Access available?
How many columns doe this table have? Just the three?
How many rows does the table have?

Backup your database before you do anything...
0
Ransomware-A Revenue Bonanza for Service Providers

Ransomware – malware that gets on your customers’ computers, encrypts their data, and extorts a hefty ransom for the decryption keys – is a surging new threat.  The purpose of this eBook is to educate the reader about ransomware attacks.

 

Author Comment

by:ccleebelt
ID: 39863407
Oops.  Yeah, nice catch.  

There should never be more than one row with any contact_id, group_id pair.

I connect through Management Studio

These are the only columns

About 300000 rows in the table.  Of which about 20% are dupes a such as 5 and 6 above.
0
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863413
Try this:
SELECT DISTINCT 'group_membership_id', 'contact_id', 'group_id' INTO <new_table> FROM  <your_table>

Open in new window


This will create a new table <new_table> (call it whatever you want) which will contain only the first occurrence of the duplicate entries. You can review this for correctness. If this looks good, rename the existing table (do not delete it yet) and then rename the new table to the original table name.

Be aware that the column group_membership_id may end up wrong for some records because you cannot know in what order the duplicates fall. If you are lucky the duplicates were created all at the end of the table, in which case you will be all good. You may have to doctor some of this column manually...
0
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863414
You can also use this shorter version since there are only the three columns:
SELECT DISTINCT * INTO <new_table> FROM  <your_table>

Open in new window

0
 

Author Comment

by:ccleebelt
ID: 39863416
I ran this and it creates a exact replica of the original table with the exact same duplicates.
0
 
LVL 11

Accepted Solution

by:
Gregory Miller earned 2000 total points
ID: 39863418
My bad... Remove the group_membership_id field...
SELECT DISTINCT 'contact_id', 'group_id' INTO <new_table> FROM  <your_table>

Open in new window


I wrote that before your last response...
0
 

Author Comment

by:ccleebelt
ID: 39863427
worked.  Thanks!
0
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863431
Shazaaaam!
0
 

Author Comment

by:ccleebelt
ID: 39863445
This is interesting though....I need to get the group_membership_id column on the new table and then make it the PK.

SQL will not allow me to do that.

Any pointers?
0
 
LVL 11

Expert Comment

by:Gregory Miller
ID: 39863587
You may have to remove the PK from the temp table (original that was renamed) first since the key cannot be named the same.
0

Featured Post

Survive A High-Traffic Event with Percona

Your application or website rely on your database to deliver information about products and services to your customers. You can’t afford to have your database lose performance, lose availability or become unresponsive – even for just a few minutes.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to shrink a transaction log file down to a reasonable size.

721 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question