Solved

deduplicate SQL 2005 table on one field

Posted on 2014-01-06
2
392 Views
Last Modified: 2014-01-06
hello all! I have a table with 3 fields, let's just call them field1, email, and field3. So I have duplicate email addresses throughout the table. I want to deduplicate this table based on email. I don't care what records are removed as long as the table is trimmed down to non-duplicate email addresses. Is there a way I can do this? Thanks!
0
Comment
Question by:mrosier
2 Comments
 
LVL 59

Accepted Solution

by:
Kevin Cross earned 500 total points
ID: 39759617
Yes, there are several ways to accomplish this.  One way, leveraging the ROW_NUMBER() and OVER() clause in SQL 2005, is to rank your records 1 to n within a derived query or common table expression, then DELETE records with rank > 1 in outer query.

Using this method, you can use the ordering of the other pieces of information to determine the retention.  For example, if field1 is an integer that increments, you may want the lowest ID value (i.e., the first entry).  Therefore, you could ORDER BY field1.  If you want the last, you could ORDER BY field1 DESC.  If it truly does not matter, you can make the ORDER BY random.

For example:
;WITH cte(field1, email, field3, [rank]) AS (
    SELECT field1, email, field3
         , ROW_NUMBER() OVER(PARTITION BY email ORDER BY field1)
    FROM your_table_name
)
DELETE
FROM cte
WHERE [rank] > 1
;

Open in new window


I hope that helps!

P.S. I would run the inner query first, so you understand what the ranking does.  Second, you can change "DELETE" to "SELECT field1, email, field3, [rank]" so you can see what records will be removed before issuing the deletion.

Best regards,

Kevin
0
 

Author Comment

by:mrosier
ID: 39759667
excellent, it really is truly a non-issue as to what records stay as long as email addresses in the table are unique. And that is how it usually is with data deduplication for me. I appreciate the help, thanks!
0

Featured Post

Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

by Mark Wills Attending one of Rob Farley's seminars the other day, I heard the phrase "The Accidental DBA" and fell in love with it. It got me thinking about the plight of the newcomer to SQL Server...  So if you are the accidental DBA, or, simp…
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
Migrating to Microsoft Office 365 is becoming increasingly popular for organizations both large and small. If you have made the leap to Microsoft’s cloud platform, you know that you will need to create a corporate email signature for your Office 365…
Video by: Mark
This lesson goes over how to construct ordered and unordered lists and how to create hyperlinks.

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now