Solved

Find and Mark Duplicate Records in SQL table

Posted on 2011-03-02
Last Modified: 2012-05-11
Hello,  I am working in SQL Server 2005.  I am trying to mark duplicate records. Let's say I have these records in a table:

  John Bobbit
  John Wayne
  Mark McSteele
  John Bobbit
  Rosario Jake
  Mark McSteele
  Maria Mandrake
  John Willis
  Mark McSteele
  John Bobbit
  John Wayne

First I need to find which names have duplicates. Then I need to mark each record with a sequence number that counts its occurrences. For example, in this list John Bobbit, Mark McSteele, and John Wayne have duplicates: John Bobbit appears 3 times, Mark McSteele also appears 3 times, and John Wayne appears 2 times.

I want to update a field in this table so they get marked like this:


1  John Bobbit
1  John Wayne
1  Mark McSteele
2  John Bobbit
1  Rosario Jake
2  Mark McSteele
1  Maria Mandrake
1  John Willis
3  Mark McSteele
3  John Bobbit
2  John Wayne

As you can see, John Bobbit and Mark McSteele each get the marks 1, 2, and 3; John Wayne gets 1 and 2; and the rest are marked 1 because they have no duplicates.

Any ideas?

Thanks!



Question by:TheUndecider

Author Comment

by:TheUndecider
ID: 35022789
I'd also like to point out that all of these records have a unique ID that could be used to update the mark field.
LVL 41

Expert Comment

by:Sharath
ID: 35022820
Try this:
select row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
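
A worked sketch of the query above, run against the sample names from the question. The temp table #people and its columns id, full_name, and mark are hypothetical names chosen for illustration; the original post does not name the table or its columns. Ordering by the unique id inside each partition is an assumption based on the asker's note that every record has a unique ID. It makes the numbering deterministic, whereas ordering by the partition column itself leaves the order within a group unspecified.

```sql
-- Hypothetical sample table (names are illustrative, not from the original post).
CREATE TABLE #people (
    id        INT         NOT NULL PRIMARY KEY,
    full_name VARCHAR(50) NOT NULL,
    mark      INT         NULL
);

-- SQL Server 2005 has no multi-row VALUES, so use SELECT ... UNION ALL.
INSERT INTO #people (id, full_name)
SELECT  1, 'John Bobbit'    UNION ALL
SELECT  2, 'John Wayne'     UNION ALL
SELECT  3, 'Mark McSteele'  UNION ALL
SELECT  4, 'John Bobbit'    UNION ALL
SELECT  5, 'Rosario Jake'   UNION ALL
SELECT  6, 'Mark McSteele'  UNION ALL
SELECT  7, 'Maria Mandrake' UNION ALL
SELECT  8, 'John Willis'    UNION ALL
SELECT  9, 'Mark McSteele'  UNION ALL
SELECT 10, 'John Bobbit'    UNION ALL
SELECT 11, 'John Wayne';

-- Step 1: list the names that occur more than once, with their counts.
SELECT full_name, COUNT(*) AS occurrences
  FROM #people
 GROUP BY full_name
HAVING COUNT(*) > 1;

-- Step 2: number each occurrence within its name group, in id order.
SELECT ROW_NUMBER() OVER (PARTITION BY full_name ORDER BY id) AS seq_num,
       id,
       full_name
  FROM #people
 ORDER BY id;

DROP TABLE #people;
```

With ids assigned in the order the names are listed in the question, the second SELECT should return seq_num values 1, 1, 1, 2, 1, 2, 1, 1, 3, 3, 2, which matches the marks the asker requested.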

 
LVL 41

Accepted Solution

by:
Sharath earned 500 total points
ID: 35022848
To generate a unique ID as well, you can try this. You can use this query in your UPDATE statement.
select row_number() over (order by your_column) as unique_id,
       row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
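
One way the query might be folded into an UPDATE is through a common table expression, which SQL Server 2005 supports. This is only a sketch: the temp table #contacts and the columns id, full_name, and mark are hypothetical names, and it assumes the table already has the unique ID the asker mentioned, so the ID is used for ordering rather than being generated.

```sql
-- Hypothetical table (names are illustrative, not from the original post).
CREATE TABLE #contacts (
    id        INT         NOT NULL PRIMARY KEY,
    full_name VARCHAR(50) NOT NULL,
    mark      INT         NULL
);

INSERT INTO #contacts (id, full_name)
SELECT 1, 'John Bobbit'  UNION ALL
SELECT 2, 'John Wayne'   UNION ALL
SELECT 3, 'John Bobbit'  UNION ALL
SELECT 4, 'Rosario Jake';

-- The CTE exposes the target column next to the computed sequence number.
-- Updating the CTE writes through to the underlying table.
WITH numbered AS (
    SELECT mark,
           ROW_NUMBER() OVER (PARTITION BY full_name ORDER BY id) AS seq_num
      FROM #contacts
)
UPDATE numbered
   SET mark = seq_num;

-- Inspect the result.
SELECT id, full_name, mark
  FROM #contacts
 ORDER BY id;

DROP TABLE #contacts;
```

The statement before WITH must end with a semicolon, otherwise SQL Server reports a syntax error at the CTE. Because the CTE reads from a single table, the UPDATE needs no join back on the ID.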

 

Author Closing Comment

by:TheUndecider
ID: 35028692
Thanks for your answer.  This is exactly what I was looking for.