Solved

Find and Mark Duplicate Records in SQL table

Posted on 2011-03-02
Last Modified: 2012-05-11
Hello,  I am working in SQL Server 2005.  I am trying to mark duplicate records. Let's say I have these records in a table:

  John Bobbit
  John Wayne
  Mark McSteele
  John Bobbit
  Rosario Jake
  Mark McSteele
  Maria Mandrake
  John Willis
  Mark McSteele
  John Bobbit
  John Wayne

First I would need to find which ones contain duplicates.  Then, I'd need to mark them with a number sequence based on the number of duplicates.  For example, in this list John Bobbit, Mark McSteele, and John Wayne have duplicates.  John Bobbit appears 3 times, Mark McSteele also appears 3 times, and John Wayne appears twice.

I want to update a field in this table so they get marked like this:


1  John Bobbit
1  John Wayne
1  Mark McSteele
2  John Bobbit
1  Rosario Jake
2  Mark McSteele
1  Maria Mandrake
1  John Willis
3  Mark McSteele
3  John Bobbit
2  John Wayne

As you can see there's a 1, 2, and 3 mark for both John Bobbit and Mark McSteele; 1 and 2 for John Wayne; but the rest are marked as 1 because they don't have duplicates.

Any ideas?

Thanks!



Question by:TheUndecider

Author Comment

by:TheUndecider
ID: 35022789
I'd also like to point out that all of these records have a unique ID that could be used to update the mark field.

Expert Comment

by:Sharath
ID: 35022820
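Try this. row_number() restarts at 1 for each distinct value of your_column, so every repeat of a value gets the next number in its own sequence.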
select row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
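
For the first step in the question (just listing which names occur more than once, and how many times), a GROUP BY with HAVING is usually enough. This is only a sketch: dbo.People and full_name are hypothetical names standing in for your_table and your_column.

```sql
-- Hypothetical table and column names; substitute your own.
-- Lists each name that appears more than once, with its row count.
SELECT full_name,
       COUNT(*) AS copies
  FROM dbo.People
 GROUP BY full_name
HAVING COUNT(*) > 1
 ORDER BY full_name;
```

With the sample data from the question, this should list John Bobbit (3), John Wayne (2) and Mark McSteele (3).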


Accepted Solution

by:Sharath (earned 500 total points)
ID: 35022848
To get a unique ID as well, you can try this. You can use this query in your UPDATE statement.
select row_number() over (order by your_column) as unique_id,
       row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
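
One way to wire this into an UPDATE on SQL Server 2005 is an updatable common table expression. This is a minimal sketch, not the exact statement from the thread: it assumes a hypothetical table dbo.People with the existing unique ID column (id), the name column (full_name) and the field to be marked (dup_mark). Ordering by id inside each partition makes the numbering repeatable, which ordering by the partition column itself does not guarantee.

```sql
-- Hypothetical names: dbo.People(id, full_name, dup_mark).
-- The leading semicolon ends any previous statement; WITH requires that.
;WITH numbered AS (
    SELECT dup_mark,
           ROW_NUMBER() OVER (PARTITION BY full_name  -- restart at 1 for each name
                              ORDER BY id)            -- earliest id gets 1
               AS seq_num
      FROM dbo.People
)
UPDATE numbered            -- updating the CTE updates the underlying table
   SET dup_mark = seq_num;
```

Rows whose name occurs only once get dup_mark = 1, and repeated names are numbered 1, 2, 3 in id order, which matches the marking described in the question if the ids follow the order of the list.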


Author Closing Comment

by:TheUndecider
ID: 35028692
Thanks for your answer.  This is exactly what I was looking for.