Solved

Find and Mark Duplicate Records in SQL table

Posted on 2011-03-02
Last Modified: 2012-05-11
Hello,  I am working in SQL Server 2005.  I am trying to mark duplicate records. Let's say I have these records in a table:

  John Bobbit
  John Wayne
  Mark McSteele
  John Bobbit
  Rosario Jake
  Mark McSteele
  Maria Mandrake
  John Willis
  Mark McSteele
  John Bobbit
  John Wayne

First I would need to find which ones contain duplicates.  Then, I'd need to mark them with a number sequence based on the number of duplicates.  For example, in this list
      John Bobbit,
      Mark McSteele, and
      John Wayne have duplicates.  
John Bobbit has 3, Mark McSteele also has 3, but John Wayne has 2.

I want to update a field in this table so they get marked like this:


1  John Bobbit
1  John Wayne
1  Mark McSteele
2  John Bobbit
1  Rosario Jake
2  Mark McSteele
1  Maria Mandrake
1  John Willis
3  Mark McSteele
3  John Bobbit
2  John Wayne

As you can see there's a 1, 2, and 3 mark for both John Bobbit and Mark McSteele; 1 and 2 for John Wayne; but the rest are marked as 1 because they don't have duplicates.

Any ideas?

Thanks!



Question by:TheUndecider
Author Comment

by:TheUndecider
ID: 35022789
I'd also like to point out that all of these records have a unique ID that could be used to update the mark field.

Expert Comment

by:Sharath
ID: 35022820
Try this:
-- Number the rows within each group of identical values (1, 2, 3, ...).
select row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
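
The query above numbers the rows but does not by itself list which names repeat, which was the first step in the question. A minimal sketch of that step, assuming a hypothetical table variable @people with columns id and full_name (these names are not from the original post):

```sql
-- Hypothetical sample data; @people, id and full_name are assumed names.
DECLARE @people TABLE (id INT PRIMARY KEY, full_name VARCHAR(50));

-- SQL Server 2005 has no multi-row VALUES clause, so rows are added with UNION ALL.
INSERT INTO @people (id, full_name)
SELECT  1, 'John Bobbit'    UNION ALL
SELECT  2, 'John Wayne'     UNION ALL
SELECT  3, 'Mark McSteele'  UNION ALL
SELECT  4, 'John Bobbit'    UNION ALL
SELECT  5, 'Rosario Jake'   UNION ALL
SELECT  6, 'Mark McSteele'  UNION ALL
SELECT  7, 'Maria Mandrake' UNION ALL
SELECT  8, 'John Willis'    UNION ALL
SELECT  9, 'Mark McSteele'  UNION ALL
SELECT 10, 'John Bobbit'    UNION ALL
SELECT 11, 'John Wayne';

-- List each name that occurs more than once, together with its count.
SELECT full_name, COUNT(*) AS occurrences
  FROM @people
 GROUP BY full_name
HAVING COUNT(*) > 1
 ORDER BY full_name;
```

With the sample rows from the question this should return John Bobbit (3), John Wayne (2), and Mark McSteele (3).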


Accepted Solution

by:Sharath (earned 2000 total points)
ID: 35022848
To generate a unique ID as well, you can try this. You can use this query in your UPDATE statement.
select row_number() over (order by your_column) as unique_id,
       row_number() over (partition by your_column order by your_column) as seq_num,
       your_column
  from your_table
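
One way the query could be plugged into an UPDATE, sketched here under assumptions: SQL Server 2005 allows updating through a common table expression, so the sequence number can be written straight into the mark field. The table variable @people and the columns id, full_name, and mark are hypothetical names, and the existing unique ID the author mentioned is used for ordering rather than a generated one.

```sql
-- Hypothetical sample data; @people, id, full_name and mark are assumed names.
DECLARE @people TABLE (
    id        INT PRIMARY KEY,
    full_name VARCHAR(50),
    mark      INT NULL
);

-- SQL Server 2005 has no multi-row VALUES clause, so rows are added with UNION ALL.
INSERT INTO @people (id, full_name)
SELECT  1, 'John Bobbit'    UNION ALL
SELECT  2, 'John Wayne'     UNION ALL
SELECT  3, 'Mark McSteele'  UNION ALL
SELECT  4, 'John Bobbit'    UNION ALL
SELECT  5, 'Rosario Jake'   UNION ALL
SELECT  6, 'Mark McSteele'  UNION ALL
SELECT  7, 'Maria Mandrake' UNION ALL
SELECT  8, 'John Willis'    UNION ALL
SELECT  9, 'Mark McSteele'  UNION ALL
SELECT 10, 'John Bobbit'    UNION ALL
SELECT 11, 'John Wayne';

-- Number the rows within each name by id, then write that number into mark.
-- Updating the CTE updates the underlying table.
WITH numbered AS (
    SELECT mark,
           ROW_NUMBER() OVER (PARTITION BY full_name ORDER BY id) AS seq_num
      FROM @people
)
UPDATE numbered
   SET mark = seq_num;

-- Show the result in the original row order.
SELECT mark, full_name
  FROM @people
 ORDER BY id;
```

Ordering by the unique ID inside each partition makes the numbering repeatable; ordering by the partitioned column itself leaves the order within each group up to the server. With the sample rows in this order, the marks should match the list in the question (1, 1, 1, 2, 1, 2, 1, 1, 3, 3, 2).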


Author Closing Comment

by:TheUndecider
ID: 35028692
Thanks for your answer.  This is exactly what I was looking for.