How duplicate duplicate data in ms sql

Hello all,

is there a way to delete duplicate data in ms sql ?
abennett10Asked:
Who is Participating?
 
8080_DiverConnect With a Mentor Commented:
is there a way to delete duplicate data in ms sql ?
If the data is absolutely duplicated and there is not some sort of unique identifier available, then answer is "yes, but the process can be ugly."  Otherwise, use the unique identifier to delete the desired rows (although, based on there being a question here, I rather suspect that the former case is the one under discussion. ;-)
If you have absolutely duplicate rows, i.e. there is nothing to uniquely identify one from the other, then you have a couple of options, neither of which is particularly handy or pretty.
Option 1: Two Stage Data Copy
  1. Script out the table in question and then (after suitably modifying the table name anywhere it is mentioned in the script) create a second identical (but empty) table into which you can copy data.
  2. Add an Identity column (i.e. an INT with its Identity property set to Yes) to the new table.
  3. Now, script the old table as a SELECT statement and add the word "DISTINCT" between the word SELECT and the first column name.  (I would execute that query to see how many rows it produces and compare it to the number of rows in the table . . . it should be smaller.)
  4. Now script the new table as an INSERT statement.
  5. Next, replace everything in the INSERT script from the word VALUES to the end of the script with the SELECT script from the OLD table.
  6. Now, execute the reconfigured INSERT script.  This should move only distinct rows to the new table.
  7. Once that is done, TRUNCATE the old table.
  8. Then add an Identity column to the OLD table.
  9. Finally, copy the data from the NEW table to the OLD table.

Option 2: Delete based on Identity and Minumum ID
  1. Add an Identity column (MyID) to the table in question.
  2. Create a query along the lines of:
    DELETE FROM yourtable
    WHERE  MyID NOT IN
    (SELECT MIN(MyID)
    FROM yourtablename
    GROUP BY {list all column names except the identity column});
  3. Execute the query.
0
 
Aneesh RetnakaranDatabase AdministratorCommented:
0
 
abennett10Author Commented:
Thanks for your help. I did a distinct into a temporary table. That fixed it. but I like your solution.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.