How duplicate duplicate data in ms sql

Posted on 2009-12-22
Last Modified: 2012-05-08
Hello all,

is there a way to delete duplicate data in ms sql ?
Question by:abennett10
    LVL 75

    Expert Comment

    by:Aneesh Retnakaran
    LVL 22

    Accepted Solution

    is there a way to delete duplicate data in ms sql ?
    If the data is absolutely duplicated and there is not some sort of unique identifier available, then answer is "yes, but the process can be ugly."  Otherwise, use the unique identifier to delete the desired rows (although, based on there being a question here, I rather suspect that the former case is the one under discussion. ;-)
    If you have absolutely duplicate rows, i.e. there is nothing to uniquely identify one from the other, then you have a couple of options, neither of which is particularly handy or pretty.
    Option 1: Two Stage Data Copy
    1. Script out the table in question and then (after suitably modifying the table name anywhere it is mentioned in the script) create a second identical (but empty) table into which you can copy data.
    2. Add an Identity column (i.e. an INT with its Identity property set to Yes) to the new table.
    3. Now, script the old table as a SELECT statement and add the word "DISTINCT" between the word SELECT and the first column name.  (I would execute that query to see how many rows it produces and compare it to the number of rows in the table . . . it should be smaller.)
    4. Now script the new table as an INSERT statement.
    5. Next, replace everything in the INSERT script from the word VALUES to the end of the script with the SELECT script from the OLD table.
    6. Now, execute the reconfigured INSERT script.  This should move only distinct rows to the new table.
    7. Once that is done, TRUNCATE the old table.
    8. Then add an Identity column to the OLD table.
    9. Finally, copy the data from the NEW table to the OLD table.

    Option 2: Delete based on Identity and Minumum ID
    1. Add an Identity column (MyID) to the table in question.
    2. Create a query along the lines of:
      DELETE FROM yourtable
      FROM yourtablename
      GROUP BY {list all column names except the identity column});
    3. Execute the query.

    Author Closing Comment

    Thanks for your help. I did a distinct into a temporary table. That fixed it. but I like your solution.

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Free Trending Threat Insights Every Day

    Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

    Suggested Solutions

    So every once in a while at work I am asked to export data from one table and insert it into another on a different server.  I hate doing this.  There's so many different tables and data types.  Some column data needs quoted and some doesn't.  What …
    Introduction: When running hybrid database environments, you often need to query some data from a remote db of any type, while being connected to your MS SQL Server database. Problems start when you try to combine that with some "user input" pass…
    Hi everyone! This is Experts Exchange customer support.  This quick video will show you how to change your primary email address.  If you have any questions, then please Write a Comment below!
    Get a first impression of how PRTG looks and learn how it works.   This video is a short introduction to PRTG, as an initial overview or as a quick start for new PRTG users.

    760 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    12 Experts available now in Live!

    Get 1:1 Help Now