How to copy Rows with duplicate IDs to a separate table and then delete them from the current table.

Posted on 2014-08-05
Last Modified: 2014-08-06
What SQL Statement would find duplicate SID rows, copy the entire row of data to a separate table, then delete them from the current table?

9767 Rows
105 =  count (SID) > 1  

Sample Data:

        SID                RD           TN   TU  DOS  ODB      IDB
      2400200        6/28/2012      1      0      16      22              0
      2400200        7/12/2012      1      0      30      22              0
      2400202        7/12/2012      1      71      30      0              0
      2400206        7/12/2012      1      0      30      1037      0
      2400208        7/12/2012      1      52      30      0              0
      2400210        7/12/2012      1      66      30      0              0
Question by:H2Omike
    LVL 31

    Expert Comment

    I think it might be best to do this in code.  Set up a recordset, sorted by SID, and go record by record, using code (something like the code sample below, though you would need to modify it) to determine whether the current SID is a duplicate.  If it is a duplicate, add a record to the other table, and delete the current record.  
       Set rstSource = CurrentDb.OpenRecordset(strQuery, dbOpenDynaset)
       Set rstTarget = CurrentDb.OpenRecordset(strTable, dbOpenDynaset)
       strPrevLastName = ""
       lngLastNameCount = 1
       With rstSource
           Do While Not .EOF
             strThisLastName = ![LastName]
             strFullName = ![FullName]
             Debug.Print "Current LastName: " & strThisLastName
             Debug.Print "Current FullName: " & strFullName
             If strThisLastName <> strPrevLastName Then
                'New LastName
                Debug.Print "On new LastName record"
             ElseIf strThisLastName = strPrevLastName Then
               'Same LastName
                Debug.Print "On another record for same last name"
                With rstTarget
                   ![FullName] = strFullName
                   lngOutputID = ![OutputID]
                End With
                'Delete current record
              End If
             lngLastNameCount = lngLastNameCount + 1
             strPrevLastName = strThisLastName
       End With
       Set rstSource = Nothing
       Set rstTarget = Nothing

    Open in new window


    Author Comment

    I found a way, not sure it's the most efficient- three separate statements:

    INTO T1_1206_NUM_DUPS
    FROM T1_1206
    HAVING ( COUNT(SID) > 1 )

    SELECT T1_1206.*
    INTO T1_1206_DUPS
    FROM T1_1206_NUM_DUPS
    Left JOIN T1_1206
    ON T1_1206_NUM_DUPS.SID=T1_1206.SID

    DELETE FROM T1_1206
          (SELECT T1_1206_NUM_DUPS.SID
           FROM T1_1206_NUM_DUPS
          WHERE T1_1206_NUM_DUPS.SID = T1_1206.SID)
    LVL 68

    Expert Comment

    You're deleting all duplicates.  Is it possible you want to retain one of them in the original table?

    Also, for the DELETE, I'd check that the row exists in the export table,  T1_1206_DUPS, rather than the keys-only table, T1_1206_NUM_DUPS, to insure that I had a full copy of the data before it was deleted.
    LVL 24

    Accepted Solution

    You can combine first two queries into one:
    SELECT T1_1206.*
    INTO T1_1206_DUPS
    FROM T1_1206
    FROM T1_1206
    HAVING ( COUNT(SID) > 1 ) T1_1206_NUM_DUPS
    ON T1_1206_NUM_DUPS.SID=T1_1206.SID

    Open in new window

    Now, when you are deleting the rows you would want to retain at least one of the rows that was duplicate. Looking at the sample data you have provided I can see that for duplicate entries you have different RD dates. So, assuming that the record with the latest RD is the one that needs to be retained modify your delete statement to this:
    DELETE FROM T1_1206
          (SELECT T1_1206_NUM_DUPS.SID
           FROM T1_1206_NUM_DUPS
          WHERE T1_1206_NUM_DUPS.SID = T1_1206.SID)
    AND  T1_1206.RD <> 
          WHERE T1_1206_NUM_DUPS.SID = T1_1206.SID)

    Open in new window

    LVL 33

    Expert Comment

    As others have already mentioned, you probably want to keep at least one of the duplicates.  The problem is determining which one to keep.  And the second problem is keeping duplicates from reoccurring.  A simple way to eliminate duplicates as long as you don't care which of the "duplicate" records survive is to rename the existing table.  Create a new table using the original name and copying just the schema but no data.  Then modify the new version to have a unique index on SID (or make it the primary key).  Then run an append query.  At the end, you'll get a message telling you that some records weren't added.  Just say OK continue and the unique values will be added and the duplicates will be discarded.

    Using the above method also solves the second problem since your new table now has a unique index to prevent a recurrence.

    Warning - DANGER!  DANGER!  Whenever you do something like this, your first step will always be to make one or more backups of the database being modified.  In this particular case, your second step MUST be to turn off Name Auto Correct or you will be really unhappy with the results.

    Featured Post

    IT, Stop Being Called Into Every Meeting

    Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

    Join & Write a Comment

    A simple tool to export all objects of two Access files as text and compare it with Meld, a free diff tool.
    The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
    What’s inside an Access Desktop Database. Will look at the basic interface, Navigation Pane (Database Container), Tables, Queries, Forms, Report, Macro’s, and VBA code.
    Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

    733 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    17 Experts available now in Live!

    Get 1:1 Help Now