daffeyduck
asked on
Eliminate duplicate records? Identity column used.
I have a large database that was setup with a record_id that was suppose to eliminate duplicates. However, when they received new updates someone didn't do it correctly and now there are duplicates with the same name and address, etc fields and different record_id's. How would be the easiest way to remove the full duplicate records keeping the higher record_id record from the duplicates?
This was what someone had said to do, was create a new database with the below query and then delete the old database and rename the new one and recreate the index's.
INSERT INTO Sample_BackUp
SELECT *
FROM Sample
WHERE record_id NOT IN ( SELECT MAX( record_id )
FROM [Sample]
GROUP BY [Sample].name,[Sample].add ress )
The other issue is the record_id column is an Identity column with Identity Seed 1, and Identity Increment 1. If I delete the duplicate records will all the records change there record_id numbers. All of the records that I keep in the new dedupped database need to retain there original record_id number.
Please any assistance would be greatly appreciated. Looking for the quickest and easiest way.
Thank you.....
daffeyduck
This was what someone had said to do, was create a new database with the below query and then delete the old database and rename the new one and recreate the index's.
INSERT INTO Sample_BackUp
SELECT *
FROM Sample
WHERE record_id NOT IN ( SELECT MAX( record_id )
FROM [Sample]
GROUP BY [Sample].name,[Sample].add
The other issue is the record_id column is an Identity column with Identity Seed 1, and Identity Increment 1. If I delete the duplicate records will all the records change there record_id numbers. All of the records that I keep in the new dedupped database need to retain there original record_id number.
Please any assistance would be greatly appreciated. Looking for the quickest and easiest way.
Thank you.....
daffeyduck
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
The simplest way to eliminate the duplicate records is to SELECT DISTINCT into a temporary table, truncate the original table and SELECT the records back into the original table. That query looks like this:
SELECT DISTINCT * INTO Sample_Backup
FROM Sample
TRUNCATE TABLE Sample
INSERT SAMPLE
SELECT * FROM Sample_Backup
DROP TABLE Sample_Backup
SELECT DISTINCT * INTO Sample_Backup
FROM Sample
TRUNCATE TABLE Sample
INSERT SAMPLE
SELECT * FROM Sample_Backup
DROP TABLE Sample_Backup
Hi,
d3lph1:
As I understand the question, this wont do what is wanted as the duplicate records have different ID's.
My guess is that you need to delete the duplicate iwth the HIGHEST ID as this is the last record inserted, while the other record may have linked records. It is unlikely that the last copy of the duplicate will have many linked records.
Steps
1. Identify duplicated rows
2. Search on the duplicate to see if linked rows exist.
3. If linked rows exist for more than one duplicate, then update the links to point to only one of the duplicates
4. Delete the other row.
5. Check for orphans.
I'd try this on a backup copy of your database to check and tune the code before doing it live.
Can you post the database schema so I can have a crack at writing the SQL.
Regards
David
d3lph1:
As I understand the question, this wont do what is wanted as the duplicate records have different ID's.
My guess is that you need to delete the duplicate iwth the HIGHEST ID as this is the last record inserted, while the other record may have linked records. It is unlikely that the last copy of the duplicate will have many linked records.
Steps
1. Identify duplicated rows
2. Search on the duplicate to see if linked rows exist.
3. If linked rows exist for more than one duplicate, then update the links to point to only one of the duplicates
4. Delete the other row.
5. Check for orphans.
I'd try this on a backup copy of your database to check and tune the code before doing it live.
Can you post the database schema so I can have a crack at writing the SQL.
Regards
David
ASKER
Hi David,
Thank you so very much for offering your assistance, I was able to eliminate the dups, last night, late.
Thanks again David......
daffeyduck
Thank you so very much for offering your assistance, I was able to eliminate the dups, last night, late.
Thanks again David......
daffeyduck
ASKER
Thank you very much, it worked great....
I really appreciate your assistance...
daffeyduck
I really appreciate your assistance...
daffeyduck
ASKER