InfoTeam
asked on
Delete duplicates from table and leave one record with the latest date_modified
I have a table with lots of duplicates. I need to delete them based on [record id#]. I use the code below for this.
I have a few columns in the table. My previous procedure instead of updating records, appended new records to the table creating duplicates. I have a column [Date Modified] and I need to delete all duplicates (record id should be unique) but leave one record with the latest [Date Modified] (i.e. I have 50 records with the same record id but different [Date Modified]. I need only one record for this Record ID but it should have the latest [Date Modified].
I would really appreciate your advice.
Code to delete records without checking for the latest [Date Modified]
DECLARE @id INT
DECLARE dupsCsr CURSOR READ_ONLY FOR
SELECT [Record Id#], COUNT(*) AS numDups
FROM [infoteam].[Invoice_Recon_ Detail_Ori ginal]
GROUP BY [Record Id#]
HAVING COUNT(*) > 1
DECLARE @numDups INT
OPEN dupsCsr
FETCH NEXT FROM dupsCsr INTO @id, @numDups
WHILE @@FETCH_STATUS = 0
BEGIN
SET @numDups = @numDups - 1
SET ROWCOUNT @numDups
DELETE FROM [infoteam].[Invoice_Recon_ Detail_Ori ginal]
WHERE [Record Id#] = @id
FETCH NEXT FROM dupsCsr INTO @id, @numDups
END
CLOSE dupsCsr
DEALLOCATE dupsCsr
SET ROWCOUNT 0
I have a few columns in the table. My previous procedure instead of updating records, appended new records to the table creating duplicates. I have a column [Date Modified] and I need to delete all duplicates (record id should be unique) but leave one record with the latest [Date Modified] (i.e. I have 50 records with the same record id but different [Date Modified]. I need only one record for this Record ID but it should have the latest [Date Modified].
I would really appreciate your advice.
Code to delete records without checking for the latest [Date Modified]
DECLARE @id INT
DECLARE dupsCsr CURSOR READ_ONLY FOR
SELECT [Record Id#], COUNT(*) AS numDups
FROM [infoteam].[Invoice_Recon_
GROUP BY [Record Id#]
HAVING COUNT(*) > 1
DECLARE @numDups INT
OPEN dupsCsr
FETCH NEXT FROM dupsCsr INTO @id, @numDups
WHILE @@FETCH_STATUS = 0
BEGIN
SET @numDups = @numDups - 1
SET ROWCOUNT @numDups
DELETE FROM [infoteam].[Invoice_Recon_
WHERE [Record Id#] = @id
FETCH NEXT FROM dupsCsr INTO @id, @numDups
END
CLOSE dupsCsr
DEALLOCATE dupsCsr
SET ROWCOUNT 0
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
The basic idea is to do a sub-query which returns the latest records, and then delete from the main table all rows which do not join to this sub-query, hence :
delete [all]
from [infoteam].[Invoice_Recon_
left join (
select [Record Id#], max([Date Modified]) [Date Modified]
from [infoteam].[Invoice_Recon_
group by [Record Id#]
) [latest]
on [all].[Record Id#] = [latest].[Record Id#] and [all].[Date Modified] = [latest].[Date Modified]
where [latest].[Record ID#] is NULL