Solved

Delete duplicates from table and leave one record with the latest date_modified

Posted on 2006-06-16
Last Modified: 2008-02-01
I have a table with lots of duplicates that I need to delete based on [Record Id#]. I currently use the code below for this.
The table has several columns. My previous procedure appended new records instead of updating existing ones, which created the duplicates. There is a [Date Modified] column, and I need to delete all duplicates so that [Record Id#] is unique, keeping only the record with the latest [Date Modified]. For example, if 50 records share the same [Record Id#] but have different [Date Modified] values, I need to keep only the one with the latest [Date Modified].
I would really appreciate your advice.

Code that deletes duplicates without checking for the latest [Date Modified]:

DECLARE @id INT
DECLARE @numDups INT

-- One cursor row per [Record Id#] that occurs more than once
DECLARE dupsCsr CURSOR READ_ONLY FOR
SELECT [Record Id#], COUNT(*) AS numDups
FROM [infoteam].[Invoice_Recon_Detail_Original]
GROUP BY [Record Id#]
HAVING COUNT(*) > 1

OPEN dupsCsr
FETCH NEXT FROM dupsCsr INTO @id, @numDups
WHILE @@FETCH_STATUS = 0
BEGIN
     -- Delete all but one row for this id; which row survives is arbitrary
     SET @numDups = @numDups - 1
     SET ROWCOUNT @numDups
     DELETE FROM [infoteam].[Invoice_Recon_Detail_Original]
     WHERE [Record Id#] = @id
     FETCH NEXT FROM dupsCsr INTO @id, @numDups
END
CLOSE dupsCsr
DEALLOCATE dupsCsr
-- Restore the default (no row limit)
SET ROWCOUNT 0
Question by:InfoTeam
2 Comments
 
Expert Comment

by:asduth
ID: 16921797
If there is only one record with the maximum timestamp per [Record Id#], you can do the whole thing in a single DELETE statement rather than using a cursor.
The basic idea is to write a subquery that returns the latest [Date Modified] for each [Record Id#], and then delete from the main table every row that does not join to this subquery:


DELETE [all]
FROM [infoteam].[Invoice_Recon_Detail_Original] AS [all]
LEFT JOIN (
      -- latest [Date Modified] per [Record Id#]
      SELECT [Record Id#], MAX([Date Modified]) AS [Date Modified]
      FROM [infoteam].[Invoice_Recon_Detail_Original]
      GROUP BY [Record Id#]
) AS [latest]
ON [all].[Record Id#] = [latest].[Record Id#] AND [all].[Date Modified] = [latest].[Date Modified]
WHERE [latest].[Record Id#] IS NULL
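
Before running a delete like this, it may be worth previewing the affected rows. The following is only a sketch: it reuses the same join as a SELECT, so it should list the rows that would be removed without changing anything. It assumes the table and column names from the question. Note that rows with a NULL [Date Modified] never match the join, so they would show up here as candidates for deletion.

```sql
-- Preview only: lists rows that are NOT the latest for their [Record Id#].
-- Nothing is modified by this query.
SELECT [all].[Record Id#], [all].[Date Modified]
FROM [infoteam].[Invoice_Recon_Detail_Original] AS [all]
LEFT JOIN (
      -- latest [Date Modified] per [Record Id#]
      SELECT [Record Id#], MAX([Date Modified]) AS [Date Modified]
      FROM [infoteam].[Invoice_Recon_Detail_Original]
      GROUP BY [Record Id#]
) AS [latest]
ON [all].[Record Id#] = [latest].[Record Id#]
   AND [all].[Date Modified] = [latest].[Date Modified]
WHERE [latest].[Record Id#] IS NULL
ORDER BY [all].[Record Id#], [all].[Date Modified]
```

If the row count looks right, the same FROM / JOIN / WHERE can then be used in the DELETE.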


Accepted Solution

by: Lowfatspread (earned 2000 total points)
ID: 16921805
Only this one statement is needed:

DELETE t
FROM [infoteam].[Invoice_Recon_Detail_Original] AS t
WHERE EXISTS (SELECT x.[Record Id#]
              FROM [infoteam].[Invoice_Recon_Detail_Original] AS x
              WHERE x.[Record Id#] = t.[Record Id#]
                AND x.[Date Modified] > t.[Date Modified])


That is, delete a row if another row with the same [Record Id#] exists with a later [Date Modified].
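
One caveat with comparing dates this way: if two or more rows for the same [Record Id#] share the same latest [Date Modified], none of them has a strictly later sibling, so all of them are kept and the id is still duplicated. A possible alternative, sketched here on the assumption that the server is SQL Server 2005 or later (where ROW_NUMBER() and common table expressions are available), is to number the rows per id and delete everything except the first. The names ranked and rn are made up for this illustration.

```sql
-- Sketch only; assumes SQL Server 2005 or later.
-- If this is not the first statement in the batch, the preceding
-- statement must be terminated with a semicolon.
WITH ranked AS (
      SELECT ROW_NUMBER() OVER (
                 PARTITION BY [Record Id#]        -- restart numbering for each id
                 ORDER BY [Date Modified] DESC    -- rn = 1 is the latest row
             ) AS rn
      FROM [infoteam].[Invoice_Recon_Detail_Original]
)
-- Deleting through the CTE removes the underlying table rows.
DELETE FROM ranked
WHERE rn > 1
```

When several rows tie on the latest date, which one gets rn = 1 is arbitrary unless more columns are added to the ORDER BY as a tie-breaker.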

