Link to home
Start Free TrialLog in
Avatar of coperations07
coperations07Flag for United States of America

asked on

Need to remove duplicated records that do have unique id

I put in an archive function a few days ago that had a join problem. I ended up with some duplicated records that I need to purge out. There is a unique ID field and the CreateDt is different for each record that is duplicated. There should be just one record for each sorter,wave,order,chute combo. The attached file shows 38 rows of sample data. There should only be 3 rows. What is the best way to save the earliest record and delete the others that were duplicated?
OrderData.xlsx
Avatar of Jan Louwerens
Jan Louwerens
Flag of United States of America image

This should give you the IDs you want to delete.

SELECT ID FROM
(
   SELECT ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM <table_name>
WHERE
   Idx > 1

Open in new window

SOLUTION
Avatar of Dustin Saunders
Dustin Saunders
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of coperations07

ASKER

Thanks guys! I will give this a try.
I've tried

    SELECT  ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM db_sort00.dbo.tbl_arch_seq_header AS Idx
WHERE
   Idx > 1

It is telling me Idx is an invalid column. I've tried a few different things to tweak it but haven't got it to work for me yet.
You have 2 things named "AS Idx".

SELECT  ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM db_sort00.dbo.tbl_arch_seq_header
WHERE
   Idx > 1

Open in new window


Looking back, it looks like the typo originated in my previous comment. Sorry about that.
I've tried again today with the edit, but Idx is still not recognized as a column.

    SELECT  ID, (ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID)) AS Idx 
    FROM db_sort00.dbo.tbl_arch_seq_header
	WHERE Idx > 1

Open in new window


I was able to get this to return results. I haven't deleted anything yet though:

;WITH CTE
            AS (
    SELECT ID,
                Row_number()OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY (SELECT 1) ) AS Rn
        FROM   db_sort00.dbo.tbl_arch_seq_header)
       SELECT *
    FROM   CTE
    WHERE  Rn > 1

Open in new window

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I accepted my own solution because it is what ended up working for me.