asked on

Need to remove duplicated records that do have unique id

I put in an archive function a few days ago that had a join problem. I ended up with some duplicated records that I need to purge out. There is a unique ID field and the CreateDt is different for each record that is duplicated. There should be just one record for each sorter,wave,order,chute combo. The attached file shows 38 rows of sample data. There should only be 3 rows. What is the best way to save the earliest record and delete the others that were duplicated?
OrderData.xlsx

Jan Louwerens

This should give you the IDs you want to delete.

SELECT ID FROM
(
   SELECT ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM <table_name>
WHERE
   Idx > 1

Open in new window

SOLUTION

Dustin Saunders

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

coperations07

ASKER

Thanks guys! I will give this a try.

coperations07

ASKER

I've tried

SELECT ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM db_sort00.dbo.tbl_arch_seq_header AS Idx
WHERE
Idx > 1

It is telling me Idx is an invalid column. I've tried a few different things to tweak it but haven't got it to work for me yet.

Jan Louwerens

You have 2 things named "AS Idx".

SELECT  ID, ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID) AS Idx FROM db_sort00.dbo.tbl_arch_seq_header
WHERE
   Idx > 1

Open in new window

Looking back, it looks like the typo originated in my previous comment. Sorry about that.

coperations07

ASKER

I've tried again today with the edit, but Idx is still not recognized as a column.

    SELECT  ID, (ROW_NUMBER() OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY ID)) AS Idx 
    FROM db_sort00.dbo.tbl_arch_seq_header
	WHERE Idx > 1

Open in new window

I was able to get this to return results. I haven't deleted anything yet though:

;WITH CTE
            AS (
    SELECT ID,
                Row_number()OVER (PARTITION BY SorterID, WaveID, OrderID, chute ORDER BY (SELECT 1) ) AS Rn
        FROM   db_sort00.dbo.tbl_arch_seq_header)
       SELECT *
    FROM   CTE
    WHERE  Rn > 1

Open in new window

SOLUTION

Jan Louwerens

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

ASKER CERTIFIED SOLUTION

coperations07

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.