Solved

Remove all duplicates except the first in a SQL data set

Posted on 2014-08-10
Hello

I have a SQL script that returns a set of data, and I need to remove the duplicates in a certain column.

When there is a duplicate value in the MinQty column, I need to remove the duplicate records and keep only the FIRST of them.

Following is a Screenshot of some Dummy Data.

Here are the results I need.

Many thanks
Question by:p-plater

LVL 10

Assisted Solution

HuaMinChen earned 500 total points
ID: 40252502
Hi,
Assume tab1 is the table name, try

``````
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (ORDER BY code) AS rn
    FROM tab1
)
SELECT code, amount, minqty
FROM cte
WHERE rn = 1
``````
LVL 48

Expert Comment

ID: 40252527
Test by selecting first; adjust the number of deletions per batch if the table is large.

``````
-- Preview: MinQty values that occur more than once, with the lowest Code for each
SELECT
    minQty
  , MIN(Code) AS min_code
FROM Sample
GROUP BY
    minQty
HAVING COUNT(*) > 1
;

-- Delete every duplicate except the row with the lowest Code
DELETE TOP (10000) FROM S --<< adjust to suit
FROM Sample S
INNER JOIN (
    SELECT
        minQty
      , MIN(Code) AS min_code
    FROM Sample
    GROUP BY
        minQty
    HAVING COUNT(*) > 1
) AS dups
    ON S.minQty = dups.minQty
   AND S.Code > dups.min_code -->> IMPORTANT! use just greater than
;
``````
see: http://sqlfiddle.com/#!3/33ac9/3
Accepted Solution

p-plater earned 0 total points
ID: 40252543
Edited HuaMinChen's Solution to Work Correctly

``````
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (PARTITION BY minqty ORDER BY amount) AS rn
    FROM tab1
)
SELECT code, amount, minqty
FROM cte
WHERE rn = 1
``````
LVL 48

Expert Comment

ID: 40252566
apparently you didn't try mine (which was complete)
LVL 69

Expert Comment

ID: 40253234
I don't object to the closing request.  I just want to make sure the solution is correct.

You said you wanted the "first" of duplicates kept.
Do you want the minimum "Amount" value when there are duplicate "MinQty" values?
Or do you want the lowest-numbered "Code" value?

As written, the code gives you the smallest Amount value, but you may want the lowest Code value instead.
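
To make the difference concrete, here is a minimal sketch against a few made-up rows. The table variable `@tab1` and its values are assumptions, since the real data was only in the screenshot. The two queries differ only in the ORDER BY inside the window.

```sql
-- Hypothetical sample rows (not the poster's real data).
DECLARE @tab1 TABLE (code int, amount decimal(10,2), minqty int);
INSERT INTO @tab1 (code, amount, minqty) VALUES
    (1, 9.00, 10),
    (2, 5.00, 10),
    (3, 7.00, 20);

-- Variant A: keep the smallest Amount per MinQty (keeps codes 2 and 3).
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (PARTITION BY minqty ORDER BY amount) AS rn
    FROM @tab1
)
SELECT code, amount, minqty FROM cte WHERE rn = 1;

-- Variant B: keep the lowest Code per MinQty (keeps codes 1 and 3).
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (PARTITION BY minqty ORDER BY code) AS rn
    FROM @tab1
)
SELECT code, amount, minqty FROM cte WHERE rn = 1;
```

Which variant is right depends on what "first" is supposed to mean for the duplicates.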
Author Comment

ID: 40254623
I see what happened now. When I made the sample data in Excel, I dragged the code down and it incremented by 1 on each row. :(
(The code was supposed to be the same for each record.)

I need the smallest Amount PER Code and MinQty.

Sorry for the Confusion

Thanks Everyone
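
A possible sketch of that stated requirement, assuming the Code really does repeat across rows: partition by both columns and order by Amount. The table variable `@tab1`, the varchar codes, and the sample values are hypothetical.

```sql
-- Hypothetical sample rows in which the code repeats.
DECLARE @tab1 TABLE (code varchar(10), amount decimal(10,2), minqty int);
INSERT INTO @tab1 (code, amount, minqty) VALUES
    ('A', 9.00, 10),
    ('A', 5.00, 10),
    ('A', 7.00, 20),
    ('B', 4.00, 10),
    ('B', 6.00, 10);

-- Number the rows within each (code, minqty) pair, cheapest first,
-- then keep only the first row of each pair.
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (PARTITION BY code, minqty ORDER BY amount) AS rn
    FROM @tab1
)
SELECT code, amount, minqty
FROM cte
WHERE rn = 1;
-- Keeps ('A', 5.00, 10), ('A', 7.00, 20) and ('B', 4.00, 10).
```

If every row carries the same Code, partitioning by MinQty alone gives the same result.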
Author Closing Comment

ID: 40262637
Had to edit the solution to make it work correctly.
LVL 48

Expert Comment

ID: 40262714
Some observations:

If the edits arise from information not made available to the expert, should the expert be graded down?

Needing to edit the proposed solution isn't reason enough to downgrade the result, particularly as the technique of using row_number() with "rn=1" is the essence of that answer.

Is that edit "substantial"?
``````
-- Original
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (ORDER BY code) AS rn
    FROM tab1
)
SELECT code, amount, minqty
FROM cte
WHERE rn = 1

-- Edited
WITH cte AS (
    SELECT code, amount, minqty,
           ROW_NUMBER() OVER (PARTITION BY minqty ORDER BY amount) AS rn
    FROM tab1
)
SELECT code, amount, minqty
FROM cte
WHERE rn = 1
``````
At least part of that change (PARTITION BY MINQTY) is due to the mistake in the supplied data, isn't it?
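
For reference, a small sketch of what each window in the two queries above actually numbers. The table variable `@tab1` and its rows are made up for illustration; the column aliases `rn_original` and `rn_edited` are likewise hypothetical.

```sql
-- Hypothetical sample rows (not the poster's real data).
DECLARE @tab1 TABLE (code int, amount decimal(10,2), minqty int);
INSERT INTO @tab1 (code, amount, minqty) VALUES
    (1, 9.00, 10),
    (2, 5.00, 10),
    (3, 7.00, 20);

-- Show both row numberings side by side instead of filtering on them.
SELECT code, amount, minqty,
       ROW_NUMBER() OVER (ORDER BY code) AS rn_original,
       ROW_NUMBER() OVER (PARTITION BY minqty ORDER BY amount) AS rn_edited
FROM @tab1
ORDER BY code;
-- code 1: rn_original = 1, rn_edited = 2
-- code 2: rn_original = 2, rn_edited = 1
-- code 3: rn_original = 3, rn_edited = 1
```

Without PARTITION BY the numbering runs across the whole result set, so filtering on rn = 1 keeps exactly one row in total. With PARTITION BY the numbering restarts for each MinQty value, so one row per MinQty survives.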
