SQL Command to delete all duplicates except 1

As you can see from the cmd - i want to delete all duplicates however i want to keep at least one?

e.g. if i have 3 duplicate then i only want to delete 2 etc

NB: Please ignore the declare - its not relevant and i'm unable to edit code snippit
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
	delete FROM 
	[user]
	where email in 
	(
		SELECT email FROM [user]
		GROUP BY email
		HAVING (COUNT(email) > 1 )     
	)

Open in new window

paulCardiffAsked:
Who is Participating?
 
imitchieCommented:
let me rephrase that.
it should now say
1. delete only where the source = @source, never delete existing data
2. handles multiple emails (2xbob, 4xfred, 6xpeter within single @source)
3. ALL @source duplicates are removed if existing email is present
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
DELETE FROM [user]
where source=@source and email in 
(
SELECT email FROM [user]
GROUP BY email
HAVING (COUNT(email) > 1 )     
)
AND NOT [UserID] IN
(
SELECT MIN(UserID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)

Open in new window

0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
which of the x rows do you want to keep?

delete u
FROM [user] u
where exists ( SELECT NULL FROM [user] i
            WHERE i.email = u.email
              AND i.somekeyfield < u.somekeyfield 
            )

Open in new window

0
 
imitchieCommented:
try this pattern

delete FROM [user]
where email in
(
SELECT email FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
AND NOT [ID] IN
(
SELECT MAX(ID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
paulCardiffAuthor Commented:
Sorry let me explain by problem in a bit more depth  i'm using the following logic i.e. i'm importing rows and using a source field to identify new rows

so i'll want to keep any existing rows, i thought i could this with the following but i found that this was
just deleteing all duplicate new rows what i want to say is ....

1) Delete all new rows (e.g. @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d') if an existing email address is present
2) If no matching email address is already present then only delete all duplicates new rows except for one - and that one can be any one



declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
	SELECT * FROM 
	[user]
	where source=@source and email in 
	(
		SELECT email FROM [user]
		GROUP BY email
		HAVING (COUNT(email) > 1 )     
	)

Open in new window

0
 
paulCardiffAuthor Commented:
btw: " SELECT * " is meant to be DELETE
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:

delete u
FROM [user] u
WHERE u.source=@source 
AND exists ( SELECT NULL FROM [user] i
            WHERE i.email = u.email
              AND i.somekeyfield < u.somekeyfield 
              AND i.source = @source
            )

Open in new window

0
 
paulCardiffAuthor Commented:
Thanks for that, and please feel free to correct me, but using some sample data i've got e.g.

USERID                              EMAIL                             SOURCE
1                                        a@a.com                        dd59f087-cd6f-412b-baef-5226c7069c0d
2                                        a@a.com                        4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d
3                                        a@a.com                        4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d

This cmd will only delete UserId 3, whereas in this instance i need both 2 & 3 deleted - however if 1 wasn't present then this would be fine
0
 
paulCardiffAuthor Commented:
As the primary key has identify = true one possibility is if i said something like

Delete all duplicates except for the lowest userId

Is this possible and if so can any one suggest the appropiate syntax for it please?
0
 
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
<<Is this possible and if so can any one suggest the appropiate syntax for it please?>>

delete from yourtable where USERID in (select min(USERID) from yourtable group by EMAIL)
0
 
imitchieCommented:
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
DELETE FROM [user]
where source=@source AND
(
ID < (SELECT MAX(ID) FROM [user] GROUP BY email)
OR NOT EXISTS
(SELECT MAX(ID) FROM [user] GROUP BY email and source <> @source)
)
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.