Solved

SQL Command to delete all duplicates except 1

Posted on 2007-11-27
Last Modified: 2008-02-01
As you can see from the command below, I want to delete all duplicates but keep one row for each email address.

E.g. if I have 3 duplicates, I only want to delete 2, etc.

NB: Please ignore the declare; it's not relevant, and I'm unable to edit the code snippet.
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
	delete FROM 
	[user]
	where email in 
	(
		SELECT email FROM [user]
		GROUP BY email
		HAVING (COUNT(email) > 1 )     
	)

Question by:paulCardiff
10 Comments
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 20361337
Which of the x rows do you want to keep? For example, this keeps the row with the lowest somekeyfield for each email:

delete u
FROM [user] u
where exists ( SELECT NULL FROM [user] i
            WHERE i.email = u.email
              AND i.somekeyfield < u.somekeyfield 
            )
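
A cautious way to try a delete like the one above is to run the same predicate as a SELECT first and look at what comes back. The sketch below is not from the thread; somekeyfield is still the placeholder used in the answer and has to be replaced with a real unique column (the identity primary key, for instance).

```sql
-- Preview only: lists the rows the EXISTS-based delete would remove.
-- For each email, the row with the lowest somekeyfield is NOT listed,
-- because no row with a smaller key exists for it.
SELECT u.*
FROM [user] u
WHERE EXISTS ( SELECT NULL
               FROM [user] i
               WHERE i.email = u.email
                 AND i.somekeyfield < u.somekeyfield )
ORDER BY u.email, u.somekeyfield;
```

If the listed rows are the ones that should go, swapping the SELECT u.* for DELETE u (and dropping the ORDER BY) gives the delete itself.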

 
LVL 25

Expert Comment

by:imitchie
ID: 20361424
Try this pattern:

delete FROM [user]
where email in
(
SELECT email FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
AND NOT [ID] IN
(
SELECT MAX(ID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
 

Author Comment

by:paulCardiff
ID: 20361431
Sorry, let me explain my problem in a bit more depth. I'm importing rows and using a source field to identify the new rows, so I want to keep any existing rows. I thought I could do this with the following, but I found that it just deletes all the duplicate new rows. What I want to say is:

1) Delete all new rows (e.g. @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d') if an existing email address is present
2) If no matching email address is already present, delete all duplicate new rows except for one, and that one can be any one



declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
	SELECT * FROM 
	[user]
	where source=@source and email in 
	(
		SELECT email FROM [user]
		GROUP BY email
		HAVING (COUNT(email) > 1 )     
	)


 

Author Comment

by:paulCardiff
ID: 20361451
BTW: "SELECT *" is meant to be DELETE.
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 20361480

delete u
FROM [user] u
WHERE u.source=@source 
AND exists ( SELECT NULL FROM [user] i
            WHERE i.email = u.email
              AND i.somekeyfield < u.somekeyfield 
              AND i.source = @source
            )

 

Author Comment

by:paulCardiff
ID: 20361548
Thanks for that, and please feel free to correct me, but using some sample data I've got, e.g.

USERID   EMAIL     SOURCE
1        a@a.com   dd59f087-cd6f-412b-baef-5226c7069c0d
2        a@a.com   4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d
3        a@a.com   4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d

This command will only delete UserID 3, whereas in this instance I need both 2 and 3 deleted. However, if 1 wasn't present, this would be fine.
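
To reproduce this behaviour, the three sample rows can be loaded into a scratch table and the suggested delete run against it. This is a minimal sketch under assumptions: #user is a hypothetical temp table standing in for [user], it has only the three columns shown in the sample, and UserID is used in place of the somekeyfield placeholder.

```sql
-- Hypothetical scratch table; the real [user] table will have more columns.
CREATE TABLE #user
(
    UserID int IDENTITY(1,1) PRIMARY KEY,
    email  varchar(100) NOT NULL,
    source uniqueidentifier NOT NULL
);

-- Row 1 is the existing row, rows 2 and 3 come from the new import.
INSERT INTO #user (email, source) VALUES ('a@a.com', 'dd59f087-cd6f-412b-baef-5226c7069c0d');
INSERT INTO #user (email, source) VALUES ('a@a.com', '4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d');
INSERT INTO #user (email, source) VALUES ('a@a.com', '4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d');

DECLARE @source uniqueidentifier;
SET @source = '4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d';

-- The delete as suggested: duplicates are only compared within the new import.
DELETE u
FROM #user u
WHERE u.source = @source
  AND EXISTS ( SELECT NULL
               FROM #user i
               WHERE i.email = u.email
                 AND i.UserID < u.UserID
                 AND i.source = @source );

-- UserIDs 1 and 2 remain; only UserID 3 was deleted.
SELECT UserID, email, source FROM #user ORDER BY UserID;

DROP TABLE #user;
```

Row 2 survives because the inner query only looks at rows with the same source, so the existing row 1 never counts as a match for it.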
 

Author Comment

by:paulCardiff
ID: 20361841
As the primary key has identity = true, one possibility is to say something like:

Delete all duplicates except for the lowest UserID

Is this possible, and if so, can anyone suggest the appropriate syntax for it please?
 
LVL 23

Expert Comment

by:Racim BOUDJAKDJI
ID: 20361941
<<Is this possible, and if so, can anyone suggest the appropriate syntax for it please?>>

delete from yourtable where USERID not in (select min(USERID) from yourtable group by EMAIL)
 
LVL 25

Expert Comment

by:imitchie
ID: 20361942
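declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
-- delete only rows from the new import (UserID is the key column from the sample data)
DELETE u FROM [user] u
where u.source=@source AND
(
-- a later new row with the same email exists, so this one is a duplicate
u.UserID < (SELECT MAX(i.UserID) FROM [user] i WHERE i.email = u.email AND i.source = @source)
-- or the email already belongs to an existing row (assumes existing rows have a non-NULL source)
OR EXISTS
(SELECT NULL FROM [user] i WHERE i.email = u.email AND i.source <> @source)
)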
 
LVL 25

Accepted Solution

by:
imitchie earned 1500 total points
ID: 20361964
Let me rephrase that. It should now:
1. Delete only where source = @source, and never delete existing data
2. Handle multiple emails (2 x bob, 4 x fred, 6 x peter within a single @source)
3. Remove ALL @source duplicates if an existing email is present
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
DELETE FROM [user]
where source=@source and email in 
(
SELECT email FROM [user]
GROUP BY email
HAVING (COUNT(email) > 1 )     
)
AND NOT [UserID] IN
(
SELECT MIN(UserID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
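
One thing worth noting about the statement above: it keeps the row with the lowest UserID per email, so it protects existing rows only as long as they were inserted before the import and therefore have lower identity values. Where that ordering cannot be relied on, a ROW_NUMBER() variant (SQL Server 2005 and later) can rank existing rows first explicitly. The sketch below is an alternative, not the accepted answer; it assumes the same [user] table with UserID, email and source columns.

```sql
DECLARE @source uniqueidentifier;
SET @source = '4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d';

-- Number the rows within each email: existing rows (any other source) first,
-- then rows from the new import, each group ordered by UserID.
WITH ranked AS
(
    SELECT UserID,
           source,
           ROW_NUMBER() OVER (
               PARTITION BY email
               ORDER BY CASE WHEN source = @source THEN 1 ELSE 0 END,
                        UserID
           ) AS rn
    FROM [user]
)
-- Deleting through the CTE removes the underlying [user] rows.
-- Only imported rows are eligible, and the first row per email is always kept.
DELETE FROM ranked
WHERE source = @source
  AND rn > 1;
```

If an existing row holds the email, it takes rn = 1 and every imported row for that email is deleted. If the email only occurs in the import, the imported row with the lowest UserID takes rn = 1 and is kept.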
