Solved

SQL Command to delete all duplicates except 1

Posted on 2007-11-27
10
1,279 Views
Last Modified: 2008-02-01
As you can see from the cmd - i want to delete all duplicates however i want to keep at least one?

e.g. if i have 3 duplicate then i only want to delete 2 etc

NB: Please ignore the declare - its not relevant and i'm unable to edit code snippit
declare @source UniqueIdentifier

select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 

	delete FROM 

	[user]

	where email in 

	(

		SELECT email FROM [user]

		GROUP BY email

		HAVING (COUNT(email) > 1 )     

	)

Open in new window

0
Comment
Question by:paulCardiff
  • 4
  • 3
  • 2
  • +1
10 Comments
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 20361337
which of the x rows do you want to keep?

delete u
FROM [user] u
where exists ( SELECT NULL FROM [user] i
            WHERE i.email = u.email
              AND i.somekeyfield < u.somekeyfield 
            )

Open in new window

0
 
LVL 25

Expert Comment

by:imitchie
ID: 20361424
try this pattern

delete FROM [user]
where email in
(
SELECT email FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
AND NOT [ID] IN
(
SELECT MAX(ID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    
)
0
 

Author Comment

by:paulCardiff
ID: 20361431
Sorry let me explain by problem in a bit more depth  i'm using the following logic i.e. i'm importing rows and using a source field to identify new rows

so i'll want to keep any existing rows, i thought i could this with the following but i found that this was
just deleteing all duplicate new rows what i want to say is ....

1) Delete all new rows (e.g. @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d') if an existing email address is present
2) If no matching email address is already present then only delete all duplicates new rows except for one - and that one can be any one



declare @source UniqueIdentifier

select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 

	SELECT * FROM 

	[user]

	where source=@source and email in 

	(

		SELECT email FROM [user]

		GROUP BY email

		HAVING (COUNT(email) > 1 )     

	)

Open in new window

0
 

Author Comment

by:paulCardiff
ID: 20361451
btw: " SELECT * " is meant to be DELETE
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 20361480

delete u

FROM [user] u

WHERE u.source=@source 

AND exists ( SELECT NULL FROM [user] i

            WHERE i.email = u.email

              AND i.somekeyfield < u.somekeyfield 

              AND i.source = @source

            )

Open in new window

0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 

Author Comment

by:paulCardiff
ID: 20361548
Thanks for that, and please feel free to correct me, but using some sample data i've got e.g.

USERID                              EMAIL                             SOURCE
1                                        a@a.com                        dd59f087-cd6f-412b-baef-5226c7069c0d
2                                        a@a.com                        4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d
3                                        a@a.com                        4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d

This cmd will only delete UserId 3, whereas in this instance i need both 2 & 3 deleted - however if 1 wasn't present then this would be fine
0
 

Author Comment

by:paulCardiff
ID: 20361841
As the primary key has identify = true one possibility is if i said something like

Delete all duplicates except for the lowest userId

Is this possible and if so can any one suggest the appropiate syntax for it please?
0
 
LVL 23

Expert Comment

by:Racim BOUDJAKDJI
ID: 20361941
<<Is this possible and if so can any one suggest the appropiate syntax for it please?>>

delete from yourtable where USERID in (select min(USERID) from yourtable group by EMAIL)
0
 
LVL 25

Expert Comment

by:imitchie
ID: 20361942
declare @source UniqueIdentifier
select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 
DELETE FROM [user]
where source=@source AND
(
ID < (SELECT MAX(ID) FROM [user] GROUP BY email)
OR NOT EXISTS
(SELECT MAX(ID) FROM [user] GROUP BY email and source <> @source)
)
0
 
LVL 25

Accepted Solution

by:
imitchie earned 500 total points
ID: 20361964
let me rephrase that.
it should now say
1. delete only where the source = @source, never delete existing data
2. handles multiple emails (2xbob, 4xfred, 6xpeter within single @source)
3. ALL @source duplicates are removed if existing email is present
declare @source UniqueIdentifier

select @source='4fb2a2a4-db60-460b-bf5c-7ddf2872fe1d'
 

DELETE FROM [user]

where source=@source and email in 

(

SELECT email FROM [user]

GROUP BY email

HAVING (COUNT(email) > 1 )     

)

AND NOT [UserID] IN

(

SELECT MIN(UserID) FROM [user] GROUP BY email HAVING (COUNT(email) > 1 )    

)

Open in new window

0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Suggested Solutions

Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Viewers will learn how the fundamental information of how to create a table.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now