Solved

SQL query to detect and delete duplicate records

Posted on 2004-08-05
Last Modified: 2012-08-13
I need a SQL query to detect and delete duplicate records, that is, records where firstname and lastname are identical (values that differ only in case still count as duplicates). Detection and deletion can be separate steps.
Thanks
Question by:MichaelMullin

Accepted Solution

by:
Lowfatspread earned 125 total points
ID: 11730240
You need the primary key of the row as well:

SELECT 'matched to', T.pk, D.*
FROM [Table] AS D
INNER JOIN [Table] AS T
    ON D.pk < T.pk
   AND D.firstname = T.firstname
   AND D.lastname = T.lastname

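You also need a criterion to decide which row to delete. The statement below deletes every row that has a duplicate with a higher pk, so the row with the highest pk in each set is kept.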

DELETE FROM [Table]
WHERE EXISTS (SELECT T.pk
              FROM [Table] AS T
              WHERE T.pk > [Table].pk
                AND T.firstname = [Table].firstname
                AND T.lastname = [Table].lastname)


I hope this isn't homework?

Expert Comment

by:BillAn1
ID: 11730281
If there is no primary key, try something like this:

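-- Copy one row per distinct (firstname, lastname) pair into a temp table.
SELECT DISTINCT firstname, lastname
INTO #temp_table
FROM source_table

-- Empty the source table, then reload the de-duplicated rows.
-- Note: any columns other than firstname and lastname are not preserved.
DELETE FROM source_table

INSERT INTO source_table (firstname, lastname)
SELECT firstname, lastname FROM #temp_table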


Expert Comment

by:Lowfatspread
ID: 11730356
Do you have a case-sensitivity problem?

If so, convert both names to upper case before doing the comparison.
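
A minimal sketch of that test, reusing the pk, firstname and lastname column names from the accepted solution. The table name yourTable and the matched_pk alias are placeholders, not something taken from the question:

```sql
-- Detection only: list each row that has a case-insensitive duplicate
-- with a higher pk (yourTable is a placeholder table name).
SELECT D.pk, D.firstname, D.lastname, T.pk AS matched_pk
FROM yourTable AS D
INNER JOIN yourTable AS T
    ON D.pk < T.pk
   AND UPPER(D.firstname) = UPPER(T.firstname)
   AND UPPER(D.lastname) = UPPER(T.lastname)
```

If the columns use a case-insensitive collation, which is a common SQL Server default, the plain = comparison already ignores case, and UPPER() only matters under a case-sensitive collation.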


Expert Comment

by:arbert
ID: 11730487
Agree with Lowfatspread: if you have something you can use as a key, you're better off using that method. If not, you need a method like the one BillAn1 suggested, or a cursor. (With BillAn1's method I would truncate the table instead of deleting the old rows; also, if you have a lot of data, this can be very slow.)
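
A rough sketch of the truncate variant, assuming source_table holds only the firstname and lastname columns and is not referenced by a foreign key (SQL Server does not allow TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint):

```sql
-- Keep one row per distinct name pair in a temp table.
SELECT DISTINCT firstname, lastname
INTO #temp_table
FROM source_table

-- TRUNCATE removes all rows without logging each row delete individually.
TRUNCATE TABLE source_table

-- Reload the de-duplicated rows, then drop the temp table.
INSERT INTO source_table (firstname, lastname)
SELECT firstname, lastname FROM #temp_table

DROP TABLE #temp_table
```

It is worth testing this on a copy of the data first, since the source table is empty between the TRUNCATE and the INSERT.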

Expert Comment

by:ScottPletcher
ID: 11730947
Here is a sample that uses a cursor but does not require a separate table or a dump/reload:



DECLARE dupsCsr CURSOR READ_ONLY FOR
SELECT [firstName], [lastName], COUNT(*) AS numDups
FROM yourTable
GROUP BY [firstName], [lastName]
HAVING COUNT(*) > 1
DECLARE @firstName VARCHAR(30) --Change to match datatype on your table
DECLARE @lastName VARCHAR(30) --Change to match datatype on your table
DECLARE @numDups INT

OPEN dupsCsr
FETCH NEXT FROM dupsCsr INTO @firstName, @lastName, @numDups
WHILE @@FETCH_STATUS = 0
BEGIN
      SET @numDups = @numDups - 1 --delete all but 1 of the duplicates
      SET ROWCOUNT @numDups
      DELETE FROM yourTable
      WHERE [firstName] = @firstName AND [lastName] = @lastName
      FETCH NEXT FROM dupsCsr INTO @firstName, @lastName, @numDups
END --WHILE
CLOSE dupsCsr
DEALLOCATE dupsCsr

SET ROWCOUNT 0 --restore default