Solved

# Remove the Duplicate Data

Posted on 2005-05-11
Medium Priority
290 Views
Dear Friend

Can any one please tell me how can i remove the duplicates data and i want only 1 unique records .

i have the table having these structure

Create table xyz(srno numeric,sname varchar(12),age numeric)

1      manish      26
1      manish      26
1      manish      26
2      Rohan      27
2      Rohan      27
3      Riyaz      28
3      Riyaz      28
3      Riyaz      28
4      shabbir      29
4      shabbir      29
4      shabbir      29
4      shabbir      29

Thank & Regards
Manish Kaushik
0
Question by:manishkaushik

LVL 1

Accepted Solution

worldserga earned 100 total points
ID: 13983830
Hello,

1) alter table xyz

2) DELETE FROM a
FROM  xyz a JOIN (SELECT     srno, sname, age, MAX(seq_num) max_seq_num FROM xyz GROUP BY srno, sname, age
HAVING COUNT(*) > 1) b ON a.srno = b.srno AND a.sname = b.sname AND a.age = b.age AND a.seq_num < b.max_seq_num

3) alter table xyz
drop column seq_num

This will definitely solve your problem.

Regards.
0

LVL 8

Assisted Solution

Julianva earned 100 total points
ID: 13984016
run this scripts - Please test it first before you run it on live db

Declare @srno int,
@sname varchar (50),
@cnt int,
@age numeric

Declare getallrecords cursor local static For
Select count (1), srno, sname, age
from xyz (nolock)
group by srno, sname,age having count(1)>1

Open getallrecords

Fetch next from getallrecords into @cnt,@srno,@sname,@age
--Cursor to check with all other records
While @@fetch_status=0
Begin
Set @cnt= @cnt-1
Set rowcount @cnt

-- Deleting the duplicate records. Observe that all fields are mentioned at the where condition
Delete from employee where srno=@srno and sname=@sname
and age=@age
Set rowcount 0

Fetch next from getallrecords into @cnt,@srno,@sname,@age

End

Close getallrecords
Deallocate getallrecords
0

LVL 17

Expert Comment

ID: 13984396
either of the above methods will work OK, a simpler solution may be to simply extract out the unique records, and reinsert them - it really depends on how many duplicates you have. If you only have a few, the most efficient way will be Julianva's, but if you have a lot of dupliactes, all the deletes will be time-consuming.

select into #temp_xyz distinct srno, sname, age from xyz
truncate table xyz
insert into xyz select * from #temp_xyz
0

LVL 1

Expert Comment

ID: 13984540
Hi everybody,

I cannot agree with BillAn1.

A SET based solution is almost always faster and efficient than an iterative or loop based solution.
0

LVL 17

Expert Comment

ID: 13986095
worldserga, I appologise, I did not mean to point to Julianva's answer over yours, I should have said
"...most efficient way will be Julianva's or Worldserga's" (the merits / demerits of the 2 approaches again depends on the data. It is often found that an approach like Julianva's is preferred, since it is ittertative, and does not have such an impact on other processes in a live environment. Also, the overhead of adding an extra column (an identity) and then dropping it again, calculating the max etc can be quite resource consuming, so it will still depend heavily on the data which approach is more efficient )

What I really wanted to do was to offer an alternative approach from deleting the offending rows at all, no matter how you did it, and instead to reload the table with the correct rows, which is often the more efficient solution - again depending on the data.
0

Author Comment

ID: 14008079
Dear all Friends

Thank you to all of your for spending time for me.

i got various concepts of doing the task in my one of the huge ( live ) database having 22457892 records.

Thank u so much.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have a large data set and a SSIS package. How can I load this file in multi threading?
In the first part of this tutorial we will cover the prerequisites for installing SQL Server vNext on Linux.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
###### Suggested Courses
Course of the Month16 days, 6 hours left to enroll