Remove the Duplicate Data

Posted on 2005-05-11
Last Modified: 2010-03-19
Dear Friend

Can any one please tell me how can i remove the duplicates data and i want only 1 unique records .

i have the table having these structure

Create table xyz(srno numeric,sname varchar(12),age numeric)

1      manish      26
1      manish      26
1      manish      26
2      Rohan      27
2      Rohan      27
3      Riyaz      28
3      Riyaz      28
3      Riyaz      28
4      shabbir      29
4      shabbir      29
4      shabbir      29
4      shabbir      29

Thank & Regards
Manish Kaushik
Question by:manishkaushik
    LVL 1

    Accepted Solution


    Follow these three steps:

    1) alter table xyz
         add seq_num int identity

    2) DELETE FROM a
        FROM  xyz a JOIN (SELECT     srno, sname, age, MAX(seq_num) max_seq_num FROM xyz GROUP BY srno, sname, age
        HAVING COUNT(*) > 1) b ON a.srno = b.srno AND a.sname = b.sname AND a.age = b.age AND a.seq_num < b.max_seq_num

    3) alter table xyz
         drop column seq_num

    This will definitely solve your problem.

    LVL 8

    Assisted Solution

    run this scripts - Please test it first before you run it on live db

    Declare @srno int,
            @sname varchar (50),
            @cnt int,
            @age numeric

    Declare getallrecords cursor local static For
     Select count (1), srno, sname, age
       from xyz (nolock)
       group by srno, sname,age having count(1)>1
    Open getallrecords

    Fetch next from getallrecords into @cnt,@srno,@sname,@age
    --Cursor to check with all other records
    While @@fetch_status=0
      Set @cnt= @cnt-1
      Set rowcount @cnt

      -- Deleting the duplicate records. Observe that all fields are mentioned at the where condition
      Delete from employee where srno=@srno and sname=@sname
      and age=@age
      Set rowcount 0

      Fetch next from getallrecords into @cnt,@srno,@sname,@age


    Close getallrecords
    Deallocate getallrecords
    LVL 17

    Expert Comment

    either of the above methods will work OK, a simpler solution may be to simply extract out the unique records, and reinsert them - it really depends on how many duplicates you have. If you only have a few, the most efficient way will be Julianva's, but if you have a lot of dupliactes, all the deletes will be time-consuming.
    instead, try the following :

    select into #temp_xyz distinct srno, sname, age from xyz
    truncate table xyz
    insert into xyz select * from #temp_xyz
    LVL 1

    Expert Comment

    Hi everybody,

    I cannot agree with BillAn1.

    A SET based solution is almost always faster and efficient than an iterative or loop based solution.
    LVL 17

    Expert Comment

    worldserga, I appologise, I did not mean to point to Julianva's answer over yours, I should have said
    "...most efficient way will be Julianva's or Worldserga's" (the merits / demerits of the 2 approaches again depends on the data. It is often found that an approach like Julianva's is preferred, since it is ittertative, and does not have such an impact on other processes in a live environment. Also, the overhead of adding an extra column (an identity) and then dropping it again, calculating the max etc can be quite resource consuming, so it will still depend heavily on the data which approach is more efficient )

    What I really wanted to do was to offer an alternative approach from deleting the offending rows at all, no matter how you did it, and instead to reload the table with the correct rows, which is often the more efficient solution - again depending on the data.

    Author Comment

    Dear all Friends

    Thank you to all of your for spending time for me.

    i got various concepts of doing the task in my one of the huge ( live ) database having 22457892 records.

    Thank u so much.

    Featured Post

    PRTG Network Monitor: Intuitive Network Monitoring

    Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

    Join & Write a Comment

    I recently came across an interesting Question In EE ( and was puzzled about how to achieve that using SSIS out of the box tasks, which was i…
    International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
    Via a live example, show how to shrink a transaction log file down to a reasonable size.
    Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

    728 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    20 Experts available now in Live!

    Get 1:1 Help Now