We help IT Professionals succeed at work.

removing duplicate rows in the table

VIVEKANANDHAN_PERIASAMY
VIVEKANANDHAN_PERIASAMY used Ask the Experts™
on
How to delete duplicate rows in the table?
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2011

Commented:
can you try this using cte:

;with CTE
as
(
      select ID, ROW_NUMBER() over (partition by Val order by ID) as RowNum
      from @T
)
delete from CTE
where RowNum > 1
Top Expert 2011

Commented:
Or perhaps this:

DELETE FROM tableA
WHERE id IN
    (SELECT id
    FROM tableA
    GROUP BY id
    HAVING COUNT(*) > 1)
nope, if i dont have a primary key for id and if i insert two record,then how to delete it?

like eg:

1 vivek
2 sammy
1 vivek
1 vivek

now i want a result like

1 vivek
2 sammy

removing 2 latest entry made for vivek
Top Expert 2011

Commented:
You don't have to have a primary key.

Whatever ID you want to use that shows

1 vivek
2 sammy
1 vivek
1 vivek

is what you want to use.

In the example you showed, what is 1,2,1,1.

That seems like an ID to me.

So, use it in the second code I posted above.
assume these are value for ID ,not assume it as primary key or unique. Id column may have even duplicate value.In this case how to use?
Top Expert 2011

Commented:
Ok, let's try this. Replace OriginalTable with your table name.

If Replace ID if different from yours. You can make additional changes where necessary.

--Identify and save dup data into temp table
INSERT INTO #tempduplicate
SELECT * FROM OriginalTable
GROUP BY ID
HAVING COUNT(*) > 1
--Confirm number of dup rows
SELECT @@ROWCOUNT AS 'Number of Duplicate Rows'
--Delete dup from original table
DELETE FROM OriginalTable
FROM OriginalTable
INNER JOIN #tempduplicate
ON  OriginalTable
.ID = #tempduplicate.ID
--Insert the delete data back
INSERT INTO OriginalTable
SELECT * FROM #tempduplicate
--Check for dup data.
SELECT * FROM OriginalTable
GROUP BY ID
HAVING COUNT(*) > 1
--Check table
SELECT * FROM OriginalTable
--Drop temp table
DROP TABLE #tempduplicate
GO

Open in new window


references:
http://www.sql-server-performance.com/2003/delete-duplicates/

Top Expert 2011

Commented:
Another option, in case you are having problem with above is:

Step1:

--get unique records and insert them into temp table
SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1

Setp2: Delete all rows from the original table that also reside in the
DELETE OriginalDuplicateTable WHERE ID IN (SELECT ID FROM #temp)

Step 3 : Move back the rows from temp table to original table.

INSERT INTO OriginalDuplicateTable SELECT * FROM #temp

Then test





Commented:
@VIVEKANAND…

And to think, only yesterday you asked

I had a good laugh...

Commented:
sammySeltzer was very close with his CTE.  Here's some test code using your sample.  The trick is that you need to name EVERY column in the partition by clause and ANY column in the Order By clause.





declare @tab table (COL1 int, COL2 varchar(100))
insert @tab select 1, 'vivek'
insert @tab select 2, 'sammy'
insert @tab select 1, 'vivek'
insert @tab select 1, 'vivek'

select * from @tab

;with cte
as
(select *, row_number() over(partition by COL1, COL2 order by COL1) as rownum
 from @tab)
delete from cte where rownum > 1

select * from @tab
Top Expert 2012

Commented:
dqmq,

I really do hope the author does not miss the irony.  You even gave them the title of the question: "how do I get rid of duplicates?"

:)

sammySeltzer,

>>SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1<<
Just a pet peeve, but at the very least DISTINCT is redundant, at worse all columns have to be in the GROUP BY clause.
Top Expert 2011

Commented:
LOL, I can see the HAVING(ID) >1 is misplaced.

I didn't intend to do that.

I panicked when my boss walked into my cube and pushed the submit button without reviewing my work.

I am amused about the DISTINCT though.
Racim BOUDJAKDJIDatabase Architect - Dba - Data Scientist

Commented:
<<How to delete duplicate rows in the table?>>
What table ?

Commented:
Please have a look in this link, there are 3 methods how to delete duplicate rows... you can use any of them!
Commented:
Drop table test
Go
Create table test (
ID int,
[Name] nvarchar(50) )
Go
Insert into test values (1,'test1')
Insert into test values (1,'test1')
Insert into test values (2,'test3')

GO
Select * from test
Go
WITH CTE as(
SELECT ROW_NUMBER() OVER(PARTITION BY [Name] ORDER BY (SELECT 1)) AS RowID,
*
FROM test
)
Delete
FROM CTE
WHERE RowID > 1
GO
Select * from test
Go