VIVEKANANDHAN_PERIASAMY
asked on
removing duplicate rows in the table
How to delete duplicate rows in the table?
Or perhaps this:
DELETE FROM tableA
WHERE id IN
(SELECT id
FROM tableA
GROUP BY id
HAVING COUNT(*) > 1)
DELETE FROM tableA
WHERE id IN
(SELECT id
FROM tableA
GROUP BY id
HAVING COUNT(*) > 1)
ASKER
nope, if i dont have a primary key for id and if i insert two record,then how to delete it?
like eg:
1 vivek
2 sammy
1 vivek
1 vivek
now i want a result like
1 vivek
2 sammy
removing 2 latest entry made for vivek
like eg:
1 vivek
2 sammy
1 vivek
1 vivek
now i want a result like
1 vivek
2 sammy
removing 2 latest entry made for vivek
You don't have to have a primary key.
Whatever ID you want to use that shows
1 vivek
2 sammy
1 vivek
1 vivek
is what you want to use.
In the example you showed, what is 1,2,1,1.
That seems like an ID to me.
So, use it in the second code I posted above.
Whatever ID you want to use that shows
1 vivek
2 sammy
1 vivek
1 vivek
is what you want to use.
In the example you showed, what is 1,2,1,1.
That seems like an ID to me.
So, use it in the second code I posted above.
ASKER
assume these are value for ID ,not assume it as primary key or unique. Id column may have even duplicate value.In this case how to use?
Ok, let's try this. Replace OriginalTable with your table name.
If Replace ID if different from yours. You can make additional changes where necessary.
references:
http://www.sql-server-performance.com/2003/delete-duplicates/
If Replace ID if different from yours. You can make additional changes where necessary.
--Identify and save dup data into temp table
INSERT INTO #tempduplicate
SELECT * FROM OriginalTable
GROUP BY ID
HAVING COUNT(*) > 1
--Confirm number of dup rows
SELECT @@ROWCOUNT AS 'Number of Duplicate Rows'
--Delete dup from original table
DELETE FROM OriginalTable
FROM OriginalTable
INNER JOIN #tempduplicate
ON OriginalTable
.ID = #tempduplicate.ID
--Insert the delete data back
INSERT INTO OriginalTable
SELECT * FROM #tempduplicate
--Check for dup data.
SELECT * FROM OriginalTable
GROUP BY ID
HAVING COUNT(*) > 1
--Check table
SELECT * FROM OriginalTable
--Drop temp table
DROP TABLE #tempduplicate
GO
references:
http://www.sql-server-performance.com/2003/delete-duplicates/
Another option, in case you are having problem with above is:
Step1:
--get unique records and insert them into temp table
SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1
Setp2: Delete all rows from the original table that also reside in the
DELETE OriginalDuplicateTable WHERE ID IN (SELECT ID FROM #temp)
Step 3 : Move back the rows from temp table to original table.
INSERT INTO OriginalDuplicateTable SELECT * FROM #temp
Then test
Step1:
--get unique records and insert them into temp table
SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1
Setp2: Delete all rows from the original table that also reside in the
DELETE OriginalDuplicateTable WHERE ID IN (SELECT ID FROM #temp)
Step 3 : Move back the rows from temp table to original table.
INSERT INTO OriginalDuplicateTable SELECT * FROM #temp
Then test
sammySeltzer was very close with his CTE. Here's some test code using your sample. The trick is that you need to name EVERY column in the partition by clause and ANY column in the Order By clause.
declare @tab table (COL1 int, COL2 varchar(100))
insert @tab select 1, 'vivek'
insert @tab select 2, 'sammy'
insert @tab select 1, 'vivek'
insert @tab select 1, 'vivek'
select * from @tab
;with cte
as
(select *, row_number() over(partition by COL1, COL2 order by COL1) as rownum
from @tab)
delete from cte where rownum > 1
select * from @tab
declare @tab table (COL1 int, COL2 varchar(100))
insert @tab select 1, 'vivek'
insert @tab select 2, 'sammy'
insert @tab select 1, 'vivek'
insert @tab select 1, 'vivek'
select * from @tab
;with cte
as
(select *, row_number() over(partition by COL1, COL2 order by COL1) as rownum
from @tab)
delete from cte where rownum > 1
select * from @tab
dqmq,
I really do hope the author does not miss the irony. You even gave them the title of the question: "how do I get rid of duplicates?"
:)
sammySeltzer,
>>SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1<<
Just a pet peeve, but at the very least DISTINCT is redundant, at worse all columns have to be in the GROUP BY clause.
I really do hope the author does not miss the irony. You even gave them the title of the question: "how do I get rid of duplicates?"
:)
sammySeltzer,
>>SELECT DISTINCT * INTO #temp FROM OriginalDuplicateTable GROUP BY ID, other fields HAVING COUNT(ID) > 1<<
Just a pet peeve, but at the very least DISTINCT is redundant, at worse all columns have to be in the GROUP BY clause.
LOL, I can see the HAVING(ID) >1 is misplaced.
I didn't intend to do that.
I panicked when my boss walked into my cube and pushed the submit button without reviewing my work.
I am amused about the DISTINCT though.
I didn't intend to do that.
I panicked when my boss walked into my cube and pushed the submit button without reviewing my work.
I am amused about the DISTINCT though.
<<How to delete duplicate rows in the table?>>
What table ?
What table ?
Please have a look in this link, there are 3 methods how to delete duplicate rows... you can use any of them!
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
;with CTE
as
(
select ID, ROW_NUMBER() over (partition by Val order by ID) as RowNum
from @T
)
delete from CTE
where RowNum > 1