Solved

Delete duplicate rows from table SQL 2008

Posted on 2010-09-01
7
757 Views
Last Modified: 2012-06-22
SQL 2008

I have a query that deletes duplicate row form SQL table. The query looks like this:

with CTE as (
      select *, row_number() over (partition by Time_stamp order by Time_stamp) rn
      from yourtable
)
delete from CTe where rn > 1;

Query works fine. I just need to expand it's function however. I need to be able to scroll thru all tables in the database and find duplicate rows (leaving just one behind). So I am thinking  to use a cursor.

Each table has a id colum called "Date_Stamp". Duplicate Rows are those that have matching id (keys), regardless of the reset of data in other columns.

How do I achieve that?
0
Comment
Question by:techzone12
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 28

Expert Comment

by:sammySeltzer
ID: 33579473
But this will do the job though:

DELETE FROM mytable WHERE yourid IN
    (SELECT yourid
    FROM mytable
    GROUP BY yourid
    HAVING COUNT(*) > 1)
0
 

Author Comment

by:techzone12
ID: 33579577
Yes. It looks actually simpler.

Now I need to scroll thru all tables in the database and delete duplicate rows. How do I do this?

Thanks
0
 
LVL 41

Accepted Solution

by:
ralmada earned 500 total points
ID: 33579696
>>DELETE FROM mytable WHERE yourid IN
    (SELECT yourid
    FROM mytable
    GROUP BY yourid
    HAVING COUNT(*) > 1) <<
Wrong, that will delete all rows where there are duplicates, including the one you want to leave behind!!
You can try the below if you want to avoid a cursor
 

exec sp_MSforeachtable '
if exists (select 1 from INFORMATION_SCHEMA.COLUMNS where TABLE_NAME= ''?'' and COLUMN_NAME = ''Date_Stamp'')
	with CTE as (
	      select *, row_number() over (partition by Time_stamp order by Time_stamp) rn
	      from [?]
	)
	delete from CTe where rn > 1'

Open in new window

0
Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33584205
Just a correction to the code by ralmada

exec sp_MSforeachtable '
if exists (select * from INFORMATION_SCHEMA.COLUMNS where TABLE_NAME= ''?'' and COLUMN_NAME = ''Date_Stamp'')
begin
;with CTE as (
            select *, row_number() over (partition by Date_Stamp order by Date_Stamp) rn
            from [?]
      )
      delete from CTe where rn > 1
end'

Partition should be by date_stamp or whatever name your real column is, and with CTE requires a leading ;
0
 
LVL 58

Expert Comment

by:cyberkiwi
ID: 33584209
Further to the above, the ; is optional, but it must be a new statement block, hence the begin - end.
0
 

Author Comment

by:techzone12
ID: 33585287
Thanks for the feedback.  I will try the modified query according to your suggetion.
0
 
LVL 41

Expert Comment

by:ralmada
ID: 33587343
Yeah, the column name should be date_stamp. I just copied and pasted your query without checking the column name. So where it reads "Time_stamp" it should read "Date_Stamp"
0

Featured Post

Free learning courses: Active Directory Deep Dive

Get a firm grasp on your IT environment when you learn Active Directory best practices with Veeam! Watch all, or choose any amount, of this three-part webinar series to improve your skills. From the basics to virtualization and backup, we got you covered.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This is basically a blog post I wrote recently. I've found that SARGability is poorly understood, and since many people don't read blogs, I figured I'd post it here as an article. SARGable is an adjective in SQL that means that an item can be fou…
In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…

790 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question