Oracle script need to remove specific duplicate entries from a table.

Hello expert,
This is bit urgent.
I have table  claims_A with about 10 columns.
There about  2.6 million records in the table.
I have just inserted about 5000 records in the table and just realized there were 18 duplicate records in my insert statement.
so now I have 18 duplicates records  ClaimsA
What I urgently need to do is just delete these  specific 18 duplicate records that I added.
Is there a fast script that can do it?
Mind you there could be many other duplicate records (other than what I added) but I only want to remove the duplicate that I added.
Is there quick way to do this?
Thank you.
al
ALad2005Asked:
Who is Participating?
 
Mark GeerlingsConnect With a Mentor Database AdministratorCommented:
Here is a generic query to identify the duplicate records:

select t1.key_column, t1.rowid
from claims_table t1
where exists (select t2.key_column
from claims_table t2
where t2.key_column = t1.key_column
  and t2.rowid < t1.rowid);

Open in new window


You need to change "key_column" and "claims_table" to your actual column and table names.  And, if you have a multi-column key, you will need to include all of the key columns.

And here is a generic delete script:
delete from from claims_table t1
where exists (select t2.key_column
from claims_table t2
where t2.key_column = t1.key_column
  and t2.rowid < t1.rowid);

Open in new window

0
 
Mark GeerlingsDatabase AdministratorCommented:
How do you know they are duplicates?  This is: which column (or columns) need to be compared to identify the duplicates?  Also, is there a "date_entered" column or a similar column that can be used to identify the newer of each pair of duplicate records?  (If not, we can use the "rowid".)

Can you give us the actual table and column names?  Or, do you want us to just give you a generic SQL query that you will have to modify with your actual table and column names?

To prevent duplicates from being entered into this table in the future, you could (I say should) create a primary or unique key index that includes the key column(s).
0
 
Mark GeerlingsDatabase AdministratorCommented:
Whether either this "select..." query or the "delete..." statement will be fast or not, depends on whether your "key_column(s)" is/are indexed, or not.  It/they certainly should be indexed!  But, apparently there is no unique index on it/them now.
0
Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

 
slightwv (䄆 Netminder)Connect With a Mentor Commented:
Since you know the 18 rows and you want to leave any other duplictes, I would take the data from the 18 rows in question and make 18 delete statements out of them:

delete from your_table where rownum=1 and col1='A' and col2='B';

Use enough of the data to uniquely identify the 18 duplicate rows.  

You should end up with 18 delete statements and watch when you execute them, each should ONLY delete one row.  If it deletes 0 or more than 1, rollback the delete since it is wrong.
0
 
ALad2005Author Commented:
Thank you Mark.
Your suggested script worked but was taking time, so I had to include slightwv's suggestion
Thank you for your prompt response
Appreicated.
0
 
Mark GeerlingsDatabase AdministratorCommented:
"...worked but was taking time".  That indicates a lack of an index on the key column(s) in your table.  That is usually not a good plan.  Also, the fact that the duplicates were allowed to be inserted, clearly indicates the lack of a unique index.  That is also usually not a good idea with Oracle tables.

To prevent this problem from happening in the future, you should create a unique index on this table.  Is there something that prevents you from creating an index like this?
0
 
slightwv (䄆 Netminder) Commented:
I believe the main reason Mark's method took a long time is because it was going after ALL duplicates in the table not just the 18 you wanted.  Yes, lack of indexes is the root cause but it also wouldn't have given you what you wanted even if indexes existed.
0
 
ALad2005Author Commented:
Mark/slightwv
  Agree there is problem with the way the table was designed.  Since I am not the owner of the table, I will pass your suggestion about putting index and not allow duplicates  that will make things easier in future.   Thank you so much for your suggestion which solved the issue at hand,

will your help in future too!
Cheers
Al
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.