Deleting a huge number of rows from Oracle 11gR2

Swadhin Ray asked:
Hello Experts,

I have a table that contains 546,471 rows, out of which I need to delete 386,201 rows based on a date column.

I have tried two approaches, and both take a huge amount of time to remove the rows.


First, the table has a primary key, c_key1, and c_dt is the date column on which the delete is based.

The delete statement is:

DELETE FROM MY_TABLE
WHERE C_DT > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI');



The date column does not have any indexes, and even after creating an index, the delete statement takes more than 15 minutes to execute.


So, after adding the index on c_dt, I tried using FORALL, mapping the primary key column, c_key1.


Here is the sample code, which again takes a huge amount of time:

DECLARE
  TYPE t_chk_tab IS TABLE OF my_table.c_key1%TYPE;

  l_chk_tab  t_chk_tab := t_chk_tab();
  l_start    NUMBER;
BEGIN
  -- Populate the collection.
  FOR i IN (SELECT c_key1
              FROM my_table
             WHERE c_dt > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI')) LOOP
    l_chk_tab.EXTEND;
    l_chk_tab(l_chk_tab.LAST) := i.c_key1;
  END LOOP;

  l_start := DBMS_UTILITY.get_time;

  -- Time the bulk delete.
  FORALL i IN l_chk_tab.FIRST .. l_chk_tab.LAST
    DELETE FROM my_table
    WHERE c_key1 = l_chk_tab(i);

  DBMS_OUTPUT.put_line('Bulk Deletes   : ' ||
                       (DBMS_UTILITY.get_time - l_start));

  -- commit;
END;




Any suggestions on how to make this faster? The only catch is that I cannot copy the data to another table, delete the rows, and rename it, because the same table is used by multiple applications. None of the applications will be using it until the weekend, but I still cannot drop the original table.
One expert commented:

1. Does this have to be done in PL/SQL?! You can do it in pure SQL (which you should prefer in this case)!

2. If you have to use PL/SQL: there is no need to fill your array with row-by-row processing! Just do a SELECT ... BULK COLLECT INTO ... using the appropriate collection type (see the sketch after this list).

3. Are there any triggers active on that table?

4. Can you post the execution plan(s) please?
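
A minimal sketch of what point 2 suggests, reusing the table and column names from the question. It collects only the key column rather than a full rowtype, since only the key is needed for the delete; whether to COMMIT at the end is an assumption:

DECLARE
  TYPE t_key_tab IS TABLE OF my_table.c_key1%TYPE;
  l_keys  t_key_tab;
BEGIN
  -- One round trip to fill the collection, instead of a row-by-row loop.
  SELECT c_key1
    BULK COLLECT INTO l_keys
    FROM my_table
   WHERE c_dt > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI');

  -- One bulk-bound statement instead of many single-row deletes.
  -- The 1 .. COUNT range is safe even if the collection is empty.
  FORALL i IN 1 .. l_keys.COUNT
    DELETE FROM my_table
     WHERE c_key1 = l_keys(i);

  COMMIT;
END;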
ste5an (Senior Developer) commented:
A table with 546,471 rows is pretty small, so your normal DELETE is the correct approach. Here you need to look into the details.

My guesses (the first two can be checked with the queries below):
- Do you have a trigger ON DELETE?
- Do you have cascading foreign keys?
- Do you have a general load/performance problem?
- Do you have a bad I/O (path speed, volume speed, volume corruption)?
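
A hedged sketch of how to check the first two guesses, assuming MY_TABLE is the actual dictionary name and that the standard USER_* views are accessible:

-- Any triggers defined on the table (an ON DELETE trigger fires per row).
SELECT trigger_name, triggering_event, status
FROM user_triggers
WHERE table_name = 'MY_TABLE';

-- Any foreign keys pointing at the table, and whether they cascade.
SELECT constraint_name, table_name, delete_rule
FROM user_constraints
WHERE r_constraint_name IN (SELECT constraint_name
                            FROM user_constraints
                            WHERE table_name = 'MY_TABLE'
                              AND constraint_type IN ('P', 'U'));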
Another expert commented:

A couple of good suggestions there.

Just making sure, but both filter clauses look like they intend to delete rows with future dates (after 3:39 yesterday morning).  Is that correct?

15 minutes might be an inconvenience, but it suggests that there's a lot going on in the delete.  To me, the biggest issue would be that data and index pages get locked during those 15 minutes, and that impacts other processes.

You can mitigate that by deleting rows in smaller chunks.  Fewer locks would be held simultaneously, and the locks would be held for shorter durations.  You can see how the rows are distributed with this query:

SELECT to_char(c_dt, 'YYYY-MM-DD'), count(*)
FROM my_table
WHERE c_dt > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI')
GROUP BY to_char(c_dt, 'YYYY-MM-DD')
ORDER BY 1;



If the query returns a few rows with large counts, a bit of refinement will be needed.  Otherwise, you can build a series of deletes based on the dates returned, along these lines:
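
A hedged sketch of one such chunked delete; the upper-bound date is a placeholder to replace with each date from the GROUP BY output, and committing after every chunk is an assumption about what the applications can tolerate:

DELETE FROM my_table
WHERE c_dt > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI')  -- original cutoff
  AND c_dt < TO_DATE('2019-01-12', 'YYYY-MM-DD');             -- placeholder chunk boundary
COMMIT;

-- Repeat, advancing the boundary one date at a time; rows before the
-- previous boundary are already gone, so each pass deletes one chunk.

Each statement touches only one chunk of rows, so locks are held briefly and released at each COMMIT.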
schwertner (Top Expert 2008) commented:
Deletion of more than 300,000 rows is massive work, so do not be surprised by the time. Also, no index will help you; the optimizer will ignore indexes with this number of affected rows.
To speed up the deletion, investigate the size of the REDO logs and the REDO buffer in the SGA. If needed, increase their size and the number of REDO log groups.
The next point is to check whether there are rows locked by the applications. This is very common during busy hours, so the best time for the deletion is when your instance is not being used intensively by other applications.
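
A minimal sketch of how to check the REDO log and buffer sizes mentioned above, assuming you have access to the V$ views:

-- Size and status of each online redo log group.
SELECT group#, bytes / 1024 / 1024 AS size_mb, status
FROM v$log;

-- Current size of the redo log buffer in the SGA (in bytes).
SELECT value
FROM v$parameter
WHERE name = 'log_buffer';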
Mark Geerlings (Database Administrator) commented:
As others have suggested, just using a SQL delete is usually the fastest option.  Adding PL/SQL usually makes tasks like this slower.

And an index will not help you here, not because of the number of rows, but because you are deleting over 50% of the rows in the table.  Usually, if the percentage of rows you need to select (or update, or delete) is over about 15% of the total, an index will not be useful.

If the problem is record locks from other activity in the table, then a PL/SQL-based approach that deletes a small number of rows at a time, commits those, then deletes another small group, and so on, may be useful.  A sketch follows.
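
A minimal sketch of that batched approach, reusing the names from the question; the LIMIT of 5,000 is an arbitrary batch size to tune:

DECLARE
  CURSOR c_doomed IS
    SELECT c_key1
    FROM my_table
    WHERE c_dt > TO_DATE('2019-01-10 03:39', 'YYYY-MM-DD HH:MI');

  TYPE t_key_tab IS TABLE OF my_table.c_key1%TYPE;
  l_keys  t_key_tab;
BEGIN
  OPEN c_doomed;
  LOOP
    -- Work through the keys in small batches.
    FETCH c_doomed BULK COLLECT INTO l_keys LIMIT 5000;
    EXIT WHEN l_keys.COUNT = 0;

    FORALL i IN 1 .. l_keys.COUNT
      DELETE FROM my_table
      WHERE c_key1 = l_keys(i);

    COMMIT;  -- release the locks after each batch
  END LOOP;
  CLOSE c_doomed;
END;

Note that fetching across commits relies on undo for read consistency, so on a long run the undo tablespace needs to be sized to avoid ORA-01555.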
Thanks a lot, experts. The buffer size was low, and this was causing an overall performance issue. The main challenge is that in many cases access to the data dictionary views is not provided, and it really kills time to follow up with folks to get the right details. I sometimes feel like I am blind and watching Netflix with only the sound effects. :-)
Alex140181 commented:

"Deletion of more than 300,000 rows is massive work."

Sorry, but this is absolute nonsense!! If this number were a factor of 100 bigger, OK, but this is just NOT BIG! Obviously schwertner has not dealt with "big data"; otherwise I cannot explain such misleading comments!
The buffer size and open cursors were set low, and I agree with you that the amount of data is not huge.
schwertner (Top Expert 2008) commented:

@Alex140181:

Deleting 300K rows can be done on different hardware and in different environments.
As you can see, this instance has a small REDO buffer.
The SGA can be big, or it can be slow.
There can be many indexes that will slow down the deletion.
The recording of the deleted data in the REDO logs also takes time.
So do not underestimate the problem.
