ajexpert (United States of America) asked:
Delete Performance

A table has 1 trillion records.  

A process deletes all records, retaining only 100. Stats are then gathered on the table.

After that, will select * from table return rows faster? If not, how can we make it faster?

What is the fastest way of deleting records in the above scenario?

Conversely, if we have to delete only 100 records and retain the rest, how can that be done efficiently?
ASKER CERTIFIED SOLUTION by Sean Stuber (full text available to Experts Exchange members only)
Sean Stuber:

If you don't have an index,  it will probably be faster to do the full table scan rather than create an index and then use that to do the delete.  Provided this is a one-time effort.

If you'll be doing the 100 row delete frequently, then you'll probably want an index.
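
For illustration, a minimal sketch of that indexed delete (your_table and status are made-up names, not from the original scenario):

-- create the index once; each periodic 100-row delete can then use it
create index your_table_status_ix on your_table (status);

delete from your_table where status = 'OBSOLETE';
commit;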
ajexpert (ASKER):

Thanks Sean,
So in the 1st scenario (assuming we have an index), will it do a full table scan even after stats are up to date? This table now has only 100 records, which may reside in a couple of blocks.

If we have to copy the 100 rows, that means we are creating another table.

Though I do agree with your suggestion of copying the rows, I am trying to understand why the delete will be slower even if stats are up to date.
If you are performing a delete, Oracle will write the deleted records into rollback segments, which gives you the option to roll back after the delete and before the commit.

If you truncate the table, Oracle will not write to any rollback segments, and you cannot roll back once you truncate.

I think the first option, truncating and inserting the 100 records, is the better option.
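
A minimal sketch of the difference (your_table is a placeholder):

-- DELETE is DML: the old rows are written to undo, so it can be rolled back
delete from your_table;
rollback;  -- all rows come back

-- TRUNCATE is DDL: it commits implicitly and cannot be rolled back
truncate table your_table;
rollback;  -- no effect, the rows are gone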
This assumption is not correct: "This table now has only 100 records and may reside in a couple of blocks." Oracle will NOT "defragment" your table just because you delete most of its rows, nor because you gather statistics on it. It will still be, e.g., a 1GB table with only 1KB worth of data, contained in 100 rows scattered wherever they happen to be physically placed (which you cannot control, for the most part).

Even if the 100 rows happened to be located in the first two blocks of the table, when Oracle does a full table scan it has to go through all the blocks, because it doesn't know whether they are empty or not.
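
You can verify this yourself; a sketch assuming the table is in your own schema:

-- allocated space stays high even after the mass delete
select blocks, bytes from user_segments where segment_name = 'YOUR_TABLE';

-- ...while the table now holds almost no rows
select count(*) from your_table;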

One easy way to reclaim that space in your table (and all its indexes), is to:

-- preserve the 100 rows, truncate to reset the high water mark, then reload
create table temporary_storage as select * from your_table where <something_that_identifies_your_100_rows>;
truncate table your_table;
insert into your_table select * from temporary_storage;
commit;
drop table temporary_storage;
-- refresh stats on the table and (via cascade) its indexes
execute dbms_stats.gather_table_stats('your_table_schema','your_table',cascade=>true);



Plus, a truncate is executed in a split second... instead, delete must go through every table block plus every index block for every index on the table, mark the data as deleted, generate a ton of undo data, redo for the delete, plus redo for the undo... no wonder it will take much longer.
Akenathon,  

who made the assumption that the 100 rows were in the first few blocks?
I think you're trying to counter an argument that was never made.  :)
In fact, your argument sounds pretty much exactly like what I said.  


ajexpert,
stats don't actually "do" anything for query speed. They give the optimizer information on what type of plan to take. So, if you have a table with 100 rows and stats that say it has 100 rows, the optimizer will think the table is small and will likely do a full table scan. But even if you had a good index, how would "select * from table" use that index? What would be the index lookup? So: good stats, bad stats, no stats, "select * from table" will still use a full table scan. And, as I mentioned above and Akenathon also stated, the scan is of table blocks, not of table rows.
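
You can see this in the execution plan; a sketch assuming the table is in your own schema:

execute dbms_stats.gather_table_stats(user, 'YOUR_TABLE');

-- no predicate, so there is nothing for an index lookup to use
explain plan for select * from your_table;
select * from table(dbms_xplan.display);  -- shows TABLE ACCESS FULL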


The most radical and unexpected solution would be to create a new table with the needed 100 records.
After that, drop the original table with its trillion-minus-100 records.
Finally, re-create the original table using those 100 records.
Indexes may or may not help; it depends on selectivity, and in this case creating the index would consume significant time.

I 'speculate' under the assumption that dropping a table deallocates all of its Oracle blocks (and removes its dictionary entries) without walking through the individual records.
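
A sketch of that approach, using a rename rather than re-creating the table by hand (keep_rows is a made-up name):

create table keep_rows as select * from your_table where <something_that_identifies_your_100_rows>;
drop table your_table;
alter table keep_rows rename to your_table;
-- note: indexes, constraints, grants and triggers must be re-created manually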
Sean,

I am not trying to counter-argue (so please don't get angry with me), but you may look at your comment ID: 34157341 for the assumption.

It's surprising to know that the table size will be 1GB even after deleting most of the data, and that the rows will be "scattered" in a way we cannot control.

With Oracle 11g, don't you think Oracle should defragment the data and, by some process, store these 100 records in contiguous data blocks?

In other words, are we saying that the High Water Mark is never reset during a delete, and there is no way to reset the HWM?

Thanks
@sdstuber: The author made the assumption. The quote "This table now has only 100 records and may reside in couple of blocks." is from his first comment ID:34157401. My counter was not directed to anything you said.

I agree that the two of us tend to present similar arguments; I bet everybody who knows Oracle would think alike. In my post I intended to clarify to the author that Oracle does not auto-defragment tables (or indexes), something his quoted phrase seemed to imply. However, I skipped the "what if I want to delete 100 rows from a huge table" part because I consider your answer enough to solve that part of the problem at hand.
OK Sean and Akenathon, I accept it's my assumption and my mistake :)

So any comments on the further questions I asked in 34158224?
Don't worry, I'm not angry. That wasn't an assumption, it was an extreme example.
I started it with "Even if all 100...", which I then explained wouldn't matter anyway. Of course that example "could" happen, it's just not likely; but even "IF" it did, you'd still scan all blocks anyway.

"Should" 11g move the high water mark on delete? No, probably not.
Many (most?) deleted rows are eventually replaced with an insert.
You'd just reallocate the space anyway, which consumes resources in the deallocation as well as the reallocation. So moving the high water mark on each delete would just add work while adding no value most of the time.

You can adjust the high water mark with ALTER TABLE and the SHRINK clause.
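
For example (SHRINK SPACE requires the table to live in an ASSM tablespace; your_table is a placeholder):

alter table your_table enable row movement;
alter table your_table shrink space;          -- compacts the rows and lowers the HWM
alter table your_table shrink space cascade;  -- also shrinks dependent indexes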
SOLUTION (full text available to Experts Exchange members only)
awesome link Akenathon

Thanks Sean and Akenathon.

I never knew that Oracle never frees up the space on a "delete" operation, even after gathering stats.

You guys are awesome.
Thanks! One thing to remember: Oracle does not change anything in your segments while it collects stats on them. It just looks at them, and notes down what it sees in a separate place. Then, the CBO (Cost Based Optimizer) uses those stats, but still it won't change your segment. You are the only one who changes it, using insert/delete/update :-)
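
You can see those noted-down stats in the dictionary, separate from the segment itself:

-- statistics are just metadata; querying them touches nothing in the segment
select num_rows, blocks, last_analyzed from user_tables where table_name = 'YOUR_TABLE';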