SQL Update statement

codefinger
codefinger used Ask the Experts™
on
Looking for any SQL tips to speed up this Oracle update....much too slow as is...

update emrcontent.indexdata set notepath = replace(notepath,'\\w2k3-ccms1\CCMS.NoteRoot','\\celerracifs\ccms_prod\NoteRoot')
where notepath is not null and to_char(datestored,'YYYY') in ('2003');

VERY large table, many indexes.
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Most Valuable Expert 2011
Top Expert 2012
Commented:
assuming you have an index on datestored,

change this...

and  to_char(datestored,'YYYY') in ('2003');


to

datestored >= to_date('20030101','yyyymmdd') and datestored < to_date('20040101','yyyymmdd')

Most Valuable Expert 2011
Top Expert 2012

Commented:
also,  what is the ration of null vs not null   notpath?

Author

Commented:
also,  what is the ration of null vs not null   notepath?

I read somewhere that indexes would only be used if the columns to be evaluated did not contain null values.

Could very well have misunderstood what I read.

Gonna try your suggestion...taking out that like clause also and replacing it with substring() =
may help as well.  I started the update at 2:00 this afternoon (on a copy of the database and indexes the dba provided for me to play with)  and it is STILL running as I type this!

Thanks!
OWASP: Threats Fundamentals

Learn the top ten threats that are present in modern web-application development and how to protect your business from them.

Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> read somewhere that indexes would only be used if the columns to be evaluated did not contain null values.

sort of true.

an index node consisting of nothing but nulls will not be created

in a composite index, if some are null and some are not then it still will be

however, there are extremely efficient ways to force indexing,  but they only apply if there is a significant skewing toward them.


>>> taking out that like clause also and replacing it with substring() =

what like clause?  is the query above not what you are trying to adjust?




Author

Commented:
what like clause?  is the query above not what you are trying to adjust?

That query is correct.  I was thinking of an earlier attempt, sorry, brain f_rt.

Commented:
How many rows are you trying to update? And how many rows does the table have?
Are you sure that this query is covered by an index?
If you use Oracle SqlDeveloper, then I suggest you to check it with "Explain plan (F10)":
Explain planIt will show you the way your query will be executed.

Also note that if the predicate field is a "date" (type: date), you better convert your filter to date, rather than the field:
where notepath is not null and datestored between to_date('20030101') and to_date('20031231');

Open in new window


Hope this helps.
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> datestored between to_date('20030101') and to_date('20031231');

this was mentioned in the first post, but note  the BETWEEN isn't correct, because it will exclude all data
for  00:00:01 - 23:59:59 of Dec 31st.  "IF" all of the data is truncated to the day or larger granularity, it will work but nothing in the question indicates that.  The original suggestion will work if the data is truncated or not

Commented:
to sdstuber:
Yes, you're right. I didn't see your post, and yes, my query will exclude dec. 31 if the data type is not "date" but rather "datetime".
Most Valuable Expert 2011
Top Expert 2012

Commented:
there is no such data type as "datetime" in oracle,  there are DATE and TIMESTAMP (with and without timezone and of various sub-second precision)  

all of which always have a time, even if truncated to 00:00:00

Commented:
I got confused with SQL Server types, where both date and datetime exist.
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
How many rows are you expecting this to update?

You might be able to increase performance by breaking up the updates in a loop with frequent commits.

This way the transaction doesn't need to hold a lot of UNDO.
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> You might be able to increase performance by breaking up the updates in a loop with frequent commits.


not really,  doing this actually consumes more resources over all.

it can lead to faster incremental results and faster rollback in the event of failure.

However, it can also cause data inconsistency since you are breaking an atomic transaction
and it also leads to more frequent "snapshot too old" errors
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
>>not really...

'can' being the operative word here.  It can benefit in specific instances.
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>>  It can benefit in specific instances.

I don't see how.

If an update is supposed to modify N rows

I have to start a transaction
find those N rows
change those N rows
commit the transaction.

If I break the update into pieces

I have to start a transaction
find  X rows
change those X rows
commit the transaction
Start another transaction
find N-X rows
change those N-X rows
commit the transaction

Using 2 or more updates does all the same work that using 1 update does, except it also requires additional transaction work, and it requires starting over on the data searching each time.


You can get the illusion of speed, because a partial update will show partial results sooner, but you still have more work to do, if consistency is important then partial isn't helpful and may be harmful.
If consistency is not important, then it's still just the illusion of speed since it's not really done yet.

If you have failures, you do consume less rollback/undo, so you can correct the error and start over more quickly, but in this case, a failure means you're not actually doing the update, so it doesn't really apply.

If there is a counter example, I would love to see it but I don't see how it could be possible given the nature of the operations required.



When you use is not null the ORACLE server perform a full table scan.
Try: where notepath > min_value,
where notepath >= '\\' (eg.)

update emrcontent.indexdata 
set notepath = replace(notepath,'\\w2k3-ccms1\CCMS.NoteRoot','\\celerracifs\ccms_prod\NoteRoot') 
where notepath >= '\\'
  and datestored between '2003-01-01-00:00:00.000' and '2003-31-12-23:59:59.999' ;

Open in new window

Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> When you use is not null the ORACLE server perform a full table scan


not necessarily,  especially when there is another condition


also   where somecol > {some min value }

should generate a full table scan as well if you have lots of not null values
Most Valuable Expert 2011
Top Expert 2012

Commented:
>>> should generate a full table scan as well if you have lots of not null values

I apologize, that was sloppy of me.  I should have said...

if you have null values across lots of data blocks


the actual number of rows isn't particularly important for that type of condition
it's the number blocks needed

Author

Commented:
16 million rows.  This suggestion dropped the time to update from almost a day to almost a minute.

THANK YOU!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial