mgferg
asked on
cursor validation/improvement
Hi all,
I have a procedure/cursor that does the following:
It selects all row ids from one table (say table a) where the same id is in another table (say table b) and also matches an additional criterion.
Then it iterates through these and updates a column in table a (say mycol) to a value.
I have a fetch todo bulk collect into v_id limit 5000; so it does 5000 rows at a time and commits.
It also needs to write the id it has changed to dbms_output.put_line.
The attached code works (although modified to be generic, so hopefully no typos),
however it is taking a few hours to run with 3.8 million rows in table a and 2900 rows in table b.
This is being run from a SQL script as exec test;
(Oracle 10g)
So the questions are:
1. With a small amount of test data it runs through and immediately shows all the entries it is updating. With the numbers above there is no sign that anything is working, which defeats part of the intention of the dbms_output.put_line and the limit 5000. Why don't I see each row or group of rows getting updated? I can see the sqlplus process running, but with no further output (yet).
2. It needs to commit after updating each 5000 rows and also right at the end (I might have too many commits?)
3. How can this code be improved, and are there any flaws in the way this has been done?
Thanks in advance. Any assistance/comments appreciated.
Mark
(newbie)
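For context, the pattern described above would look roughly like this (a sketch, not the attached code; the table, column and cursor names follow the description, while the join criterion and the update value 'newvalue' are placeholders):

```sql
create or replace procedure test is
  cursor todo is
    select ta.id
      from table_a ta, table_b tb
     where ta.col_id = tb.col_id;  -- plus the additional criteria on table_b
  type t_id_tab is table of table_a.id%type;
  v_id t_id_tab;
begin
  open todo;
  loop
    fetch todo bulk collect into v_id limit 5000;
    exit when v_id.count = 0;
    forall i in 1 .. v_id.count
      update table_a set mycol = 'newvalue' where id = v_id(i);
    for i in 1 .. v_id.count loop
      dbms_output.put_line('updated id ' || v_id(i));
    end loop;
    commit;  -- commit each batch of 5000
  end loop;
  close todo;
  commit;
end;
```

Note that committing inside an open cursor loop like this risks ORA-01555 (snapshot too old) on long runs.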
ASKER
thanks - when I uploaded I missed that .sql is not a listed file type
test.txt
ASKER CERTIFIED SOLUTION
If you call DBMS_OUTPUT in a procedure, you have to set the buffer size large enough to hold the messages; otherwise an error occurs and the procedure does not complete.
By default, SERVEROUTPUT is set to OFF. When activated, the default SIZE is 2000. The valid range is 2000 through 1000000.
It would be better to create a file and write to it, or to create a temporary table and insert the details you want, because you are processing millions of rows and the buffer is likely to overflow.
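A minimal SQL*Plus sketch of the buffer setting (assuming the procedure is still called test):

```sql
-- SQL*Plus, Oracle 10g: enlarge the DBMS_OUTPUT buffer before the run;
-- without this, a long run raises ORU-10027: buffer overflow.
SET SERVEROUTPUT ON SIZE 1000000
EXEC test
```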
ASKER
Thanks for the effort in making the suggested change.
The one issue I'm having now is that I think the initial
select ta.id from table_a ta, table_b tb
where ta.col_id = tb.col_id and ...;
is taking a very long time / not completing - it has been running for 12 hours now. Perhaps this could also be broken into smaller chunks, but I'm not sure (?)
a) Try optimizing the first query: establish the access path Oracle uses (explain plan, Enterprise Manager) to check that the correct indexes are used:
select ta.id from table_a ta, table_b tb
where ta.col_id = tb.col_id and ...;
b) If the ids are reasonably distributed you can loop over them in chunks:
declare
  nr_of_chunks pls_integer := 800;    -- choose yourself
  -- assumption: max id < 8 million and min id > 0
  chunk_size   pls_integer := 10000;  -- choose yourself,
  -- but the chunks must cover all possible id values
  id_from table_a.col_id%type := 0;
  id_to   table_a.col_id%type := 0;
begin
  for i_chunk in 0 .. nr_of_chunks loop
    id_from := id_to + 1;
    id_to   := id_from + chunk_size - 1;
    for r in (select ta.id
                from table_a ta, table_b tb
               where ta.col_id = tb.col_id
                 and ta.col_id between id_from and id_to
                 and tb.col_id between id_from and id_to
                 and ...) loop
      .....
    end loop;
  end loop;
end;
c) Create a PL/SQL array of the smallest table using the id as an index, then access table_a directly without a join and check the join condition in the loop by looking up the internal table. Normally the Oracle join is better, but sometimes this is a way to escape the cost of accessing the Oracle indexes or the memory needed for the join (at the expense of accessing the internal array, of course).
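Option c) could be sketched as follows (the column names and the update value 'newvalue' are assumptions taken from the question, and the extra criterion on table_b is elided):

```sql
declare
  -- table_b is small (2900 rows), so cache its ids in an associative array
  type t_ids is table of boolean index by pls_integer;
  v_b_ids t_ids;
begin
  for r in (select col_id from table_b) loop  -- plus the additional criteria
    v_b_ids(r.col_id) := true;
  end loop;
  -- single pass over table_a; the join becomes an array lookup
  for r in (select id, col_id from table_a) loop
    if v_b_ids.exists(r.col_id) then
      update table_a set mycol = 'newvalue' where id = r.id;
    end if;
  end loop;
  commit;
end;
```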
1) The results of dbms_output are only shown after the PL/SQL block has finished, so dbms_output is not suited to showing intermediate results.
For intermediate results you can:
a) write to an operating system file (see the UTL_FILE package); depending on the system you will have to open and close the file after each written line to see the last written results
b) write to a 'results' table in the Oracle database that can be queried to view the results
c) use the dbms_pipe package, by means of which you can send messages that can be read by another session in the database
d) use the dbms_application_info package, which provides procedures that set some columns in the v$session view; these can be queried in SQL*Plus or viewed in tools like Enterprise Manager
2) You can code your commits in the PL/SQL procedure.
3) No code available.
Note that the limit clause splits the fetch and update work into batches, but not the open of the cursor.
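Option d) might look roughly like this inside the batch loop (the counter v_total and the module name 'test' are illustrative assumptions):

```sql
-- after each committed batch of 5000 rows:
dbms_application_info.set_module(
  module_name => 'test',
  action_name => 'updated ' || v_total || ' rows so far');
-- a second session can then monitor progress with:
-- select module, action from v$session where module = 'test';
```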