jknj72

asked on

Procedure I wrote and am trying to optimize

I would like to post a procedure I wrote and see if someone has a way to optimize it. It is running extremely slow, which is not really acceptable in my opinion. I have a source query that I BULK COLLECT INTO a TYPE. Then I loop through each record, set parameters to get an address, set a bunch of variables, format them, and ultimately do an INSERT into a table. There has got to be a better way of doing this. Maybe it's the functions I've created that take time to return; I'm not sure. I would love to be able to do an insert from source to destination and then maybe format/update the table after the records are in it. Any help would be great. I have attached the file to hopefully make it more readable. If you would rather I post it inline, just ask for anything you need. I'm attaching the procedure, but if you need to see the functions I'm using, just let me know.
Please help!! This is the first package I've ever written and my first assignment at a new job, so any help would seriously be greatly appreciated... Thanks everyone...
PCOV-Policy-V1.txt
slightwv (Netminder)

First quick scan through it...

First thing I would do is take the advice we have previously offered:
-Do away with in-memory tables.
-I'm still not sure why you can't use a Global Temporary Table.

What does the table TEMP_BASE_PCOV_POLICY look like?
If the columns are varchar2, why all the RPADS?
Make them CHAR and columns get right padded automatically.

At least make the variables CHAR for this reason:
declare
	junk1 varchar2(10) := 'Hello';
	junk2 char(10) := 'Hello';
begin
	dbms_output.put_line(':' || junk1 || ':');
	dbms_output.put_line(':' || junk2 || ':');
end;
/



All these can be combined:
l_COV_ISSUING_STREET := CLEAN_NON_ASCII(l_POLICY_TAB(i).CARR_ISSUING_STREET);
IF LENGTH(l_COV_ISSUING_STREET) > 50 THEN
  l_COV_ISSUING_STREET := SUBSTR(l_COV_ISSUING_STREET, 1, 50);
ELSE
  l_COV_ISSUING_STREET := RPAD(l_COV_ISSUING_STREET, 50, ' ');
END IF;

Something like (SUBSTR truncates anything over 50, RPAD pads anything under, so the IF/ELSE disappears):
l_COV_ISSUING_STREET := RPAD(SUBSTR(CLEAN_NON_ASCII(l_POLICY_TAB(i).CARR_ISSUING_STREET), 1, 50), 50, ' ');
Instead of reviewing what you have written (which appears to be overly complex), I would like to focus on the task you are trying to do and find the simplest way to do that. It looks like that is mainly:
"do an Insert from Source to Destination", with a secondary task: "then maybe format/Update the table after the records are in the table".

To do "an Insert from Source to Destination" you don't need PL/SQL, a query, a bulk collect, a type, a loop, etc. The simplest (and fastest) way to do this is usually simple SQL like this:
insert into [target_table]
(optional list of columns)
select [whatever you need]
from [wherever the data is now]

In some cases, if you need to do complex data manipulations, conversions, validations, etc., it may be necessary to use PL/SQL and/or a global temporary table to hold intermediate data.

Can you describe the task you are trying to do? Then we can help you determine the best approach to the problem. That may not involve PL/SQL at all.

I hope your goal is not trying to design complex solutions to problems.  I hope your goal is trying to solve business problems in the simplest possible manner.  That is what we will try to help you do.
jknj72 (ASKER)

Mark, I don't see why you would ask "I hope your goal is not trying to design complex solutions to problems. I hope your goal is trying to solve business problems in the simplest possible manner."

Why in the world would I want to do that? I am in fact here asking this question to solve my problem in the simplest manner. I'm not sure what other reason I would have to be here.

My goal is simple: I have to get data from a number of tables, with certain logic, rules, and data requirements in place, into a destination table.

I am relatively new to Oracle. I've written statements in the past, but this is the first time I've written a package and the procedures that run the code. I don't even know how I can get it to run in a certain sequence when I'm done, but I will cross that bridge when I come to it.

All I was asking is maybe there are some better techniques that I can use instead of what I currently have in place.

Slight, you commented that I shouldn't use the in-memory processes. I actually started to move in that direction at the beginning of this project, but I had a hard time with it, so I went back to the original approach, which is what the other packages I've looked at are using, and this is what I came up with.

Also, there are 3 of these tables that I'm creating with 3 packages, but this is the main one. The other 2 read from the table I'm inserting into (the one I sent in this question) as their source, so it is local and runs very fast. It's this first one that is taking the majority of the time. I'm new here at this job, so I figured it may be best to complete the task and then try to optimize it the best I can. The old task takes 8 hours and mine takes 2 hours, so it's definitely an improvement, but I'd like to get the whole process under an hour.

I will definitely take your advice with the padding and changing types to CHAR. With that said, I was actually just given the ability to debug, and for the past 2 days I have gained so much momentum because I now know what some of my issues were. So now that I have these capabilities I can change some of the code and will be able to tell if I'm doing it correctly in a very timely fashion.
Slight, when you mentioned using the global temp tables: how would I use them? I am assuming I would insert the source query data into a temp table, then read, manipulate, and format the data from that table and do the insert? The code is all there; I just want to use the best methods of moving the data around for my needs. Thanks
Don't be too hard on Mark. He might not have seen the series of questions that led up to this one. We have quite a few members who join and expect the Experts to write all the code for them for their membership payments. Basically very cheap contractors...

Global Temporary Tables are special tables in Oracle that are created once and used over and over.  Only the session that inserts into them can see the data and once the session ends, the data is automatically purged.

The data can also be purged on a commit depending on how you create them.

As long as the subsequent processing is done in the same session, I would look into them.

Even if you don't use them, you will gain a little performance on a regular table if you create it NOLOGGING. This means minimal redo is generated for direct-path DML on it. The downside is it isn't recoverable in a crash, but since this is a temp table, that shouldn't matter!

The docs have what you need on them and there are several papers/blogs out there that talk about them.
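For reference, a global temporary table is created once as a schema object (not inside the procedure) and then used like any other table. A minimal sketch; the table and column names here are made up for illustration:

```sql
-- Hypothetical names; created ONCE, outside the package.
CREATE GLOBAL TEMPORARY TABLE gtt_pcov_stage (
    policy_id      NUMBER,
    issuing_street VARCHAR2(50)
) ON COMMIT PRESERVE ROWS;  -- rows survive commits, purged at session end;
                            -- use ON COMMIT DELETE ROWS to purge at each commit
```

Each session sees only its own rows, so concurrent runs of the load won't collide with each other.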

>>which is what the other packages that I've looked at are using

Just because everyone else jumps off a bridge...  ;)

>>but I had a hard time with it so I went back to the original approach,

So, when you took off the training wheels and skinned your knees, did you put them back on?

When you stumbled, you should have asked us!

My gut feeling is that ALL of the core logic can probably be done with simple SQL.

OK, maybe not all that simple but at least all in SQL and no PL/SQL logic.

The inserts into the MSG table might be an issue and still likely require logic.

An explicit cursor like I showed you in a previous question will be better than the in-memory table and can include MOST if not ALL of the padding/cleaning/etc... logic.

Then you can do your individual checks and log 'bad' data.

Also pretty sure the address logic can be simplified.
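As a rough sketch of that explicit-cursor idea (table and column names are guesses based on the snippets in this thread, not your actual schema):

```sql
-- Illustrative only: push the clean/truncate/pad work into the cursor query
-- so the loop body shrinks to the individual checks and the INSERT.
BEGIN
  FOR rec IN (SELECT p.policy_id,
                     RPAD(SUBSTR(CLEAN_NON_ASCII(p.carr_issuing_street), 1, 50),
                          50, ' ') AS cov_issuing_street
              FROM   source_policy p)   -- hypothetical source table
  LOOP
    -- individual checks / logging of 'bad' rows would go here
    INSERT INTO temp_base_pcov_policy (policy_id, cov_issuing_street)
    VALUES (rec.policy_id, rec.cov_issuing_street);
  END LOOP;
END;
/
```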

All that said:
We'll need simple test cases that simulate your setup.  Make them as simple as possible and still representative of what you need.  Sorry but we cannot understand the entire big picture...   That is unless you hire some of us (there is a Hire Me button in some of our profiles...  but don't look in mine...).

Table structures, sample data and expected results.  This way we can create the same thing on our end and provide tested code. If the logic isn't obvious in the data, a brief explanation will help and save some time.
jknj72 (ASKER)

Understood. I didn't mean to offend; maybe I took it the wrong way.
My comment here was based on this question by itself. I did not assume that this question was connected or related to any possible previous questions on this site. When you told us that this is the first PL/SQL package you have written, and that this is a new job for you, I was concerned. You asked for help on improving this package, but this is just one possible tool (or approach) for solving a problem. Since we don't have a complete understanding of what the business problem is, we don't know that this package is the best way to solve it.

I wanted to take a step back and have you give us a high-level description of the problem. Then we can much more effectively decide if a PL/SQL package like this is the best method for solving it. Otherwise, we can spend time and energy tweaking your package, only to realize later that we should have taken a totally different approach to the problem.

When I was learning this line of work on-the-job many years ago (I have no college training related to computers or programming) I had a mentor who encouraged me to always try to find three possible ways of solving the problem, consider the pros and cons of each, then choose the best (or in some cases, the least-worst) option.

In this case, slightwv and I both mentioned direct SQL inserts and a global temporary table, since we have both used those options for business problems that (to us at least) sound similar to what you are trying to do.  Have you explored that possibility and determined that it is not a good option for your business problem?
jknj72 (ASKER)

It's actually not a business problem; we are just trying to improve the time it takes to get this process completed. I am rewriting the existing program that generates the tables that are picked up by a vendor of ours to do what they have to do.

Mark...
>>When I was learning this line of work on-the-job many years ago (I have no college training related to computers or >>programming) I had a mentor who encouraged me to always try to find three possible ways of solving the problem, >>consider the pros and cons of each, then choose the best (or in some cases, the least-worst) option.

I really like what your mentor said to you... I would have loved that kind of direction from a senior guy when I was starting off. My mentor was the worst. He would cause problems with all the database guys (I was a front-end developer when I started out), and I was attached to him like his sidekick, which made for very uncomfortable situations. If he didn't get his way he would cry and b1tch until he got it, very childish. He was a used car salesman when he wasn't programming, and he used coding techniques that were very outdated and would fight against using anything new that he didn't know. He got me some good jobs over the years, though, but his ego always got in the way, and surprisingly enough they always ended up firing him and keeping me (at least 3 positions I can think of). I worked with him at my last position and they got rid of him shortly after I got hired.

I've helped people I've worked with and, speaking to them years later, they always bring up that I was able to teach them in such a simple manner that they still use the same techniques I showed them 10 years ago. I love helping people and feel I missed my calling and should have been a teacher. I actually help people here at work, and they now all come to me with questions because I am always willing to help.

But Annnnnnway, sorry for being a chatty Kathy!!

The program runs a query, BULK COLLECTs INTO a TYPE table variable, and loops through the records. I set variables, and the values are truncated or padded to size (which I will change, Slight). I bypass records that don't have appropriate values or dates in some of my checks. I also build a key by concatenating a lot of the variables. That key will be used in my next processes so I can join the tables later on. I use the table being built in this process as the source for my other processes.
 
I then have to check tables for an appropriate address to use with this policy record. There is a sequence of tables to check to get the address needed for the insert (only 1 address for each policy). The first table is the ENDORSE query (using the bulk collect again) with params from the policy, and if there is a record, I go through the checks and finally do the insert. The address logic is not really complex, but there are many checks (checking dates and querying tables for values via functions) and formatting before the insert. The second address check (if no return value from check 1) is against the INSURED_LOCATION table; there is an ADDTYPE that we check, and if it's a 2, that's the preferred type, so we use that one; if not, we use ADDTYPE 1. After the final vars are set, we do the insert.

I am about to leave work for a weekend getaway so I wanted to at least respond to you guys before I left.
With the main logic I just explained, when and where would you use the temp table? I'm not asking anyone to write the code for me (I'll never learn that way), but if you could steer me in a better direction, I'd appreciate it.

Also, if you need table structures or function code, just let me know and I can send you a script. I may have to do that Monday, though.

Thanks guys and have a great weekend!!
ASKER CERTIFIED SOLUTION
slightwv (Netminder)
(solution content available to members only)

SOLUTION
(solution content available to members only)
>>I will add that I am definitely not a fan of CHAR variables or columns!

Columns, I agree 100%.  Variables, I'm on the fence.

Since you aren't storing the values permanently, and want right padded values, I'm not sure why you wouldn't use them as variables instead of rpad'ing them when necessary.  Well, unless you have millions of rows in memory (hint hint BULK COLLECT!!!!!).

Anyway, this should be an academic point since we both agree there should be VERY MINIMAL variable use here.

>>Is that your situation?

Sounds like it from the comment:  "generates the tables that are being picked up by a vendor of ours to do what they have to do"

I'm guessing this is part of a larger ETL workflow.


>>No, I didn't see slightwv's latest comment before I posted this

hehe...  happens to ALL of us!!!
I'm curious.

What all did you end up doing and did you get it down to an hour or less?
jknj72 (ASKER)

Well, I finished the code that I had sent to this thread and was able to validate everything I needed. The only errors I get are NO_DATA_FOUND, so they are legit errors and are being captured, and everything is working fine. I used advice from both you and Mark, hence the split of points. With the help of the debugger (and your advice, even from earlier questions/answers), I did an insert into a staging table, on the theory that keeping a connection to another server/tables open while doing the BULK COLLECT INTO a TYPE was not optimal. So I run pretty much the same query, but I insert it into a stage table and then open the now-local table using a FOR LOOP.

I still have BULK COLLECTs INTO in the address portion, but I'm thinking of changing that as well. It gets run for every record in the source FOR LOOP, so I think using another FOR LOOP would be better than loading a TYPE for every row. What do you think? I'm also wondering if there is anything else I can do to speed up the process (index the stage table?), or do you think the global temp table would speed it up?... Maybe I will ask another question about it.
Although I feel much better about the new code I have just finished, I have not noticed a significant difference in the processing time. It's definitely a little faster, but it still takes about 2 hours to finish. I eventually would like to reduce the amount of code by getting this into an insert statement with the majority of the logic performed within the statement (CASE WHENs and the WHERE clause), if possible. I still think I will have to do an update to the table, but I will see what I can come up with.
Thanks again for your help!!
Thanks for the update.

I'm still thinking there is no need for any of the staging tables (either in-memory, global temp tables or regular tables).

I still think all the address logic can be simplified, if not moved completely to native SQL.

Keeping an index on a staging table can be problematic because of the updates for each row.  If an index would help with a query of the staging table, try building it AFTER you have loaded all the data.

Indexes help with random and specific access on a table.  If you query the staging table based on specific rows, then an index might help.

If you access ALL the rows only once, no index will help and will only take time to create/maintain.
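In other words, the pattern would be load first, index second. A sketch, with the staging table and column names assumed for illustration:

```sql
-- Direct-path load into a hypothetical staging table, THEN build the index.
INSERT /*+ APPEND */ INTO stg_policy (policy_id, carr_issuing_street)
SELECT policy_id, carr_issuing_street
FROM   source_policy;   -- hypothetical source
COMMIT;

-- Built once, after the load, so it isn't maintained row by row.
CREATE INDEX stg_policy_ix ON stg_policy (policy_id);
```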
jknj72 (ASKER)

Yeah, I think you're right. I indexed the table and didn't see any improvement in the time it takes to run.
I hate to sound stupid, but when you say
"I still think all the address logic can be simplified if not moved completely to native SQL"
what exactly do you mean by that? I'm going to try to get all the address logic figured out and then do an insert with everything included, without the update to the table. If you have any tips, please feel free to send them to me.

Thanks
John
jknj72 (ASKER)

I'm gonna ask a question about some of the logic of the insert I'm gonna try to do.
>>What exactly do you mean by that?

From the quick scan of the logic, I think all the checks to return the 'correct' address values can be moved into the BIG select with CASE logic.

I'm thinking the entire thing can be one large
insert into ... select ...

Unless you really need the tracking inserts; in that case, one large FOR LOOP with very little actual PL/SQL code.
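Based on the address rules described earlier (ENDORSE first, then INSURED_LOCATION with ADDTYPE 2 preferred over 1), the shape might look something like this. All table names and join columns here are assumptions, not the actual schema:

```sql
-- Sketch only: one INSERT...SELECT with the address preference
-- expressed as a COALESCE over outer-joined sources.
INSERT INTO temp_base_pcov_policy (policy_id, address_line)
SELECT p.policy_id,
       COALESCE(e.street,    -- 1) endorsement address if present
                l2.street,   -- 2) else preferred ADDTYPE 2
                l1.street)   -- 3) else fall back to ADDTYPE 1
FROM   source_policy p
LEFT   JOIN endorse          e  ON e.policy_id  = p.policy_id
LEFT   JOIN insured_location l2 ON l2.policy_id = p.policy_id AND l2.addtype = 2
LEFT   JOIN insured_location l1 ON l1.policy_id = p.policy_id AND l1.addtype = 1;
```

The date checks and validations described in the thread would become additional predicates on the joins or CASE expressions in the select list.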
jknj72 (ASKER)

OK, I was just writing the question. I have my select statement, and I am going to put that after my insert, put what I have in a question, and let you see where I'm at. Any advice you could give me would be great... Will post the question soon... Thanks