Performance issue in DBI


I'm using perl to extract a lot of data from an oracle database and printing this to a file. I'm using the DBI library to connect to oracle.
My sql in the prepare statement is a join between 4 tables (where all access is made on indexes using index hinting where appropriate), and returns approx. 45K of rows (about 1/3 of total data). I have run this sql in sqlpluss where it returns my result within reasonable time - approx 2 min.

If i implement this into perl i get the following results:
The execute() statement takes about 10 minutes to finish which is slower than above.
But my biggest problem is that when i loop through the results it starts to hang. I started to print the times of every 1k rows and realised it never got that far. I then printed every row. It did about 25 - 30 rows more or less instantly and then it stalls.
I have tried to use fetchrow_hashref() and fetchrow_array() (don't know if there is a performance gain between the two??) and i have tried to both manipulate the data on the fly and also store them locally (both in hash and array) before proceeding but with the same results.

I have used the same approach before with large amounts of data (mind you only on 1 table access, possibly only a join between 2 tables).

Does anyone know why this is happening and preferably a solution?


KjartanM-SDomain ArchitectAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

show us some perl code exhibiting the problem. I routinely do large data transfers with complex joins, and never run into any unusual performance loss.
KjartanM-SDomain ArchitectAuthor Commented:
sorry for the delay....

Here is my statement handler:
$customer_info_nodate_sth = $dbh->prepare(q{
                        /*+ INDEX(ad ACCOUNTDETAILS_PK) */
            from   account a,
                  accountdetails ad,
                  contactdetails cd,
                  accountattributes aa,
                  contact c
                  aa.account_type in (2, 4, 6, 8, 90, 99)
            and   a.account_num = aa.account_num
            and   ad.account_num = a.account_num
            and   ad.start_dat < sysdate
            and   (ad.end_dat is null or ad.end_dat > sysdate)
            and   c.customer_ref = a.customer_ref
            and   c.contact_seq = ad.billing_contact_seq
            and   cd.customer_ref = a.customer_ref
            and   cd.contact_seq = ad.billing_contact_seq
            and   cd.start_dat < sysdate
            and   (cd.end_dat is null or cd.end_dat > sysdate)

Here are some of the methods that are involved:

sub process {
print localtime()." INFO: SQL er executed\n";

sub process_task {
                my $info_sth = shift;      
      my $type = "I";
      print localtime()." INFO: Now starting to process all records found\n";
      while(my $curr_update = $info_sth->fetchrow_hashref())
            my $account_num = $curr_update->{ACCOUNT_NUM};
            my $customer_ref = $curr_update->{CUSTOMER_REF};
            my $acc_status = $curr_update->{ACCOUNT_STATUS};
            my $account_type = $curr_update->{ACCOUNT_TYPE};
            my $in_use = $curr_update->{IN_USE};
            my $address_seq = $curr_update->{ADDRESS_SEQ};
            my $contact_seq = $curr_update->{CONTACT_SEQ};
            my $first_name = $curr_update->{FIRST_NAME};
            my $last_name = $curr_update->{LAST_NAME};
            my ($address1, $address2, $zipcode, $address4, $country) =
                  get_additional_info($curr_update->{CUSTOMER_REF}, $curr_update->{ADDRESS_SEQ});
                  $params{RECORD_TYPE_CUSTOMER}, $account_num, $customer_ref, $acc_status, $account_type, $in_use, $contact_seq,
                  $first_name, $last_name, $address1, $address2, $zipcode, $address4, $country, $type

sub get_additional_info {
# another select to get some more info

sub add_record {
#print the line to a file and count the number of lines printed

I've had a couple of variations of process_task here, where i have used the while loop to read into a local hash and then looped over that and fetched the remaining information, etc. I have also tried to splitt the statement into two SQLs, so that i only join on two tables for each sql. The second sql then perfoming a lookup on indexed values only. I don't know which is the best way of doing things. Having one SQL to fetch all the data or one to fetch all ACCOUNTs and then execute 45k of sql after that to get remaining information?


KjartanM-SDomain ArchitectAuthor Commented:
PS: I noticed the duplicate aa.account_type reference in the select statement. It is a copy & past error to the was not part of the sql that i have executed.
Cloud Class® Course: C++ 11 Fundamentals

This course will introduce you to C++ 11 and teach you about syntax fundamentals.

I  doubt the problem is with DBI.  Do you have indexes on all of the columns?  You might try an explain to check on that.  If you've eliminated that as a problem you may want to switch to fetchrow_array which is faster than fetchrow_hashref.  Its far less convenient or resilient to databas e changes so unless you have to go that way its better to stick with your existing code.
KjartanM-SDomain ArchitectAuthor Commented:
The only column that is not indexed and is a limiting factor on the select is aa.account_type (aa.account_num is indexed, but...). This results in a full table scan in this table. This cannot be helped, unfortunately, but i often use that column as a criteria in my sqls and it has always performed well before.

I have also tried fetchrow_array() in my implementation. My problem seem to be that it iterate throught the first 30 records then it all perl at least....I can see that the oracle process on the server is running for full power.

I have noticed something similar in TOAD (if anyone is familiar with this - sqlpluss tool). After executing an sql it displayes the 30 first results very quickly, but if you try to view the whole result set then it takes time to fetch it......but usually no more than a couple of minutes depending on the size of the resultset.

The SQL may be the limiting factor.  Oracle can be a real pig.  You might want to load the data into MySQL and see if your situation improves.

If you're sure its a Perl issue, we need more of an idea of where the code is chewing up time.  There are several profiling tools available (do a search on  Try one of those and post the results.  :)

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
KjartanM-SDomain ArchitectAuthor Commented:
I don't have the possibility to load the data into another DB.

I tried to use a profiler, but with no luck. It kept bugging me with unloadable objects in a module.......

However....this is what i got as output from running the script:
Thu Nov 11 13:23:20 2004 Export file /home/test4/external/of_export_out/CUS_20041111_1323_TMS opened
Thu Nov 11 13:23:20 2004 INFO: Script running for first time
Thu Nov 11 13:36:39 2004 INFO: SQL er executed
Thu Nov 11 13:36:39 2004 INFO: Now starting to process all records found
Thu Nov 11 13:36:39 2004 INFO: Copy 40000000003 to local storing
... (44 records)
Thu Nov 11 13:36:39 2004 INFO: Copy 40000000062 to local storing

I have tried to let it run for about 24 hrs without it proceeding from this point. This is however from the version where i tried to copy all the data into a local hash before fetching addintional information.

It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.