Solved

Activestate PERL hangs

Posted on 2009-05-20
12
554 Views
Last Modified: 2012-05-07
Been having problems with PERL on my Windows Vista dev machine for some time now. It's a nuisance but I can normally get around it. Now I'm finally sick of it.

Can't figure it out, so looking for any outside-of-the-box ideas that might help me find the source of the problem. Of course, if anyone actually knows the solution, even better!

When I run code in a loop, especially loops with higher iterations or intense object oriented flavoured code within its scope, I occasionally have PERL lock up on me. Actually "occasionally" implies inconsistence, but if I run into a jam, it will always jam until I do something about it (limit the iterations for example) which is why I say I can "get around it" and that "it's a nusiance". The same code in production (Solaris 10, SunOS 5.10) runs perfectly, only on my dev machine do I have trouble.

Was running PERL 5.8.8 build 820 but upgraded to 5.8.9 build 825 in an attempt to fix it with no luck. Production PERL is " v5.8.8 built for i86pc-solaris-64" .

So, some symptoms:
- To my knowledge, has only ever hung inside of a loop (except for one possible exception tonight).
- The loops have always been iterating through database results (but then I very rarely work with other loops)
- Browser is eternally waiting for a response (no headers passed back)
- The PERL process stays in task manager, and the 'memory' usage always stops still (no fluctuating) and no CPU time is taken. Memory usage seems to stop anywhere from 20MB to 30MB.

The thing that bugs me though is that it doesn't just die & throw an error, it just gives up & sits there. Tongiht I'm pretty certain it actually made it out of the loop before giving up too, because the database transaction that enclosed the loop was committed safely, unless the peculiarity I'm experiencing has an accomplice, which is MySQL.

Normally I have to kill the PERL process (which then lands me with a "page cannot be displayed" in IE) then limit the database loop to 1 result (or maybe a lot more, depending on the intensity of the enclosed code).

I'm leaning towards thinking it's a memory problem, perhaps a limit defined within the dev environment - Apache or maybe PERL itself (but darned if I can find it)  -. Couldn't tell you why, it just smells of it. Probably because I'm intuitively realising there's a fair bit going on normally. Either looping thousands of times over simple objects and method calls, or (less) times over heavy object usage including looping through child objects thereafter. Somehow I think the overhead of each object is getting to it. But the load is NOT massive. It's a test environment, with a database restricted to a very bare-boned data set a fraction of the size of that seen in production, yet production never skips a beat? Neither environment uses mod_perl.

But I digress...
Any ideas? You're allowed to be (realistically) imaginitive in suggestions.

I've included a completely contrived sampling of code that might experience the problem, soley intended to reveal my approach. Maybe there's a way I could vastly optimise my logic..?

# FLAG FEES TO CHASE UP

# Find all suspended buyers of product with outstanding invoices where they

# have a valid bank account and a record of paying in the past, and keep track

# of the fees on these outstanding invoices <for some reason>

#

use Database;

use WatchList;

use Entity;
 

my $db = new Database();
 

# Try to find fees & add to watchlist (wrap in DB transaction)

$db->startTransaction('Example');

eval

{

	for my $entityId (

			map { $_->[0] }

			@{

				$db->all("select e.id from entity

					inner join role as r on r.entity_id=e.id

					inner join statement as s on s.entity_id=e.id

					where r.type='buyer' and

					s.type='invoice' and s.paid='1'"

				)

			}

		)

	{

		my $buyer = new Entity( id => $entityId );

	

		# Only interested in suspended buyers

		next if (! $buyer->isSuspended() );

	

		# Get buyer's preffered debit account & check that it appears valid

		my $account = $buyer->getPreferredAccount( role => 'debit' );

		next if (! $account->isValid() );

	

		# Find all UNPAID FEES on outstanding INVOICES owned by this buyer

		my @fees;

		for my $statement ( grep { $_->isInvoice() }, $buyer->getOutstandingStatements() )

		{

			push @fees, grep { !$_->isPaid() }, $statement->getFees();

		}

	

		# Create a watchlist attached to this buyer, overseeing fees

		# for which we need to chase payment

		my $watchList = new WatchList( associate => $buyer );

		map { $watchList->add($_) } @fees;

	}
 

	# All done, try to commit changes!

	$db->commit('Example');

};
 

# Catch any errors

if ( $@ )

{

	# Uh oh.

	$db->rollback('Example');

	croak "Could not add fees to watch list: $@";

}
 

# Print results ... etc. ?

Open in new window

0
Comment
Question by:Glauron
  • 8
  • 4
12 Comments
 
LVL 39

Expert Comment

by:Adam314
Comment Utility
Well, if it made it out of the loop, then you aren't showing the code that caused the problem.  

I'm assuming that the Database, WatchList, and Entity are all modules you've developed.  You didn't post the code there, so can't tell what they are doing.

You might try creating a log, and have it log what it's doing continually.  Something like this.  Then when it locks up, look in the log to see exactly where it got.  That might help you determine where it is hanging.

#At the top of your script, open log, unbuffer output to log

open(my $LOG, ">>c:/temp/log.txt") or die "Log: $!\n";

flock($LOG, LOCK_EX);

select((select($LOG), $| = 1)[0]);
 

#Then your normal code, print to log

print $LOG "$$:Creating Database object: begin";

my $db = new Database();

print $LOG "$$:Creating Database object: done\n";
 

...

print $LOG "$$:Getting all entity id's: begin\n";

for my $entityId (

  map { $_->[0] }

    @{

      $db->all("select e.id from entity

                inner join role as r on r.entity_id=e.id

                inner join statement as s on s.entity_id=e.id

                where r.type='buyer' and

                s.type='invoice' and s.paid='1'"

               )

    }

  )

  {

    print $LOG "$$:Processing entity ID $entityId\n";

Open in new window

0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
Nah the code above I made up to kind show the general makeup of a typical loop that might fail. WatchList doesn't exist (although Database & Entity do).

I've already tried logging statements and would even be able to find exactly which iteration would hang, and on which line (or even which line of an object's method called during that loop). There is never anything abnormal about where it would hang. For example, the 'offending' line of code might be a sum, like a $total += $value; or some such. And I've dumped (Data::Dumper) everything currently being looked at just before it hangs & there's nothing strange about anything there either. If Imove the dumping/croaking statement to the other side of the 'offending' line, it never makes it there.

In fact, if it's the 'n'th set of data being looked at that fails, if I go back select only that records or the records around it, it works no problems. If I START from lets say n-10, record 'n' will work, but it will hang further down the line.

"Well, if it made it out of the loop, then you aren't showing the code that caused the problem"
That's pretty much exactly the problem :)
If there are only a handful of records it's looking at, it will throw the expected page up in seconds, but add some more records, and it will hang "somewhere". And by the way, this isn't just any on particular given loop, whenever I'm writing a loop that uses objects or anything even remotely expensive, I worry my dev machine is going to play up (because it usually does).

Anything else you'd like to know that might help give clues, please feel free to ask, & thanks for the help.



0
 
LVL 39

Expert Comment

by:Adam314
Comment Utility
Do you ever notice anything else crash/hang?  Or does the dev machine not run much else?  It might be hardware, but that is more likely to give apparently random crashes - not always the same line on the same record.

So, if a particular record can always make this line
    $total += $value;
fail, then it's something about the data associated with that record, and this operation.  Are either $total or $value objects with the addition operator overloaded?

A program using 20-30 MB of memory is not to much, and if you were running out of memory, you'd probably notice everything else on the machine running very slow.

In the case where the $total += $value; caused it to hang, did you confirm it was that exact line?  Or just approximately that line?  When it did hang on that line, was it a particular record that caused it to hang, or a particular index in your loop (eg: was it always record 5, or was it alway the 8th iteration of your loop)?
You could do:
    print $LOG "$$: About to do addition\n";
    $total += $value;
    print $LOG "$$: Done with addition\n";
Using dumper to output some data is a lot more complicated, as the dumper module has logic in it that could be causing a problem.  Printing a single line to a log is pretty simple.  Make sure that if you use a log, it is unbuffered, or syswrite.
0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
Nope no problems with anything except PERL.
In my example of $total += $value, the variables are both normal scalars - no overloading. The exact code that fails really depends on which use case I was trying to run at the time (which loop). Sometimes it really is as simple as the above, sometimes it's a more complex calculation.

& You're right, 20-30MB isn't much at all, and nothing else slows down. It's not an infinite loop.

It's not any given record, if I identify that it hung on the 10th item, but then started the records from number 9 on, number 10 would work fine & it would hang later instead. I haven't confirmed if it's always after a certain number of iterations (eg. record 19 might hang this time around). I'll try to give that a go later (it's pretty tedious debugging as you can imagine!) - however, I know it is different depending on the loop. One loop might hang after handling roughly a few thousand records, another might hang after literally 5.

I started doing things like the code attached to identify exactly which line PERL was giving up at (See example A & B). If I changed the "100 == $index++" to "99 == $index++", both examples would exit fine. (I would find that index 100 hangs usually by trial & error). If I changed it to "101 == ...", it wouldn't get past index 100 so neither example would exit, it would hang before it got a chance.

I went for this croak debugging method because I lost confidence in PERL printing to the browser or a log file. As you've also suggested, I suspected it was related to buffering (definitely in the case of the browser/STDOUT/Apache anyway).

I've never tried using an unbuffered log. Any examples on how to set that up? That would probably make tracing things a fair bit easier.

I'm still quite amazed how it just seems to get stuck up in the air with no complaints or errors, and that the same code in production with a dataset many times larger skips through it like nothing & reports back happily. :/

The only errors I'm getting in Apache log files are warning of "Use of uninitialized value in length at ..//Util.pm line 283." - something to look at but shouldn't be a drama (production is in the same boat).

I'll try unbuffered logging later on.




# A - PERL croaks with expected message

my $index = 0;

for my $record (...)

{

	...

	die "Before offending line: $total, $value" if ( 100 == $index++ );

	$total += $value;	# Normal scalars, no overloading

	...

}
 
 

# B - PERL doesn't exit (hangs)

my $index = 0;

for my $record (...)

{

	...

	$total += $value;	# Normal scalars, no overloading

	die "After offending line: $total, $value" if ( 100 == $index++ );

	...

}

Open in new window

0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
0
 
LVL 39

Expert Comment

by:Adam314
Comment Utility
The code I posted originally turns buffering off using the autoflush variable.  There are several methods, as the links you found demonstrate.
0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 
LVL 1

Author Comment

by:Glauron
Comment Utility
Aah great thanks. Will give it a shot.
Had another IT guy drop into the office this afternoon so we had a chat about it. He's a programmer, but not perl.

He suggested it might actually have something to do with the uninitialised values - if Activestate handles uninitialised stuff differently to the unix build of perl, it might be causing the discrepency. For example, unix-perl might clear up the memory pointed to by the uninitialized variable, set it to undef and then carry on; wheras Activestate might leave it as is, or set it to 0 etc. There might even be residual data not garbage collected properly - maybe from past iterations?

Still if that were the case, you'd expect it to be less consistent, and to happen outside of loops, or still happen in light loops with a high number of iterations.

Something to think about anyway.

0
 
LVL 39

Accepted Solution

by:
Adam314 earned 125 total points
Comment Utility
Variables that are uninitialized in perl will have the value undef.  If you have warnings turned on (with either the "use warnings" pragma, or the -w command line option), perl will give you a warning whenever you use an uninitialized variable.

When you use these, perl will assume a value of 0 in math operations, and a value of empty string for string operations.  Using an uninitialized variable should not cause a hang-up.

Using a log with no buffering should help narrow down exactly which line is causing the problem.  
0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
Happy days.

I was running a very simple test script which also hung - but since it was maybe a dozen or so lines of code long, with only 1 usage of an object etc. I thought I'd take a crack at properly figuring it out finally.

I did indeed have warnings on, and I turned off the print buffer with a "local $| = 1;". This let me see that it stopped after exactly 424 iterations. I cut out lines of code until it ran all the way through without a hitch. But still couldn't tell what was stopping it exactly, so I thought about the uninitialized values thing & figured It'd redirect STDERR to STDOUT (since I was printing straight ot the browser) to see if it was complaining about uninitialized stuff.

Lo and behold, as soon as I redirected warnings like so:
BEGIN { *STDERR = \*STDOUT };
it zipped straight through the loop. It printed a "Use of uninitialized value" warning on every line, but it ran through all the code with no problems besides that.

Seems to me that Apache (or something) didn't want to handle large volumes of warnings (... even though "large volumes" would only be 424 warnings...) and decided to give up.

So now I finally have a solid lead! Thanks tonnes Adam, I should be right from here.
Will have to spend some time giong through code & making sure I initialize stuff properly, and see if I can find a way to make Apache happy.

0
 
LVL 1

Author Closing Comment

by:Glauron
Comment Utility
Put me on the right track to find the source of the problems. See below.
0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
FYI:

It looks like Apache disliked loads of warnings (stemming from unitialized values in this case). Setting autoflush on ($| = 1) and redirecting warnings to print to STDOUT instead of STDERR (BEGIN { *STDERR = \*STDOUT };)  allowed the program to complete (assumably because Apache was no longer having to deal with the warnings directly).

I'll go figure out the exact cause & post back here when I remember, for the edification of others.  ;)

0
 
LVL 1

Author Comment

by:Glauron
Comment Utility
Yup similar things have been encountered before. Probably very environment/version dependent. Now that I know about it, not a big deal. Plus, it encourages me to be more strict / attentive to my code which isn't a bad thing.

http://www.google.com.au/search?hl=en&q=apache+hangs+after+warnings&meta=&aq=f&oq=

http://mail-archives.apache.org/mod_mbox/httpd-bugs/200308.mbox/%3C20030811180522.1792.qmail@nagoya.betaversion.org%3E

0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

I've just discovered very important differences between Windows an Unix formats in Perl,at least 5.xx.. MOST IMPORTANT: Use Unix file format while saving Your script. otherwise it will have ^M s or smth likely weird in the EOL, Then DO NOT use m…
I have been pestered over the years to produce and distribute regular data extracts, and often the request have explicitly requested the data be emailed as an Excel attachement; specifically Excel, as it appears: CSV files confuse (no Red or Green h…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
This demo shows you how to set up the containerized NetScaler CPX with NetScaler Management and Analytics System in a non-routable Mesos/Marathon environment for use with Micro-Services applications.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now