Link to home
Start Free TrialLog in
Avatar of Glauron
Glauron

asked on

Activestate PERL hangs

Been having problems with PERL on my Windows Vista dev machine for some time now. It's a nuisance but I can normally get around it. Now I'm finally sick of it.

Can't figure it out, so looking for any outside-of-the-box ideas that might help me find the source of the problem. Of course, if anyone actually knows the solution, even better!

When I run code in a loop, especially loops with higher iterations or intense object oriented flavoured code within its scope, I occasionally have PERL lock up on me. Actually "occasionally" implies inconsistence, but if I run into a jam, it will always jam until I do something about it (limit the iterations for example) which is why I say I can "get around it" and that "it's a nusiance". The same code in production (Solaris 10, SunOS 5.10) runs perfectly, only on my dev machine do I have trouble.

Was running PERL 5.8.8 build 820 but upgraded to 5.8.9 build 825 in an attempt to fix it with no luck. Production PERL is " v5.8.8 built for i86pc-solaris-64" .

So, some symptoms:
- To my knowledge, has only ever hung inside of a loop (except for one possible exception tonight).
- The loops have always been iterating through database results (but then I very rarely work with other loops)
- Browser is eternally waiting for a response (no headers passed back)
- The PERL process stays in task manager, and the 'memory' usage always stops still (no fluctuating) and no CPU time is taken. Memory usage seems to stop anywhere from 20MB to 30MB.

The thing that bugs me though is that it doesn't just die & throw an error, it just gives up & sits there. Tongiht I'm pretty certain it actually made it out of the loop before giving up too, because the database transaction that enclosed the loop was committed safely, unless the peculiarity I'm experiencing has an accomplice, which is MySQL.

Normally I have to kill the PERL process (which then lands me with a "page cannot be displayed" in IE) then limit the database loop to 1 result (or maybe a lot more, depending on the intensity of the enclosed code).

I'm leaning towards thinking it's a memory problem, perhaps a limit defined within the dev environment - Apache or maybe PERL itself (but darned if I can find it)  -. Couldn't tell you why, it just smells of it. Probably because I'm intuitively realising there's a fair bit going on normally. Either looping thousands of times over simple objects and method calls, or (less) times over heavy object usage including looping through child objects thereafter. Somehow I think the overhead of each object is getting to it. But the load is NOT massive. It's a test environment, with a database restricted to a very bare-boned data set a fraction of the size of that seen in production, yet production never skips a beat? Neither environment uses mod_perl.

But I digress...
Any ideas? You're allowed to be (realistically) imaginitive in suggestions.

I've included a completely contrived sampling of code that might experience the problem, soley intended to reveal my approach. Maybe there's a way I could vastly optimise my logic..?

# FLAG FEES TO CHASE UP
# Find all suspended buyers of product with outstanding invoices where they
# have a valid bank account and a record of paying in the past, and keep track
# of the fees on these outstanding invoices <for some reason>
#
use Database;
use WatchList;
use Entity;
 
my $db = new Database();
 
# Try to find fees & add to watchlist (wrap in DB transaction)
$db->startTransaction('Example');
eval
{
	for my $entityId (
			map { $_->[0] }
			@{
				$db->all("select e.id from entity
					inner join role as r on r.entity_id=e.id
					inner join statement as s on s.entity_id=e.id
					where r.type='buyer' and
					s.type='invoice' and s.paid='1'"
				)
			}
		)
	{
		my $buyer = new Entity( id => $entityId );
	
		# Only interested in suspended buyers
		next if (! $buyer->isSuspended() );
	
		# Get buyer's preffered debit account & check that it appears valid
		my $account = $buyer->getPreferredAccount( role => 'debit' );
		next if (! $account->isValid() );
	
		# Find all UNPAID FEES on outstanding INVOICES owned by this buyer
		my @fees;
		for my $statement ( grep { $_->isInvoice() }, $buyer->getOutstandingStatements() )
		{
			push @fees, grep { !$_->isPaid() }, $statement->getFees();
		}
	
		# Create a watchlist attached to this buyer, overseeing fees
		# for which we need to chase payment
		my $watchList = new WatchList( associate => $buyer );
		map { $watchList->add($_) } @fees;
	}
 
	# All done, try to commit changes!
	$db->commit('Example');
};
 
# Catch any errors
if ( $@ )
{
	# Uh oh.
	$db->rollback('Example');
	croak "Could not add fees to watch list: $@";
}
 
# Print results ... etc. ?

Open in new window

Avatar of Adam314
Adam314

Well, if it made it out of the loop, then you aren't showing the code that caused the problem.  

I'm assuming that the Database, WatchList, and Entity are all modules you've developed.  You didn't post the code there, so can't tell what they are doing.

You might try creating a log, and have it log what it's doing continually.  Something like this.  Then when it locks up, look in the log to see exactly where it got.  That might help you determine where it is hanging.

#At the top of your script, open log, unbuffer output to log
open(my $LOG, ">>c:/temp/log.txt") or die "Log: $!\n";
flock($LOG, LOCK_EX);
select((select($LOG), $| = 1)[0]);
 
#Then your normal code, print to log
print $LOG "$$:Creating Database object: begin";
my $db = new Database();
print $LOG "$$:Creating Database object: done\n";
 
...
print $LOG "$$:Getting all entity id's: begin\n";
for my $entityId (
  map { $_->[0] }
    @{
      $db->all("select e.id from entity
                inner join role as r on r.entity_id=e.id
                inner join statement as s on s.entity_id=e.id
                where r.type='buyer' and
                s.type='invoice' and s.paid='1'"
               )
    }
  )
  {
    print $LOG "$$:Processing entity ID $entityId\n";

Open in new window

Avatar of Glauron

ASKER

Nah the code above I made up to kind show the general makeup of a typical loop that might fail. WatchList doesn't exist (although Database & Entity do).

I've already tried logging statements and would even be able to find exactly which iteration would hang, and on which line (or even which line of an object's method called during that loop). There is never anything abnormal about where it would hang. For example, the 'offending' line of code might be a sum, like a $total += $value; or some such. And I've dumped (Data::Dumper) everything currently being looked at just before it hangs & there's nothing strange about anything there either. If Imove the dumping/croaking statement to the other side of the 'offending' line, it never makes it there.

In fact, if it's the 'n'th set of data being looked at that fails, if I go back select only that records or the records around it, it works no problems. If I START from lets say n-10, record 'n' will work, but it will hang further down the line.

"Well, if it made it out of the loop, then you aren't showing the code that caused the problem"
That's pretty much exactly the problem :)
If there are only a handful of records it's looking at, it will throw the expected page up in seconds, but add some more records, and it will hang "somewhere". And by the way, this isn't just any on particular given loop, whenever I'm writing a loop that uses objects or anything even remotely expensive, I worry my dev machine is going to play up (because it usually does).

Anything else you'd like to know that might help give clues, please feel free to ask, & thanks for the help.



Do you ever notice anything else crash/hang?  Or does the dev machine not run much else?  It might be hardware, but that is more likely to give apparently random crashes - not always the same line on the same record.

So, if a particular record can always make this line
    $total += $value;
fail, then it's something about the data associated with that record, and this operation.  Are either $total or $value objects with the addition operator overloaded?

A program using 20-30 MB of memory is not to much, and if you were running out of memory, you'd probably notice everything else on the machine running very slow.

In the case where the $total += $value; caused it to hang, did you confirm it was that exact line?  Or just approximately that line?  When it did hang on that line, was it a particular record that caused it to hang, or a particular index in your loop (eg: was it always record 5, or was it alway the 8th iteration of your loop)?
You could do:
    print $LOG "$$: About to do addition\n";
    $total += $value;
    print $LOG "$$: Done with addition\n";
Using dumper to output some data is a lot more complicated, as the dumper module has logic in it that could be causing a problem.  Printing a single line to a log is pretty simple.  Make sure that if you use a log, it is unbuffered, or syswrite.
Avatar of Glauron

ASKER

Nope no problems with anything except PERL.
In my example of $total += $value, the variables are both normal scalars - no overloading. The exact code that fails really depends on which use case I was trying to run at the time (which loop). Sometimes it really is as simple as the above, sometimes it's a more complex calculation.

& You're right, 20-30MB isn't much at all, and nothing else slows down. It's not an infinite loop.

It's not any given record, if I identify that it hung on the 10th item, but then started the records from number 9 on, number 10 would work fine & it would hang later instead. I haven't confirmed if it's always after a certain number of iterations (eg. record 19 might hang this time around). I'll try to give that a go later (it's pretty tedious debugging as you can imagine!) - however, I know it is different depending on the loop. One loop might hang after handling roughly a few thousand records, another might hang after literally 5.

I started doing things like the code attached to identify exactly which line PERL was giving up at (See example A & B). If I changed the "100 == $index++" to "99 == $index++", both examples would exit fine. (I would find that index 100 hangs usually by trial & error). If I changed it to "101 == ...", it wouldn't get past index 100 so neither example would exit, it would hang before it got a chance.

I went for this croak debugging method because I lost confidence in PERL printing to the browser or a log file. As you've also suggested, I suspected it was related to buffering (definitely in the case of the browser/STDOUT/Apache anyway).

I've never tried using an unbuffered log. Any examples on how to set that up? That would probably make tracing things a fair bit easier.

I'm still quite amazed how it just seems to get stuck up in the air with no complaints or errors, and that the same code in production with a dataset many times larger skips through it like nothing & reports back happily. :/

The only errors I'm getting in Apache log files are warning of "Use of uninitialized value in length at ..//Util.pm line 283." - something to look at but shouldn't be a drama (production is in the same boat).

I'll try unbuffered logging later on.



# A - PERL croaks with expected message
my $index = 0;
for my $record (...)
{
	...
	die "Before offending line: $total, $value" if ( 100 == $index++ );
	$total += $value;	# Normal scalars, no overloading
	...
}
 
 
# B - PERL doesn't exit (hangs)
my $index = 0;
for my $record (...)
{
	...
	$total += $value;	# Normal scalars, no overloading
	die "After offending line: $total, $value" if ( 100 == $index++ );
	...
}

Open in new window

The code I posted originally turns buffering off using the autoflush variable.  There are several methods, as the links you found demonstrate.
Avatar of Glauron

ASKER

Aah great thanks. Will give it a shot.
Had another IT guy drop into the office this afternoon so we had a chat about it. He's a programmer, but not perl.

He suggested it might actually have something to do with the uninitialised values - if Activestate handles uninitialised stuff differently to the unix build of perl, it might be causing the discrepency. For example, unix-perl might clear up the memory pointed to by the uninitialized variable, set it to undef and then carry on; wheras Activestate might leave it as is, or set it to 0 etc. There might even be residual data not garbage collected properly - maybe from past iterations?

Still if that were the case, you'd expect it to be less consistent, and to happen outside of loops, or still happen in light loops with a high number of iterations.

Something to think about anyway.

ASKER CERTIFIED SOLUTION
Avatar of Adam314
Adam314

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Glauron

ASKER

Happy days.

I was running a very simple test script which also hung - but since it was maybe a dozen or so lines of code long, with only 1 usage of an object etc. I thought I'd take a crack at properly figuring it out finally.

I did indeed have warnings on, and I turned off the print buffer with a "local $| = 1;". This let me see that it stopped after exactly 424 iterations. I cut out lines of code until it ran all the way through without a hitch. But still couldn't tell what was stopping it exactly, so I thought about the uninitialized values thing & figured It'd redirect STDERR to STDOUT (since I was printing straight ot the browser) to see if it was complaining about uninitialized stuff.

Lo and behold, as soon as I redirected warnings like so:
BEGIN { *STDERR = \*STDOUT };
it zipped straight through the loop. It printed a "Use of uninitialized value" warning on every line, but it ran through all the code with no problems besides that.

Seems to me that Apache (or something) didn't want to handle large volumes of warnings (... even though "large volumes" would only be 424 warnings...) and decided to give up.

So now I finally have a solid lead! Thanks tonnes Adam, I should be right from here.
Will have to spend some time giong through code & making sure I initialize stuff properly, and see if I can find a way to make Apache happy.

Avatar of Glauron

ASKER

Put me on the right track to find the source of the problems. See below.
Avatar of Glauron

ASKER

FYI:

It looks like Apache disliked loads of warnings (stemming from unitialized values in this case). Setting autoflush on ($| = 1) and redirecting warnings to print to STDOUT instead of STDERR (BEGIN { *STDERR = \*STDOUT };)  allowed the program to complete (assumably because Apache was no longer having to deal with the warnings directly).

I'll go figure out the exact cause & post back here when I remember, for the edification of others.  ;)

Avatar of Glauron

ASKER

Yup similar things have been encountered before. Probably very environment/version dependent. Now that I know about it, not a big deal. Plus, it encourages me to be more strict / attentive to my code which isn't a bad thing.

http://www.google.com.au/search?hl=en&q=apache+hangs+after+warnings&meta=&aq=f&oq=

http://mail-archives.apache.org/mod_mbox/httpd-bugs/200308.mbox/%3C20030811180522.1792.qmail@nagoya.betaversion.org%3E