seanhess


PHP pcre Regular Expression max characters? -- regex, preg_match

PHP's preg_match seems to be failing when the subject has too many characters in it.  

This is my expression:
   preg_match('/(.*<div id="storeArea">\s*)(.*)(\s*<\/div>\s*<!--POST-BODY-START-->.*)/si', $subject, $regs)

And here is the subject:
<html>
<body>
<div id="storeArea">
<div>aaa ... </div>
<div>aaa ... </div>
<div>aaa ... </div>
</div>
<!--POST-BODY-START-->
<!--POST-BODY-END-->
      </body>
</html>

It matches fine on that, but if the "aaa ..." text is made hugely long, it no longer matches. I tested it with 80,000 a's in each line. With only one line (div tag) of a's it matched, but with three lines of a's it did not.

Is there a character limit to preg_match?  Why would it behave like this?  Can I fix the regular expression?

Thanks!

We're sending data to a PHP script, which is supposed to match a regular expression against it.

ASKER CERTIFIED SOLUTION
Steve Bink
seanhess

ASKER

Hmm... I still don't understand why it worked in Perl but not PHP, but I used a split, and then performed a regex only on the second half.  

This seems to work ... I hope we don't run into problems with the second half!
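The exact split used here isn't shown in the thread. As a rough sketch of the same idea, assuming the opening tag and the `<!--POST-BODY-START-->` marker appear literally in the input, the three pieces can be cut out with plain string searches so that no regex has to scan the large body at all. `splitStoreArea` is a hypothetical helper name, and unlike the original pattern it does not check that only whitespace sits between the closing `</div>` and the marker.

```php
<?php
// Hypothetical helper (not from the thread): splits $subject into the three
// pieces the original pattern tried to capture, using string searches only.
function splitStoreArea(string $subject): ?array
{
    $open   = '<div id="storeArea">';
    $marker = '<!--POST-BODY-START-->';

    // Case-insensitive search, mirroring the /i flag on the original pattern.
    $openPos = stripos($subject, $open);
    if ($openPos === false) {
        return null;
    }
    $bodyStart = $openPos + strlen($open);

    $markerPos = stripos($subject, $marker, $bodyStart);
    if ($markerPos === false) {
        return null;
    }

    // Last closing </div> that sits before the marker.
    $beforeMarker = substr($subject, 0, $markerPos);
    $closePos     = strripos($beforeMarker, '</div>');
    if ($closePos === false || $closePos < $bodyStart) {
        return null;
    }

    return [
        'head' => substr($subject, 0, $bodyStart),                      // up to and including the opening tag
        'body' => substr($subject, $bodyStart, $closePos - $bodyStart), // contents of storeArea
        'tail' => substr($subject, $closePos),                          // closing </div>, marker, and the rest
    ];
}
```

Concatenating `head`, `body`, and `tail` gives back the original string, so the body can be replaced and the document reassembled without running a regex over it.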
Steve Bink

Well, PCRE stands for Perl COMPATIBLE Regular Expressions, so I'm guessing it is missing a few features/functions found in the original. I've only taken a brief look at the actual PerlRE docs, and it looks much more painful overall. Perhaps you can write the routine in Perl and call it from PHP?
seanhess

ASKER

That's alright.. I did get it working.  Once I knew the size was the problem, it wasn't hard to fix.

> "Perhaps you can write the routine in Perl and call it from PHP?"
No way... that would be way more work than it is worth.  I'll stick with the split.. it only added one line.
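For anyone hitting the same symptom: `preg_match` returns `false` (not `0`) when PCRE aborts a match, and `preg_last_error()` reports the reason. Below is a minimal sketch for telling the two cases apart. It assumes the failure is one of PCRE's configurable limits, which this thread does not confirm (it only establishes that subject size was the trigger). `matchWithDiagnostics` is a hypothetical helper name.

```php
<?php
// Hypothetical wrapper (not from the thread): runs preg_match and reports
// the PCRE error code alongside the result.
function matchWithDiagnostics(string $pattern, string $subject): array
{
    $matches = [];
    $result  = preg_match($pattern, $subject, $matches);
    $error   = preg_last_error();

    return [
        'result'  => $result,   // 1 = match, 0 = no match, false = error
        'error'   => $error,    // PREG_NO_ERROR when nothing went wrong
        'limited' => $error === PREG_BACKTRACK_LIMIT_ERROR
                  || $error === PREG_RECURSION_LIMIT_ERROR,
        'matches' => $matches,
    ];
}

// The limits are ordinary ini settings and can be inspected at runtime.
$limits = [
    'pcre.backtrack_limit' => ini_get('pcre.backtrack_limit'),
    'pcre.recursion_limit' => ini_get('pcre.recursion_limit'),
];
```

If `limited` comes back true, raising `pcre.backtrack_limit` with `ini_set()` is one option, though pre-splitting the input as above avoids depending on the limit at all.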
Steve Bink

Don't fix what ain't broke, right?  :)  Good luck to you, and thanks for the points!