PHP preg_replace BLANK PAGE (Too many results)...

Posted on 2012-09-02
Last Modified: 2012-09-02
Hi, I use preg_replace, but when there are too many results it behaves unpredictably: sometimes it displays data, sometimes it doesn't.

$tableBody     = preg_replace("#^.*<tbody>(.*?)</tbody>.*$#is", "$1", $htm);

When there are 2000 results, $tableBody comes back empty, even with:

      ini_set('pcre.backtrack_limit', 10000000);

Sometimes it works with the same amount of data (2000 matches, each spanning 9 lines), and sometimes it doesn't. How can I fix this?
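
A quick way to tell whether an empty result comes from the PCRE engine giving up rather than from the data: preg_replace() returns null on failure, and preg_last_error() reports the reason. The sketch below wraps the pattern from the question; the function name extractTbody() is made up for illustration and is not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): run the pattern and
// report a PCRE failure instead of silently producing an empty result.
function extractTbody($htm)
{
    $result = preg_replace('#^.*<tbody>(.*?)</tbody>.*$#is', '$1', $htm);

    if ($result === null) {
        // preg_replace() returns null when the PCRE engine fails.
        if (preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR) {
            throw new RuntimeException('pcre.backtrack_limit was exceeded');
        }
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }

    return $result;
}
```

If this throws on the large input, the limit settings or the pattern are the problem, not the markup.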
Question by:VAL1N

    Expert Comment

    by:Ray Paseur
    Please post the link to the original test data where you've encountered the issue.  There are other ways to scrape this data that do not put so much stress on the regex engine.
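
One such way, sketched here under assumptions since the test data is not shown, is to let PHP's DOM extension parse the page and walk the rows of the tbody. The function name extractRows() is hypothetical, and the example collects the text of each cell rather than its inner HTML, which differs slightly from what the capture groups in a regex would return.

```php
<?php
// Sketch: collect the text of every <td> in each <tbody> row using DOM.
// extractRows() is a hypothetical name; the table layout is assumed.
function extractRows($htm)
{
    $doc = new DOMDocument();
    // Scraped HTML is rarely valid; keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($htm);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = array();
    foreach ($xpath->query('//tbody/tr') as $tr) {
        $cells = array();
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}
```

Each entry of the returned array is one row, so the columns can be read as $row[0], $row[1], and so on, without any backtracking-heavy pattern.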

    Author Comment

    by:VAL1N - DATA.

    $tableBody     = preg_replace("#^.*<tbody>(.*?)</tbody>.*$#is", "$1", $htm);
    preg_match_all("#<tr>(?:(?!</tr>).)*?<td>(.*?)</td>(?:(?!</tr>).)*?<td>(.*?)</td>(?:(?!</tr>).)*?<td>(.*?)</td>(?:(?!</tr>).)*?<td>(.*?)</td>(?:(?!</tr>).)*?<td>(.*?)</td>(?:(?!</tr>).)*?<td>(.*?)</td>#si", $tableBody, $matches);

    foreach ($matches[1] as $num => $value) {
        $tmp_date             = $matches[3][$num];
        $game_master_original = $matches[2][$num];

    Accepted Solution

    The pattern you're using starts with a greedy .*, so it can only ever produce a single match (and a single replacement). However, your test data has only one pair of tbody tags, so that is likely what you want.

    Using this pattern might be more efficient & reliable, and should achieve the same result:

    $tableBody     = preg_replace("#^.*?<tbody>|</tbody>.*$#is", "", $htm);


    However, I'm not particularly experienced with dealing with large quantities of data so there may be an even better way.
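
A possible alternative that avoids the regex engine for this step is to cut the section out with plain string functions. This is only a sketch: sliceTbody() is a hypothetical name, and it assumes the opening tag is a bare <tbody> with no attributes and that only the first pair of tags matters.

```php
<?php
// Sketch: cut out the first <tbody>...</tbody> with string functions only.
// sliceTbody() is a hypothetical name; it assumes a bare "<tbody>" tag
// without attributes, matched case-insensitively like the /i flag does.
function sliceTbody($htm)
{
    $open = '<tbody>';
    $start = stripos($htm, $open);
    if ($start === false) {
        return null; // no opening tag found
    }
    $start += strlen($open);

    $end = stripos($htm, '</tbody>', $start);
    if ($end === false) {
        return null; // no closing tag found
    }
    return substr($htm, $start, $end - $start);
}
```

Because stripos() and substr() scan the string once, the cost does not depend on pcre.backtrack_limit at all.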

    Expert Comment

    by:Ray Paseur
