Matching HTML for replacing with regexp.

Posted on 2009-04-07
Medium Priority
Last Modified: 2012-05-06

I'm having some trouble doing some specific matching on a string containing a complete HTML page.

The case is as follows: Given specific headings, I am to find those headings and remove them and the text below them.

So far I've got the matching of the headings working nicely. The problem comes when I try to match the text below them: I'm having trouble making the match stop, as it were.

My idea is to look for the next <h#> tag and match up to it. However, the match doesn't stop at the *next* tag; it stops at the *last* one, and thus the script removes a lot more than it should. How do I prevent this?
$needle = '/<h'.$overskrift['Level'].'> <span class="mw-headline">'.str_replace('/', '\/', $overskrift['Heading']).'<\/span><\/h'.$overskrift['Level'].'>.*(<h\d>)/s';
// Example value of $needle: /<h2> <span class="mw-headline">Heading<\/span><\/h2>.*(<h\d>)/s
// Works nicely up till the dot.
$res = preg_replace($needle, "$1", $res);


Question by: Elisas
LVL 18

Accepted Solution

Hube02 earned 1000 total points
ID: 24086517
The problem is that regex quantifiers in the preg functions are greedy by default and will match as much as they can. You can turn off this greediness by adding a ? directly after the quantifier, so that .* becomes .*?
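
The code that originally accompanied this answer is not preserved here, so what follows is only a minimal sketch of the lazy-quantifier change applied to the pattern from the question. The sample page in $res and the values in $overskrift are made up for illustration, and preg_quote() is swapped in for the question's str_replace() escaping as an assumption about what the heading text may contain.

```php
<?php
// Hypothetical sample page using the same heading markup as the question.
$res = '<h2> <span class="mw-headline">Intro</span></h2><p>Keep me.</p>'
     . '<h2> <span class="mw-headline">Heading</span></h2><p>Remove me.</p>'
     . '<h2> <span class="mw-headline">Next</span></h2><p>Keep me too.</p>'
     . '<h3> <span class="mw-headline">Last</span></h3><p>And me.</p>';

// Same shape as the $overskrift array in the question (values are made up).
$overskrift = array('Level' => 2, 'Heading' => 'Heading');

// preg_quote() escapes regex metacharacters in the heading text,
// including the '/' delimiter passed as its second argument.
$heading = preg_quote($overskrift['Heading'], '/');
$level   = (int) $overskrift['Level'];

// .*? is the lazy form of .* and stops at the first <h#> that follows,
// instead of running on to the last one in the page.
$needle = '/<h' . $level . '> <span class="mw-headline">' . $heading
        . '<\/span><\/h' . $level . '>.*?(<h\d>)/s';

// $1 puts the captured opening tag of the next heading back.
$result = preg_replace($needle, '$1', $res);

echo $result, "\n";
```

With the sample input, only the "Heading" section and its paragraph are removed; the sections before and after it stay in place. Note that a heading with no later <h#> tag after it will not match this pattern at all.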


Let me know if this works. If not, we will try a lookahead here.
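
The lookahead was never needed in this thread, but for reference it could look roughly like the sketch below. This is an illustration under assumptions, not code from the answer: the sample string in $page is made up and the heading is hard-coded. The (?=<h\d>) lookahead checks that a heading tag follows without consuming it, so the replacement can be an empty string instead of "$1".

```php
<?php
// Hypothetical sample page; same heading markup as in the question.
$page = '<h2> <span class="mw-headline">Heading</span></h2><p>Remove me.</p>'
      . '<h2> <span class="mw-headline">Next</span></h2><p>Keep me.</p>'
      . '<h2> <span class="mw-headline">Last</span></h2><p>Keep me too.</p>';

// .*? expands lazily until the lookahead sees the next <h#> tag.
// The tag itself is not part of the match, so nothing has to be put back.
$pattern = '/<h2> <span class="mw-headline">Heading<\/span><\/h2>.*?(?=<h\d>)/s';

$cleaned = preg_replace($pattern, '', $page);

echo $cleaned, "\n";
```

The quantifier still has to be lazy here: with a greedy .* the lookahead would be satisfied at the last heading tag, which is the same over-matching the question describes.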

Author Closing Comment

ID: 31567458
Superb. That did the trick.
