Solved

Matching HTML for replacing with regexp.

Posted on 2009-04-07
Last Modified: 2012-05-06
Hello,

I'm having some trouble doing some specific matching on a string containing a complete HTML page.

The case is as follows: Given specific headings, I am to find those headings and remove them and the text below them.

So far I've got the matching of the headings working nicely. The problem comes when I try to match the text below them: I can't get the match to stop where it should.

My idea is to look for the next <h#> tag and match up to it. However, the match doesn't stop at the *next* tag; it stops at the *last* one, so the script removes a lot more than it should. How do I prevent this?
$needle = '/<h'.$overskrift['Level'].'> <span class="mw-headline">'.str_replace('/', '\/', $overskrift['Heading']).'<\/span><\/h'.$overskrift['Level'].'>.*(<h\d>)/s';

// Example value of $needle: /<h2> <span class="mw-headline">Heading<\/span><\/h2>.*(<h\d>)/s
// Works nicely up till the dot.

$res = preg_replace($needle, "$1", $res);

Question by:Elisas
Accepted Solution

by:
Hube02 earned 250 total points
ID: 24086517
The problem is that quantifiers in the preg functions are greedy and will match as much as they can. You can turn off this greediness by adding a ? after the quantifier:

.*?(<h\d>)

Let me know if this works; if not, we will try a lookahead here.
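
To make the difference concrete, here is a minimal sketch comparing the greedy pattern from the question, the lazy fix, and the lookahead variant mentioned above. The sample page in $html and its heading texts are made up for illustration, and preg_quote() is swapped in for the original str_replace() as an assumption about what the heading text may contain.

```php
<?php
// Hypothetical sample page; the headings and paragraphs are invented for this sketch.
$html = '<h2> <span class="mw-headline">Keep</span></h2><p>a</p>'
      . '<h2> <span class="mw-headline">Drop/me</span></h2><p>b</p>'
      . '<h3> <span class="mw-headline">Next</span></h3><p>c</p>'
      . '<h2> <span class="mw-headline">Last</span></h2><p>d</p>';

// Same shape as $overskrift in the question; the values are assumptions.
$overskrift = ['Level' => 2, 'Heading' => 'Drop/me'];

// preg_quote() escapes the "/" delimiter and any regex metacharacters in the heading.
$head = '<h' . $overskrift['Level'] . '> <span class="mw-headline">'
      . preg_quote($overskrift['Heading'], '/')
      . '<\/span><\/h' . $overskrift['Level'] . '>';

// Greedy: .* runs to the LAST <h\d> on the page, so the "Next" section is lost too.
$greedy = preg_replace('/' . $head . '.*(<h\d>)/s', '$1', $html);

// Lazy: .*? stops at the FIRST <h\d> after the heading.
$lazy = preg_replace('/' . $head . '.*?(<h\d>)/s', '$1', $html);

// Lookahead: the next tag is asserted but not consumed, so no "$1" is needed.
$lookahead = preg_replace('/' . $head . '.*?(?=<h\d>)/s', '', $html);

var_dump(strpos($greedy, 'Next') !== false); // is the "Next" section still there?
var_dump(strpos($lazy, 'Next') !== false);
var_dump($lazy === $lookahead);
```

Note that all three patterns need a following <h#> tag to match at all. If the heading to remove could be the last one on the page, one option is to end the pattern with (?=<h\d>|\z) so the match can also stop at the end of the string.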

Author Closing Comment

by:Elisas
ID: 31567458
Superb. That did the trick.