Solved

Regular expression to find empty <p> tags using PHP

Posted on 2012-03-17
Last Modified: 2012-03-18
Hello,

I've never been very good with regular expressions... maybe someone can help. I need to find all instances of empty <p></p> tags in my content and replace them with <p>&nbsp;</p> so they actually render something (the style in the CSS file has margin:0; padding:0 for p, and with nothing in them they don't even produce a line break).

In that easy case, I'm using str_ireplace("<p></p>","<p>&nbsp;</p>",$data);

The problem comes in when there is an empty <p> tag that includes some styling info that's coming from the TinyMCE editor, such as

<p style="text-align: center;"></p>

I still need to be able to find these.  The other problem is there may be white space in between the tags such as:

<p style="text-align: center;">    </p>

which also don't display anything in the browser.

So, my goal would be to find all <p tags with any number of additional attributes on the tag, followed by a closing bracket > followed by any number of whitespace characters (or at least spaces) followed by a closing </p> tag and replace with the same opening tag, &nbsp;</p>.

For example:
<p></p> ==> <p>&nbsp;</p>
<p>  </p> ==> <p>&nbsp;</p>
<p style="text-align: center;"></p> ==> <p style="text-align: center;">&nbsp;</p>

There may be "n" number of these situations in any data and they will be dispersed throughout the data as this is the content portion of a webpage, so there may be lots of paragraphs within a full page.

Can anybody help with this?
Question by:garyhoffmann

Expert Comment

by:Dave Baldwin
ID: 37733803
You don't need regex. You have two conditions and neither requires you to locate the opening tag. Do a replace instead for '></p>' and '> </p>'. In the second one, even if it is part of a longer string, it doesn't matter because you are replacing a space with a different space.
// str_ireplace() returns the modified string, so assign the result back to $data.
$data = str_ireplace("></p>",">&nbsp;</p>",$data);
$data = str_ireplace("> </p>",">&nbsp;</p>",$data);


That should take care of both conditions.

Accepted Solution

by:
Terry Woods earned 2000 total points
ID: 37734028
@DaveBaldwin, that would also add non-breaking spaces after other tags, like this:

<p><img src="foo.jpg"></p>
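A quick sketch of that side effect, applying the two plain-string replacements from the previous comment to some sample input (the `$data` value here is made up for illustration):

```php
<?php
// Hypothetical sample content: one empty paragraph and one paragraph holding only an image.
$data = '<p></p><p><img src="foo.jpg"></p>';

// The two plain-string replacements suggested in the previous comment.
$data = str_ireplace('></p>', '>&nbsp;</p>', $data);
$data = str_ireplace('> </p>', '>&nbsp;</p>', $data);

// The empty paragraph is fixed, but the image paragraph also gains an &nbsp;.
echo $data, PHP_EOL;
// <p>&nbsp;</p><p><img src="foo.jpg">&nbsp;</p>
```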

My view is that a regex is justified. I'd suggest:

$data = preg_replace("#(<p[^>]*>)\s*(</p>)#i","$1&nbsp;$2",$data);
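As a rough check, here is that pattern run against the three examples from the question plus the `<img>` case; the sample string and the `$result` variable are only for illustration:

```php
<?php
// Sample content built from the examples in the question, plus a paragraph containing an image.
$data = '<p></p>'
      . '<p>  </p>'
      . '<p style="text-align: center;"></p>'
      . '<p><img src="foo.jpg"></p>';

// Group 1 is the opening <p ...> tag, group 2 is the closing </p>;
// only whitespace (or nothing) is allowed between them.
$result = preg_replace('#(<p[^>]*>)\s*(</p>)#i', '$1&nbsp;$2', $data);

echo $result, PHP_EOL;
// <p>&nbsp;</p><p>&nbsp;</p><p style="text-align: center;">&nbsp;</p><p><img src="foo.jpg"></p>
```

The image paragraph is left alone because the `<img>` tag sits between the opening and closing tags, so `\s*` cannot bridge them.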


Author Comment

by:garyhoffmann
ID: 37734143
@TerryAtOpus - this seems to be the exact solution and seems to be working perfectly.  I'm wondering if you can do one more thing for me if you have time - would you please explain the regex?  I'm desperately trying to understand regex.

If I understand this correctly, you are using # as the delimiter, then you are testing for <p followed by any number of characters other than > (by the use of [^>]*), followed by a closing bracket >. By surrounding this in parens, you are getting it back as $1.

Then, you are looking for any number of whitespace characters, denoted by \s*, then the closing p tag (</p>), which is also captured, as $2, by the use of parens.

Finally, the "i" after the delimiter at the end says case insensitive.

Do I have this correct?
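One way to check that reading is to look at what each group captures with preg_match(). This is only a sketch; the `$sample` string and the `$m` array are names chosen for the example, and the upper-case tags are there to exercise the "i" flag:

```php
<?php
// Hypothetical input: an empty, styled paragraph written with upper-case tags.
$sample = '<P style="text-align: center;">   </P>';

// Same pattern as in the accepted solution; $m receives the captured groups.
if (preg_match('#(<p[^>]*>)\s*(</p>)#i', $sample, $m)) {
    echo $m[1], PHP_EOL; // <P style="text-align: center;">
    echo $m[2], PHP_EOL; // </P>
}
```

The three spaces matched by `\s*` are outside both sets of parens, so they are not captured and do not appear in the replacement.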

Expert Comment

by:Terry Woods
ID: 37735407
You've got it exactly right. A more common pattern delimiter is the / character, but that's not so good for patterns which have / in them as it means extra escaping.

The * in:
[^>]*
also matches 0 characters, but is greedy so will match as many as possible. There's a good cheat sheet here: http://download-my-brain.wikispaces.com/Computing+-+Regular+Expressions
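Both points can be sketched briefly. The variable names and sample strings below are assumptions for illustration only: the first part shows the same pattern written with / as the delimiter (note the escaped \/ in the closing tag), the second shows [^>]* matching zero characters or as many as it can.

```php
<?php
// Same pattern with two different delimiters; with / the slash in </p> must be escaped.
$hash  = '#(<p[^>]*>)\s*(</p>)#i';
$slash = '/(<p[^>]*>)\s*(<\/p>)/i';

$data = '<p></p><p class="x"> </p>';
$a = preg_replace($hash, '$1&nbsp;$2', $data);
$b = preg_replace($slash, '$1&nbsp;$2', $data);
var_dump($a === $b); // bool(true)

// [^>]* matches zero characters for a bare <p> ...
preg_match('#<p([^>]*)>#i', '<p>', $bare);
var_dump($bare[1]); // string(0) ""

// ... and, being greedy, everything up to the closing > when attributes are present.
preg_match('#<p([^>]*)>#i', '<p class="x">', $attrs);
var_dump($attrs[1]); // string(10) " class="x""
```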

Thanks for the points