Solved

Regular Expression to find empty <p> tags using PHP

Posted on 2012-03-17
Last Modified: 2012-03-18
Hello,

I've never been very good with regular expressions... maybe someone can help.  I need to find all instances of empty <p></p> tags in my content and replace them with <p>&nbsp;</p> so they actually render. (The style in the CSS file sets margin:0; padding:0 for p, so with nothing in them they don't even produce a line break.)

In that easy case, I'm using str_ireplace("<p></p>","<p>&nbsp;</p>",$data);

The problem comes in when there is an empty <p> tag that includes some styling info that's coming from the TinyMCE editor, such as

<p style="text-align: center;"></p>

I still need to be able to find these.  The other problem is there may be white space in between the tags such as:

<p style="text-align: center;">    </p>

which also doesn't display anything in the browser.

So, my goal is to find every <p tag with any number of additional attributes on it, followed by a closing bracket >, followed by any number of whitespace characters (or at least spaces), followed by a closing </p> tag, and replace it with the same opening tag followed by &nbsp;</p>.

For example:
<p></p> ==> <p>&nbsp;</p>
<p>  </p> ==> <p>&nbsp;</p>
<p style="text-align: center;"></p> ==> <p style="text-align: center;">&nbsp;</p>

There may be "n" number of these situations in any data and they will be dispersed throughout the data as this is the content portion of a webpage, so there may be lots of paragraphs within a full page.

Can anybody help with this?
Question by:garyhoffmann

Expert Comment

by:Dave Baldwin
ID: 37733803
You don't need regex.  You have two conditions and neither requires you to locate the opening tag.  Do a replace instead for '></p>' and '> </p>'.  In the second one, even if it is part of a longer string, it doesn't matter because you are replacing a space with a different space.
$data = str_ireplace("></p>",">&nbsp;</p>",$data);
$data = str_ireplace("> </p>",">&nbsp;</p>",$data);



That should take care of both conditions.

Accepted Solution

by:
Terry Woods earned 500 total points
ID: 37734028
@DaveBaldwin, that would also add non-breaking spaces after other tags, like this:

<p><img src="foo.jpg"></p>
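A minimal sketch of that side effect, assuming the two str_ireplace calls from the previous comment are applied in order. The helper name replace_before_closing_p and the sample strings are illustrative only, not part of the original data.

```php
<?php
// Hypothetical helper wrapping the two plain-string replacements suggested earlier.
function replace_before_closing_p(string $data): string
{
    $data = str_ireplace('></p>', '>&nbsp;</p>', $data);
    return str_ireplace('> </p>', '>&nbsp;</p>', $data);
}

// An empty paragraph is handled as intended.
$empty = replace_before_closing_p('<p></p>');

// A paragraph that holds only an image is not empty, yet it still
// contains the substring "></p>", so it gains an unwanted &nbsp;.
$image = replace_before_closing_p('<p><img src="foo.jpg"></p>');

echo $empty, "\n", $image, "\n";
```

The plain-string search cannot tell which tag the ">" before "</p>" belongs to, which is the gap the regex below closes.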

My view is that a regex is justified. I'd suggest:

$data = preg_replace('#(<p[^>]*>)\s*(</p>)#i', '$1&nbsp;$2', $data);
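As a quick check, the pattern can be run against the examples from the question. This is only a sketch: the wrapper name fill_empty_paragraphs and the $samples array are made up for illustration.

```php
<?php
// Hypothetical wrapper around the suggested preg_replace call.
function fill_empty_paragraphs(string $data): string
{
    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace('#(<p[^>]*>)\s*(</p>)#i', '$1&nbsp;$2', $data) ?? $data;
}

// Sample inputs taken from the question, plus the image case from this comment.
$samples = [
    '<p></p>',
    '<p>  </p>',
    '<p style="text-align: center;"></p>',
    '<p><img src="foo.jpg"></p>',
];

foreach ($samples as $html) {
    echo $html, ' ==> ', fill_empty_paragraphs($html), "\n";
}
```

The first three inputs gain a &nbsp; and the image paragraph is left untouched, because the only thing allowed between the opening <p ...> and </p> is whitespace.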


Author Comment

by:garyhoffmann
ID: 37734143
@TerryAtOpus - this seems to be the exact solution and it is working perfectly.  I'm wondering if you can do one more thing for me if you have time: would you please explain the regex?  I'm desperately trying to understand regex.

If I understand this correctly, you are using # as the delimiter, then matching <p followed by any number of characters other than > (the [^>]* part), followed by a closing bracket >. Surrounding this in parens returns it as $1.

Then you are looking for any number of whitespace characters, denoted by \s*, followed by the closing p tag (</p>), which the parens capture as $2.

Finally, the "i" after the closing delimiter makes the match case-insensitive.

Do I have this correct?
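The breakdown above can also be written into the pattern itself. This sketch is a variation, not the form posted in the answer: it assumes the x (extended) modifier, which ignores whitespace in the pattern and allows # comments, and it switches the delimiter to ~ so that # is free for those comments. The function name is hypothetical.

```php
<?php
// Hypothetical helper: the same pattern, spread out and commented using the x modifier.
function fill_empty_paragraphs_verbose(string $data): string
{
    $pattern = '~
        (<p[^>]*>)   # group 1: "<p", then any characters except ">", then ">"
        \s*          # zero or more whitespace characters
        (</p>)       # group 2: the closing tag
    ~xi';            // x = extended layout, i = case-insensitive

    // $1 and $2 in the replacement refer to the two captured groups.
    return preg_replace($pattern, '$1&nbsp;$2', $data) ?? $data;
}

echo fill_empty_paragraphs_verbose('<p style="text-align: center;">    </p>'), "\n";
```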

Expert Comment

by:Terry Woods
ID: 37735407
You've got it exactly right. A more common pattern delimiter is the / character, but that's not so good for patterns which have / in them as it means extra escaping.
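To illustrate the delimiter point, here is the same replacement written both ways. The sample input is made up; the two calls should behave identically.

```php
<?php
$html = '<P style="text-align: center;">  </P>';

// With # as the delimiter, the / in </p> needs no escaping.
$hash = preg_replace('#(<p[^>]*>)\s*(</p>)#i', '$1&nbsp;$2', $html);

// With / as the delimiter, the / inside the pattern must be written as \/.
$slash = preg_replace('/(<p[^>]*>)\s*(<\/p>)/i', '$1&nbsp;$2', $html);

var_dump($hash === $slash);
```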

The * in:
[^>]*
also matches 0 characters, but is greedy so will match as many as possible. There's a good cheat sheet here: http://download-my-brain.wikispaces.com/Computing+-+Regular+Expressions
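A small sketch of both properties, using preg_match on made-up inputs. The lazy form [^>]*? is shown only for contrast and is not part of the suggested pattern.

```php
<?php
// [^>]* may match nothing at all: a bare <p> has no attributes to capture.
preg_match('#<p([^>]*)>#i', '<p>', $bare);

// When attributes are present, the same group captures all of them.
preg_match('#<p([^>]*)>#i', '<p style="text-align: center;">', $styled);

// Greedy versus lazy, with nothing after the quantifier to force it onward:
preg_match('#<p[^>]*#', '<p class="x">text', $greedy);   // takes as much as it can
preg_match('#<p[^>]*?#', '<p class="x">text', $lazy);    // takes as little as it can

var_dump($bare[1], $styled[1], $greedy[0], $lazy[0]);
```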

Thanks for the points