Randall-B


preg_replace to Remove Only Certain Rows from HTML Table

An HTML table stored in a variable has a lot of rows like this, and I want to remove only some of them, depending on what is in the spot that says "Some Words Here":

<TR VALIGN="middle" BGCOLOR="#CCEEFF">
<TD NOWRAP>
<FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#000000">Some Words Here</FONT></TD><TD>
<FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#000000">Blah Blah </FONT></TD><TD NOWRAP align=center>
<FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#CCFFCC">1/1/2005<BR>1:00:01 AM</FONT></TD><TD ALIGN="right" NOWRAP>
<FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#000000">1,234</FONT></TD><td width=48 valign=middle><FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#CC0000"><b>&nbsp;&nbsp;<NOBR>0.00</NOBR></b></FONT></td><TD ALIGN="right" NOWRAP> <b>
<FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#000000"><FONT FACE="Arial,Helvetica,sans-serif" SIZE="2"  COLOR="#CC0000">0.00%</b></TD>
</TR>


Let's say one of the rows has "This is nice" instead of "Some words here."  I tried this regex, but it was ignored:


$table = preg_replace('/<TR VALIGN=\"middle\" BGCOLOR=\"#CCEEFF\"><TD NOWRAP>
<FONT FACE=\"Arial\,Helvetica\,sans-serif\" SIZE=\"2\"  COLOR=\"#000000\">This is nice(.*?)
<\/TR>/is', '', $table);

How can I make this work to delete the row that has "This is nice"?

Also, if I have several specific rows to delete (identified by different key words), would it make sense to explode all the rows into an array and then delete the array elements that contain the given keywords? If so, how?
PHP

ASKER CERTIFIED SOLUTION
VoteyDisciple

Randall-B

ASKER

VoteyDisciple,
   For some reason I still can't get it to work.  Please copy and paste the exact variable value and the exact regex that you're using.  Then I'll copy and paste exactly from it, to test.

    Also, I'd appreciate a comment on the second part of the question:  If I am going to be inserting only certain rows into a database (after deleting certain rows identified by different key words), would it make sense to explode all the rows into an array and then delete the array elements that contain the given keywords? If so, how?  Or would that just complicate matters; should I just use regex to remove the rows before exploding into an array? Thanks.
Randall-B

ASKER

Never mind, I got the regex working, but I'd still appreciate your comments on the second part. Thanks.
Randall-B

ASKER

I found that one way to get around the problem of unusual spaces, tab characters, carriage returns, and line breaks in the HTML code is to remove all white space between, before, and after HTML tags at the very beginning, before ever using the regex to remove individual rows.  For example:

FIRST:
   $table = preg_replace('/>\s+</', '><', $table);
   $table = preg_replace('/>\s+/', '>', $table);
   $table = preg_replace('/\s+</', '<', $table);

THEN: Delete the desired row, using a less-specific regex to match only the keyword, like:
 
   $table = preg_replace('/<TR(.*?)This is nice(.*?)<\/TR>/is', '', $table);

Actually, because the regex is so general, the prior removal of all white space between, after, and before the html tags probably is not needed and could slow down the script a little.  
     (Also, one would have to make sure that removing all that white space does not foul up the visible HTML in other rows, depending on the layout of the particular table: e.g., the space between a word and a hyperlink should not be removed if it is needed for the visible page, nor should the space in "<b>this is bold</b> <i>but this is italic</i>".)
     Since VoteyDisciple's answer helped me think about the solution of making the regex much more general, I'm accepting his answer, although I think this latest approach is best.  Thanks.
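
One caveat with the general pattern above: the lazy (.*?) begins at the first <TR in the string, so when earlier rows precede the keyword it can match across them and delete those rows as well. The sketch below is one possible way around that, not the accepted answer (which is not shown in this thread). It uses a "tempered" sub-pattern that refuses to step over a closing </TR>. The sample table is a shortened, hypothetical stand-in for the real data.

```php
<?php
// Hypothetical sample: three rows, only the middle one contains the keyword.
$table = <<<HTML
<TR VALIGN="middle" BGCOLOR="#CCEEFF">
<TD NOWRAP>
<FONT SIZE="2">Some Words Here</FONT></TD>
</TR>
<TR VALIGN="middle" BGCOLOR="#CCEEFF">
<TD NOWRAP>
<FONT SIZE="2">This is nice</FONT></TD>
</TR>
<TR VALIGN="middle" BGCOLOR="#CCEEFF">
<TD NOWRAP>
<FONT SIZE="2">Another Row</FONT></TD>
</TR>
HTML;

// General pattern from the comment above: the match starts at the FIRST <TR,
// so it runs through the first row on its way to the keyword.
$general = preg_replace('/<TR(.*?)This is nice(.*?)<\/TR>/is', '', $table);

// Tempered pattern: (?:(?!<\/TR>).)*? cannot cross a closing </TR>,
// so the match stays inside a single row. The trailing \s* also removes
// the line break left behind by the deleted row.
$keyword  = preg_quote('This is nice', '/');
$tempered = preg_replace(
    '/<TR\b(?:(?!<\/TR>).)*?' . $keyword . '.*?<\/TR>\s*/is',
    '',
    $table
);

var_dump(strpos($general, 'Some Words Here') !== false); // bool(false): first row was deleted too
var_dump(substr_count($tempered, '<TR'));                // int(2): only the keyword row is gone
```

Because the tempered pattern tolerates any white space inside the row, the separate white-space stripping step is not needed for this approach.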
Avatar of Randall-B
Randall-B

ASKER

And, regarding the second part of the question, the way to find and remove an element of an array (based on whether it contains a particular word) is explained at http:Q_20096945.html . It involves array_search() and array_splice().
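
For the "several keywords" case, here is a minimal sketch of the array approach. Note that it swaps in a different technique from the one named above: array_search() compares whole array values, so a "contains this word" test is done here with stripos() inside array_filter() instead. The sample table and the $unwanted list are hypothetical, and the rows are collected with preg_match_all() rather than explode() so that each element is a complete <TR>...</TR> block.

```php
<?php
// Hypothetical sample table and keyword list (neither comes from the original post).
$table = <<<HTML
<TABLE>
<TR><TD>Some Words Here</TD><TD>1,234</TD></TR>
<TR><TD>This is nice</TD><TD>5,678</TD></TR>
<TR><TD>Drop me too</TD><TD>9</TD></TR>
<TR><TD>Keep this one</TD><TD>42</TD></TR>
</TABLE>
HTML;
$unwanted = ['This is nice', 'Drop me too'];

// Collect every <TR>...</TR> block; the lazy .*? stops at the first closing tag.
preg_match_all('/<TR\b.*?<\/TR>/is', $table, $matches);
$rows = $matches[0];

// Keep a row only if it contains none of the unwanted keywords (case-insensitive).
$kept = array_values(array_filter($rows, function ($row) use ($unwanted) {
    foreach ($unwanted as $keyword) {
        if (stripos($row, $keyword) !== false) {
            return false;
        }
    }
    return true;
}));

// $kept now holds only the surviving rows, re-indexed from 0.
print_r($kept);
```

Whether this is better than running preg_replace once per keyword depends on what happens next: if the rows are going to be parsed and inserted into a database one at a time anyway, having them in an array first saves a second pass over the HTML.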