Solved

Regex.Replace in C#

Posted on 2006-11-16
Last Modified: 2010-05-18
Hi guys,
I'm trying to parse some HTML, without success.
What I have is a string that holds an HTML document.

I need to remove divs from it, but only if the div contains an image
with a specific src.

For example, this entire div should be replaced with an empty string:
<div><img src="bad.gif"></div>

And this one should stay as is:
<div><img src="good.gif"></div>


What I tried was the following pattern:
<div.*bad.gif.*</div>

I'm new to regex, so be gentle... (:
Thanks!
Question by:tsabbay
LVL 84

Expert Comment

by:ozo
ID: 17964031
<div((?!</div>).)*bad.gif.*?</div>
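A small sketch of how this pattern behaves on the two divs from the question. The class and method names are hypothetical, and the pattern is used exactly as posted (the dot in bad.gif is left unescaped). The `((?!</div>).)*` part consumes one character at a time, but only while the next characters are not `</div>`, so a match cannot start in one div and finish in a later one.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class TemperedDotSketch
{
    public static string RemoveBadDivs(string html)
    {
        // "((?!</div>).)*" matches any run of characters that does not contain "</div>".
        string pattern = @"<div((?!</div>).)*bad.gif.*?</div>";
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

Calling `TemperedDotSketch.RemoveBadDivs("<div><img src=\"good.gif\"></div><div><img src=\"bad.gif\"></div>")` leaves only the good.gif div.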
 
LVL 18

Expert Comment

by:Ravi Singh
ID: 17964119
Hi, this method should strip out the div tags based on the src string you provide:

private string RemoveDivTagsBySrc(string html, string src)
{
      return Regex.Replace(html, "<div[^>]*>(.*?)<img(.*?)src=\"" + src + "\"[^>]*>(.*?)</div>", string.Empty, RegexOptions.IgnoreCase);
}

Usage:

string sampleHtml = "<div><img src=\"bad.gif\"></div>" + "\n" + "<div><img src=\"good.gif\"></div>" + "\n" + "<div><img src=\"bad.gif\"></div>";

string newHtml = this.RemoveDivTagsBySrc(sampleHtml, "bad.gif");

//newHtml string should now only contain the div tag with "good.gif" as the src
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 17964138
That would remove the entirety of
"<div><img src=\"good.gif\"></div>  <div><img src=\"bad.gif\"></div>"
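The over-match can be reproduced with a short sketch (class and method names are hypothetical; the pattern is the one from Ravi Singh's post with src fixed to bad.gif). The lazy `(.*?)` after `<img` is still free to run past `</div>` and into the next div, so the match starts at the good.gif div and ends at the close of the bad.gif div. The original usage sample did not show this because its divs were separated by "\n", which `.` does not match without RegexOptions.Singleline.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class OverMatchSketch
{
    public static string RemoveWithLazyPattern(string html)
    {
        // Lazy quantifiers stop as early as they can, but nothing here stops them at "</div>".
        string pattern = "<div[^>]*>(.*?)<img(.*?)src=\"bad.gif\"[^>]*>(.*?)</div>";
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

With the two divs on one line the result is an empty string; with a newline between the divs, only the bad.gif divs are removed.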

Author Comment

by:tsabbay
ID: 17964201
Hi guys,
Thank you all for your replies. While waiting, I wrote a small string search and replace.
I checked both solutions and neither of them does it right, sorry.

Benny.
 
LVL 84

Expert Comment

by:ozo
ID: 17964255
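What are they not doing right? If the div can include newlines, you should use RegexOptions.Singleline so that . also matches newline characters.
Also, the . in bad.gif matches any character, so it should really be bad\\.gif (in a regular C# string literal) or bad[.]gif.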
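Both suggestions can be combined in a minimal sketch. The class name is hypothetical; Regex.Escape and RegexOptions.Singleline are standard .NET members. Regex.Escape is one way to get the effect of bad\\.gif when the src value arrives as a parameter.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class EscapedSinglelineSketch
{
    public static string RemoveDivsBySrc(string html, string src)
    {
        // Regex.Escape turns "bad.gif" into "bad\.gif", so the dot is a literal dot.
        string pattern = "<div((?!</div>).)*" + Regex.Escape(src) + ".*?</div>";

        // Singleline lets "." match "\n", so a div spread over several lines is still found.
        return Regex.Replace(html, pattern, string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

With this version a div whose image src is "badXgif" is left alone, and a bad.gif div that contains line breaks is still removed.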
 
LVL 18

Expert Comment

by:Ravi Singh
ID: 17964273
ozo's right: my regex lets .*? run across div boundaries, so a match can start at an earlier div and swallow it too. One way of using his regex is shown below:

(PLEASE ACCEPT THE SOLUTION BY OZO IF THIS WORKS FOR YOU)



private string RemoveDivTagsBySrc(string html, string src)
{
      // Regex.Escape makes the "." in src match only a literal dot.
      return Regex.Replace(html, "<div((?!</div>).)*" + Regex.Escape(src) + ".*?</div>", string.Empty, RegexOptions.IgnoreCase);
}

use:

string sampleHtml = "<div><img src=\"good.gif\"></div><div><img src=\"bad.gif\"></div>";
string newHtml = this.RemoveDivTagsBySrc(sampleHtml, "bad.gif");
 

Author Comment

by:tsabbay
ID: 17965251
I don't know why, but it's not removing just the relevant text; it also removes some other divs' closing tags from the HTML.

Besides, my custom code seems to work much faster than the regex, so I'm dropping the regex approach.

Thank you all for your time!
Credit will be given to ozo.
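One possible cause of the unbalanced closing tags, assuming the real page nests divs (the thread does not show the actual HTML): the pattern refuses to cross `</div>` but happily crosses an opening `<div`, so a match can begin at an outer div and end at the first inner `</div>`. The sketch below uses hypothetical names and a made-up input to reproduce that, next to a stricter variant that forbids both `<div` and `</div>` inside the match.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class NestedDivSketch
{
    // Pattern from the thread (with src escaped): may start at an outer <div.
    public static string RemoveLoose(string html, string src)
    {
        string pattern = "<div((?!</div>).)*" + Regex.Escape(src) + ".*?</div>";
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }

    // Stricter variant: no opening or closing div tag may appear inside the match,
    // so only the innermost div that directly holds the image is removed.
    public static string RemoveInnermost(string html, string src)
    {
        string noDivTag = @"((?!</?div\b).)*";
        string pattern = @"<div\b[^>]*>" + noDivTag + Regex.Escape(src) + noDivTag + "</div>";
        return Regex.Replace(html, pattern, string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

For the input `<div class="wrap"><p>text</p><div><img src="bad.gif"></div></div>`, RemoveLoose leaves only a stray `</div>`, while RemoveInnermost leaves the wrapper and its paragraph intact. The stricter variant still cannot remove a div that itself contains other divs; for arbitrarily nested markup an HTML parser is usually the safer tool.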
