Solved

RegEx.Replace c#

Posted on 2006-11-16
Last Modified: 2010-05-18
Hi guys,
I'm trying to parse some HTML without success.
What I have is a string that holds HTML.

I need to remove divs from it, but only if the div contains an image
with a specific src.

For example, this entire div should be replaced with an empty string:
<div><img src="bad.gif"></div>

And this one should stay as is:
<div><img src="good.gif"></div>


What I was trying to do is use the following pattern:
<div.*bad.gif.*</div>

I'm new to regex, so be gentle... (:
Thanks!
Question by:tsabbay
 
LVL 85

Expert Comment

by:ozo
ID: 17964031
<div((?!</div>).)*bad.gif.*?</div>
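A minimal sketch of how this pattern behaves next to the `<div.*bad.gif.*</div>` attempt from the question, assuming both sample divs sit on one line. The class and method names are made up for illustration and are not from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TemperedDotDemo
{
    // Pattern from the question: both .* are greedy, so one match can span several divs.
    public const string NaivePattern = "<div.*bad.gif.*</div>";

    // Pattern from this comment: ((?!</div>).)* consumes one character at a time
    // and refuses to step onto a closing </div>.
    public const string TemperedPattern = "<div((?!</div>).)*bad.gif.*?</div>";

    public static string Strip(string html, string pattern)
    {
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string html = "<div><img src=\"good.gif\"></div><div><img src=\"bad.gif\"></div>";
        Console.WriteLine(Strip(html, NaivePattern));    // prints an empty line
        Console.WriteLine(Strip(html, TemperedPattern)); // prints <div><img src="good.gif"></div>
    }
}
```

The naive pattern starts at the first `<div` and runs to the last `</div>`, so the good.gif div is lost as well. The tempered version cannot reach bad.gif from the first div without crossing a `</div>`, so only the second div matches.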
 
LVL 18

Expert Comment

by:Ravi Singh
ID: 17964119
Hi, this method should strip out the div tags based on the src string you provide:

private string RemoveDivTagsBySrc(string html, string src)
{
      return Regex.Replace(html, "<div[^>]*>(.*?)<img(.*?)src=\"" + src + "\"[^>]*>(.*?)</div>", string.Empty, RegexOptions.IgnoreCase);
}

Usage:

string sampleHtml = "<div><img src=\"bad.gif\"></div>" + "\n" + "<div><img src=\"good.gif\"></div>" + "\n" + "<div><img src=\"bad.gif\"></div>";

string newHtml = this.RemoveDivTagsBySrc(sampleHtml, "bad.gif");

//newHtml string should now only contain the div tag with "good.gif" as the src
 
LVL 85

Accepted Solution

by:
ozo earned 1000 total points
ID: 17964138
Ravi Singh's pattern would also remove the good.gif div when both divs are on the same line. It would remove the entirety of
"<div><img src=\"good.gif\"></div>  <div><img src=\"bad.gif\"></div>"
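The overmatch can be checked with a short sketch. It rebuilds the pattern from the earlier `RemoveDivTagsBySrc` method and inspects what a single match covers; `OvermatchDemo` and `BuildPattern` are hypothetical names used only here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OvermatchDemo
{
    // Builds the pattern from the earlier RemoveDivTagsBySrc method for a given src.
    public static string BuildPattern(string src)
    {
        return "<div[^>]*>(.*?)<img(.*?)src=\"" + src + "\"[^>]*>(.*?)</div>";
    }

    public static void Main()
    {
        string html = "<div><img src=\"good.gif\"></div>  <div><img src=\"bad.gif\"></div>";
        Match m = Regex.Match(html, BuildPattern("bad.gif"), RegexOptions.IgnoreCase);
        // The lazy (.*?) after <img still expands past the first </div>
        // until it reaches src="bad.gif", so the match covers both divs.
        Console.WriteLine(m.Value == html); // prints True
    }
}
```

Lazy quantifiers only prefer the shortest match from a given starting point. They do not stop the engine from extending across a `</div>` when that is the only way to reach bad.gif.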

 

Author Comment

by:tsabbay
ID: 17964201
Hi guys,
thank you all for your replies. While waiting, I wrote a small string search and replace.
I checked both solutions and neither is doing it right, sorry.

benny.
 
LVL 85

Expert Comment

by:ozo
ID: 17964255
What are they not doing right? If the div can include newlines, you should use RegexOptions.Singleline.
Also, bad.gif should really be bad\\.gif or bad[.]gif, so that the dot matches only a literal dot.
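Both suggestions can be combined in a short sketch. `Regex.Escape` is used here in place of hand-writing `bad\\.gif`, which is a substitution on my part rather than something proposed in the thread, and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeAndSinglelineDemo
{
    public static string RemoveDivsBySrc(string html, string src)
    {
        // Regex.Escape turns "bad.gif" into "bad\.gif", so the dot is literal.
        string pattern = "<div((?!</div>).)*" + Regex.Escape(src) + ".*?</div>";
        // Singleline lets "." match "\n", so a div spread over several lines still matches.
        return Regex.Replace(html, pattern, string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        string multiLine = "<div>\n<img src=\"bad.gif\">\n</div>";
        Console.WriteLine(RemoveDivsBySrc(multiLine, "bad.gif").Length); // prints 0

        // "badXgif" would only match if the dot were left unescaped.
        string lookalike = "<div><img src=\"badXgif\"></div>";
        Console.WriteLine(RemoveDivsBySrc(lookalike, "bad.gif") == lookalike); // prints True
    }
}
```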
 
LVL 18

Expert Comment

by:Ravi Singh
ID: 17964273
ozo's right, my regex matches too much and can span several div tags. One way of using his regex is shown below:

(PLEASE ACCEPT THE SOLUTION BY OZO IF THIS WORKS FOR YOU)



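private string RemoveDivTagsBySrc(string html, string src)
{
      // Regex.Escape keeps the "." in the file name from matching any character.
      // The tempered dot ((?!</div>).)* stops the match from crossing a closing </div>.
      return Regex.Replace(html, "<div((?!</div>).)*" + Regex.Escape(src) + ".*?</div>", string.Empty, RegexOptions.IgnoreCase);
}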

Usage:

string sampleHtml = "<div><img src=\"good.gif\"></div><div><img src=\"bad.gif\"></div>";
string newHtml = this.RemoveDivTagsBySrc(sampleHtml, "bad.gif");
 

Author Comment

by:tsabbay
ID: 17965251
I don't know why, but it's not just removing the relevant text. It also removes some other divs' closing tags from the HTML.

Besides, my custom code seems to work much faster than the regex, so I'm dropping the regex approach.

Thank you all for your time!
Credit will be given to ozo.
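The thread never pins down the cause of the unbalanced tags. One scenario that produces them is nested divs: on `<div id="outer"><div><img src="bad.gif"></div><p>keep</p></div>`, the tempered pattern starts at the outer `<div` and stops at the first `</div>`, leaving `<p>keep</p></div>`. The sketch below is one possible tightening, not a fix confirmed by anyone in the thread. It assumes a double-quoted src, no `>` inside attribute values, and that only the innermost div around the image should go; the names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnermostDivRemover
{
    public static string RemoveInnermostDivsBySrc(string html, string src)
    {
        // (?:(?!</?div\b).)*? walks forward but refuses to step onto any
        // opening or closing div tag, so only the innermost div can match.
        string noDivTag = @"(?:(?!</?div\b).)*?";
        string pattern =
            @"<div\b[^>]*>" + noDivTag +
            @"<img\b[^>]*\ssrc=""" + Regex.Escape(src) + @"""[^>]*>" +
            noDivTag + @"</div>";
        return Regex.Replace(html, pattern, string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        string nested = "<div id=\"outer\"><div><img src=\"bad.gif\"></div><p>keep</p></div>";
        Console.WriteLine(RemoveInnermostDivsBySrc(nested, "bad.gif"));
        // prints <div id="outer"><p>keep</p></div>
    }
}
```

If the target div can itself contain further divs, a regex of this shape will not match it, and an HTML parser is likely the safer tool.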
