Solved

ASP.Net/C# - Regex question on HTML tag strip

Posted on 2007-03-19
Last Modified: 2008-02-01
Hello all.  Right now I am stripping out any font tags from a string that stores HTML.  It looks like I also need to deal with spans, because they can carry Font-Size, Font-Family, etc.  I am thinking I might have to strip out the entire span tag.  How can I extend my Regex to strip out any span tags as well?  Also, please mention any other ways that font settings might come up in the HTML itself (not in a style sheet).  Here is the current function I use to strip the font tags:

public static System.String StripFontHtml(System.String Html)
{
    Regex ex = new Regex("</?FONT[^<>]+>");

    Match RegMatch = ex.Match(Html);

    while (RegMatch.ToString() != "")
    {
        Html = Html.Replace(RegMatch.ToString(), "");
        RegMatch = RegMatch.NextMatch();
    }

    return Html;
}

It might also be better to just strip out the style="" attribute completely, if that is possible.  The goal is not to strip all HTML, because I want to allow paragraphs, breaks, etc., but to strip out any font attributes.

<SPAN style="FONT-SIZE: 12pt; FONT-FAMILY: 'Times New Roman'"></SPAN>
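For the idea of removing only the style attribute while keeping the tag itself, a minimal sketch follows. The class and method names (StyleAttributeStripper, StripStyleAttributes) are hypothetical and not from the thread. It assumes tags contain no ">" inside attribute values, and that "style=" does not appear inside another attribute's value; real-world HTML can break both assumptions, in which case an HTML parser is the safer tool.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class StyleAttributeStripper
{
    // Any tag: "<", then one or more characters that are not angle brackets, then ">".
    private static readonly Regex Tag = new Regex(@"<[^<>]+>");

    // A style attribute with its leading whitespace. The value may be
    // double-quoted, single-quoted, or unquoted.
    private static readonly Regex StyleAttribute = new Regex(
        @"\s+style\s*=\s*(?:""[^""]*""|'[^']*'|[^\s>]+)",
        RegexOptions.IgnoreCase);

    public static string StripStyleAttributes(string html)
    {
        // Only text inside a tag is rewritten, so a literal "style=" in the
        // page's visible text is left alone.
        return Tag.Replace(html, m => StyleAttribute.Replace(m.Value, ""));
    }
}
```

With this sketch, the sample span above would become <SPAN></SPAN>: the tag survives, but its font settings are gone. Paragraphs, breaks, and any other attributes are left untouched.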
Question by:sbornstein2
7 Comments
 
LVL 22

Expert Comment

by:DarkoLord
ID: 18750557
Hi, this regex matches span tags that contain a "style" attribute. It returns the start tag, the contents, and the end tag, so you should easily be able to replace the start and end tags:


(<[^>]*?span[^>]*?(?:style)[^>]*>)((?:.*?(?:<[ \r\t]*span[^>]*>?.*?(?:<.*?/.*?span.*?>)?)*)*)(<[^>]*?/[^>]*?span[^>]*?>)
 

Author Comment

by:sbornstein2
ID: 18750770
So I can just add that to my function such as:

public static System.String StripFontHtml(System.String Html)
{
    Regex ex = new Regex("</?FONT[^<>]+>");

    Match RegMatch = ex.Match(Html);

    while (RegMatch.ToString() != "")
    {
        Html = Html.Replace(RegMatch.ToString(), "");
        RegMatch = RegMatch.NextMatch();
    }

    Regex ex = new Regex("<[^>]*?span[^>]*?(?:style)[^>]*>)((?:.*?(?:<[ \r\t]*span[^>]*>?.*?(?:<.*?/.*?span.*?>)?)*)*)(<[^>]*?/[^>]*?span[^>]*?>");

    Match RegMatch = ex.Match(Html);

    while (RegMatch.ToString() != "")
    {
        Html = Html.Replace(RegMatch.ToString(), "");
        RegMatch = RegMatch.NextMatch();
    }
    return Html;
}

Does this look like it will work?
 

Author Comment

by:sbornstein2
ID: 18750774
I am wondering if I can place it all together for better performance?
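One way to put it all together is a single pattern with an alternation, applied once with Regex.Replace instead of a Match/NextMatch loop. The sketch below is an assumption-laden illustration, not the accepted answer: the names FontMarkupStripper and StripFontAndSpanTags are hypothetical, and it removes every font and span tag (opening and closing) while keeping the text between them. It uses [^<>]* rather than the [^<>]+ of the original function, because + requires at least one character after the tag name and therefore never matches a bare </FONT> or <FONT>. Whether one pass is measurably faster would need profiling.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class FontMarkupStripper
{
    // "</?"      optional slash, so opening and closing tags both match
    // "\b"       stops "font" from matching a longer tag name
    // "[^<>]*"   zero or more attribute characters, so "</FONT>" matches too
    private static readonly Regex FontOrSpanTag = new Regex(
        @"</?(?:font|span)\b[^<>]*>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string StripFontAndSpanTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        // One pass over the input; tag contents are preserved.
        return FontOrSpanTag.Replace(html, string.Empty);
    }
}
```

Because RegexOptions.IgnoreCase is set, <FONT>, <font>, and <Span> are all handled, and tags such as <P> and <BR> pass through unchanged.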
 
LVL 22

Expert Comment

by:DarkoLord
ID: 18750816
No, this one matches the contents also... This one matches start and end tags only:

(<[^>]*?span[^>]*?(?:style)[^>]*>)(?:(?:.*?(?:<[ \r\t]*span[^>]*>?.*?(?:<.*?/.*?span.*?>)?)*)*)(<[^>]*?/[^>]*?span[^>]*?>)
 

Author Comment

by:sbornstein2
ID: 18764403
Sorry for the delay, Dark.  I am confused about what you mean by:
"No, this one matches the contents also... This one matches start and end tags only:"
 
LVL 22

Accepted Solution

by:
DarkoLord earned 2000 total points
ID: 18764638
Sorry for the confusion... The first regex I gave you captures both the tags AND the content, whereas the one in my last post captures only the opening and closing HTML tags (the content sits in a non-capturing group), so the latter is more appropriate for you...
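To make the difference between the whole match and the capture groups concrete, here is a hedged sketch. It uses a deliberately simplified stand-in pattern, not the exact regex posted above, and the names StyledSpanUnwrapper and UnwrapStyledSpans are hypothetical. It assumes styled spans are not nested inside one another; nested spans would need a parser or balancing groups.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original thread.
public static class StyledSpanUnwrapper
{
    // Group 1: opening <span ...> that carries a style attribute
    // Group 2: the contents (lazy, so it stops at the first closing tag)
    // Group 3: the closing </span>
    // The overall match (group 0) always spans all three parts.
    private static readonly Regex StyledSpan = new Regex(
        @"(<span\b[^>]*\sstyle\s*=[^>]*>)(.*?)(</span\s*>)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string UnwrapStyledSpans(string html)
    {
        // "$2" substitutes the contents back in, so only the tags are dropped.
        // Replacing with "" instead would delete the contents as well.
        return StyledSpan.Replace(html, "$2");
    }
}
```

This is why replacing the whole match with an empty string, as in the Match/NextMatch loop, removes the text inside the span too: the tags have to be addressed through their groups, or the contents have to be written back.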
 

Author Comment

by:sbornstein2
ID: 18790169
thanks Dark