Solved

Need regex to strip all HTML tags, except these

Posted on 2006-04-17
Last Modified: 2013-12-03
Hi all,

I need a JavaScript function that accepts a string of up to 8000 characters and removes all HTML tags except the following:

<EM></EM>
<STRONG></STRONG>
<U></U>
<em></em>
<strong></strong>
<u></u>

Optionally, the function should also strip attributes from any <P></P> tags, leaving behind the <P> tags without attributes. Example:

<P class=MsoNormal style="MARGIN: 0in 0in 0pt"><SPAN style="FONT-SIZE: 20pt"><U>Test</U></SPAN><FONT size=3> Hello. This is the </FONT><SPAN style="FONT-SIZE: 8pt"><STRONG>example.</STRONG></SPAN></P>

Becomes:

<P><U>Test</U> Hello. This is the <STRONG>example.</STRONG></P>

I'll increase points for a working solution that includes the <p> attributes portion.

Thanks,
SquareHead
Question by:SquareHead
11 Comments
 
LVL 49

Expert Comment

by:Roonaan
ID: 16470879
Here you go:

<html>
<head>
<script type="text/javascript">
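  function stripHtml(str) {
    var newstr = str;
    // Matches an opening or closing tag: mt[1] is the tag name, mt[2] the rest (attributes).
    var regExp = /<\/?(\w+)(.*?)>/ig;
    var i = 0, mt, oldstr, tag, pars, repl;
    while(i++ < 10 && (mt = regExp.exec(str))) {
      oldstr = mt[0];
      tag    = mt[1];
      pars   = mt[2];
      if(tag.match(/em|strong|u|p/i)) {
        // Allowed tag: keep it, minus its attributes.
        repl = oldstr.replace(pars,'');
      } else {
        // Any other tag is removed.
        repl = '';
      }
      newstr = newstr.replace(oldstr, repl);
    }
    return newstr;
  }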
</script>
</head>
<body>
<form>
<p>
  <b>In</b>
  <textarea name="myIn" rows="10" cols="60"><P class=MsoNormal style="MARGIN: 0in 0in 0pt"><SPAN style="FONT-SIZE: 20pt"><U>Test</U></SPAN><FONT size=3> Hello. This is the </FONT><SPAN style="FONT-SIZE: 8pt"><STRONG>example.</STRONG></SPAN></P></textarea>
  <br/><input type="button" value="go" onclick="this.form.myText.value=stripHtml(this.form.myIn.value);" />
</p>
<p>
  <b>Out</b>
  <textarea name="myText" rows="10" cols="60"></textarea>
</p>
</form>
</body>
</html>

-r-
 
LVL 18

Author Comment

by:SquareHead
ID: 16470907
Thanks. Close. Tried it and the SPAN tags remain...
 
LVL 49

Expert Comment

by:Roonaan
ID: 16470943
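Yes, my second regexp was erroneous.

/em|strong|u|p/i also matches span, because it finds the "p" anywhere in the tag name. It should have been /^(em|strong|u|p)$/i, so that the whole tag name has to match: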

<html>
<head>
<script type="text/javascript">
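  function stripHtml(str) {
    var newstr = str;
    // Matches an opening or closing tag: mt[1] is the tag name, mt[2] the rest (attributes).
    var regExp = /<\/?(\w+)(.*?)>/ig;
    var i = 0, mt, oldstr, tag, pars, repl;
    while(i++ < 10 && (mt = regExp.exec(str))) {
      oldstr = mt[0];
      tag    = mt[1];
      pars   = mt[2];
      // Anchored and grouped, so only the exact tag names are accepted.
      if(tag.match(/^(em|strong|u|p)$/i)) {
        repl = oldstr.replace(pars,'');
      } else {
        repl = '';
      }
      newstr = newstr.replace(oldstr, repl, "g");
    }
    return newstr;
  }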
</script>
</head>
<body>
<form>
<p>
  <b>In</b>
  <textarea name="myIn" rows="10" cols="60"><P class=MsoNormal style="MARGIN: 0in 0in 0pt"><SPAN style="FONT-SIZE: 20pt"><U>Test</U></SPAN><FONT size=3> Hello. This is the </FONT><SPAN style="FONT-SIZE: 8pt"><STRONG>example.</STRONG></SPAN></P></textarea>
  <br/><input type="button" value="go" onclick="this.form.myText.value=stripHtml(this.form.myIn.value);" />
</p>
<p>
  <b>Out</b>
  <textarea name="myText" rows="10" cols="60"></textarea>
</p>
</form>
</body>
</html>

-r-
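
Editor's note: the difference between the two whitelist patterns above can be checked in isolation. This is only a small sketch; the variable names `loose`, `strict`, `names` and `report` are made up for the illustration and do not appear in the thread.

```javascript
// Unanchored alternation, as in the first version: matches anywhere in the tag name.
var loose = /em|strong|u|p/i;
// Anchored and grouped, as in the corrected version: the whole name must match.
var strict = /^(em|strong|u|p)$/i;

// A few tag names to compare (chosen for illustration).
var names = ['span', 'sup', 'font', 'STRONG', 'p'];
var report = names.map(function (name) {
  return name + ': loose=' + loose.test(name) + ', strict=' + strict.test(name);
});
console.log(report.join('\n'));
```

With the loose pattern, span and sup are accepted because they contain a "p" (and, for sup, a "u"). The strict pattern rejects both and still accepts STRONG and p.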
LVL 18

Author Comment

by:SquareHead
ID: 16470977
Thanks Roonaan, we are getting closer. I tested with your example, and the first opening SPAN tag is removed, but the closing /SPAN tag remains, as do any other SPAN pairs...
 
LVL 49

Expert Comment

by:Roonaan
ID: 16470991
Did you also re-copy the line below?
newstr = newstr.replace(oldstr, repl, "g");

-r-
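
Editor's note: as far as I know, the third "g" argument to replace() was a non-standard extension supported only by older Firefox versions; engines that follow the standard ignore it. With a string pattern, replace() then changes only the first occurrence. The sketch below shows that behaviour; the variable names and the sample string are invented for the illustration.

```javascript
// Sample input with two closing SPAN tags (made up for this illustration).
var html = '<SPAN>a</SPAN><SPAN>b</SPAN>';

// String pattern: only the first occurrence is replaced.
var once = html.replace('</SPAN>', '');

// String pattern plus a third argument: standard engines ignore the extra argument.
var withFlag = html.replace('</SPAN>', '', 'g');

// Regular expression with the g flag: every occurrence is replaced.
var all = html.replace(/<\/SPAN>/g, '');

console.log(once);
console.log(withFlag);
console.log(all);
```

The loop in the thread does not depend on a global replace, because exec() visits every tag occurrence in the original string and each visit triggers its own replace() call on the working copy.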
 
LVL 18

Author Comment

by:SquareHead
ID: 16471002
Example:

<P class=MsoNormal style="MARGIN: 0in 0in 0pt" bla="sdjhhs f dhjfs j sdfhksj"><SPAN style="FONT-SIZE: 20pt"><U>Test</U></SPAN><FONT size=3> Hello. This is the </FONT><SPAN style="FONT-SIZE: 8pt"><STRONG>example.</STRONG></SPAN> <span><em><strong>howdy</strong></em></span></P>


Result:

<P><U>Test</U> Hello. This is the <STRONG>example.</STRONG></SPAN> <span><em><strong>howdy</strong></em></span></P>
 
LVL 18

Author Comment

by:SquareHead
ID: 16471009
Yes, that line is in there.
 
LVL 49

Accepted Solution

by:
Roonaan earned 2000 total points
ID: 16471088
Pff, I'm being stupid.

Please remove the i++ < 10 part from the while() line. That counter stops the loop after ten tags, so every tag after the tenth is left untouched.

-r-
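
Editor's note: for readers who want the end result in one place, here is a minimal sketch that applies the thread's corrections (anchored whitelist, no loop counter). It is not the code from the accepted answer: it swaps the exec() loop for a single replace() call with a callback, and the constant name `ALLOWED_TAG` is an assumption of this sketch. Like the thread's pattern, it assumes that a tag does not span several lines and that attribute values contain no ">" character. It is a formatting clean-up, not a security sanitizer.

```javascript
// Tags that survive, per the question: EM, STRONG and U, plus P with its attributes removed.
var ALLOWED_TAG = /^(em|strong|u|p)$/i;

function stripHtml(str) {
  // Same tag pattern as in the thread, with the closing slash captured separately.
  return str.replace(/<(\/?)(\w+)(.*?)>/g, function (match, slash, tag) {
    // Rebuild allowed tags without attributes; drop every other tag.
    return ALLOWED_TAG.test(tag) ? '<' + slash + tag + '>' : '';
  });
}

// Usage with the example from the question.
var input = '<P class=MsoNormal style="MARGIN: 0in 0in 0pt">' +
  '<SPAN style="FONT-SIZE: 20pt"><U>Test</U></SPAN>' +
  '<FONT size=3> Hello. This is the </FONT>' +
  '<SPAN style="FONT-SIZE: 8pt"><STRONG>example.</STRONG></SPAN></P>';
console.log(stripHtml(input));
```

Rebuilding the tag from its captured parts avoids searching for the attribute text inside the tag a second time, and a single pass means no tag count limit is needed.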
 
LVL 18

Author Comment

by:SquareHead
ID: 16471138
Perfect -- thanks!
 
LVL 18

Author Comment

by:SquareHead
ID: 16471180
Thanks Roonaan, I increased points from 250 to 500 -- as if you needed them ;-)
 
LVL 49

Expert Comment

by:Roonaan
ID: 16471197
Thanx,

Points are nice to get me to my next certification earlier. Even 1000 pts help me work through the 460,390 pts I need for my next certification :-D

-r-
