Solved

Regular Expression to remove JavaScript / CSS from HTML source

Posted on 2006-10-29
8
407 Views
Last Modified: 2013-11-19
So far I have a regex that I use to strip the HTML tags from a page however this doesnt work correctly with CSS and JavaScript...

Im looking for a regular expression to remove script (javascript, etc) and styles from the html source i have in a local string variable

examples of what i need to remove:

[style type="text/css"] blah [/style]
[style] blah [/style]
[script language="JavaScript"] blah [/script]
[script type="text/javascript"] blah [/script]

is this possible w/ regexp?
0
Comment
Question by:mcainc
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
8 Comments
 

Author Comment

by:mcainc
ID: 17832127
i'm using vb.net by the way (that is if there is a different method for doing this)
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 17832137
"\\[style.*?\\]/style\\]"
"\\[script.*?\\[/script\\]"
but are you sure that your tags use [] and not <>?
0
 

Author Comment

by:mcainc
ID: 17832175
i didn't know i could post < > on here so i just used [ ] instead...
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:mcainc
ID: 17832182
hmm.. can you clean this up a bit with <> tags
0
 
LVL 84

Expert Comment

by:ozo
ID: 17832199
"<style.*?</style>"
"<script.*?</script>
0
 

Author Comment

by:mcainc
ID: 17832235
hmm, this doesn't seem to work:

here is the function returning a string

    Public Function RemoveStyleBlocks(ByVal strSource As String) As String
        Return Regex.Replace(strSource, "<style.*?</style>", "")
    End Function

i have a function that works for removing html tags for your reference, perhaps something else is required in your script/style regex?

    Public Function RemoveHTMLTags(ByVal strSource As String) As String
        Return Regex.Replace(strSource, "<[^>]*>", "")
    End Function
0
 
LVL 84

Expert Comment

by:ozo
ID: 17832263
if strSource spans multiple lines
Regex.Replace(strSource,"<style.*?</style>", "",RegexOptions.Singleline)
0
 

Author Comment

by:mcainc
ID: 17832270
ah great, that appears to work perfectly... thank you!
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
programming a polycom voip phone 3 80
Why is initMap returning "not a function" error. 3 112
batch file or script 4 64
print bytes of an integer 6 46
Browsers only know CSS so your awesome SASS code needs to be translated into normal CSS. Here I'll try to explain what you should aim for in order to take full advantage of SASS.
If you’re thinking to yourself “That description sounds a lot like two people doing the work that one could accomplish,” you’re not alone.
The viewer will learn the basics of jQuery, including how to invoke it on a web page. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery.: (CODE)

740 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question