Solved

Regular Expression to remove JavaScript / CSS from HTML source

Posted on 2006-10-29
Last Modified: 2013-11-19
So far I have a regex that I use to strip the HTML tags from a page; however, this doesn't work correctly with CSS and JavaScript...

I'm looking for a regular expression to remove scripts (JavaScript, etc.) and styles from the HTML source I have in a local string variable.

examples of what i need to remove:

[style type="text/css"] blah [/style]
[style] blah [/style]
[script language="JavaScript"] blah [/script]
[script type="text/javascript"] blah [/script]

Is this possible with a regex?
Question by:mcainc
8 Comments
 

Author Comment

by:mcainc
ID: 17832127
I'm using VB.NET, by the way (in case there is a different method for doing this).

Accepted Solution

by:
ozo earned 2000 total points
ID: 17832137
"\\[style.*?\\[/style\\]"
"\\[script.*?\\[/script\\]"
but are you sure that your tags use [] and not <>?
 

Author Comment

by:mcainc
ID: 17832175
I didn't know I could post < > on here, so I just used [ ] instead...
 

Author Comment

by:mcainc
ID: 17832182
Hmm, can you rewrite this with <> tags?

Expert Comment

by:ozo
ID: 17832199
"<style.*?</style>"
"<script.*?</script>"
 

Author Comment

by:mcainc
ID: 17832235
Hmm, this doesn't seem to work.

Here is the function, which returns a string:

    Public Function RemoveStyleBlocks(ByVal strSource As String) As String
        Return Regex.Replace(strSource, "<style.*?</style>", "")
    End Function

For your reference, I have a function that works for removing HTML tags. Perhaps something else is required in your script/style regex?

    Public Function RemoveHTMLTags(ByVal strSource As String) As String
        Return Regex.Replace(strSource, "<[^>]*>", "")
    End Function

Expert Comment

by:ozo
ID: 17832263
If strSource spans multiple lines, pass RegexOptions.Singleline so that "." also matches newline characters:
Regex.Replace(strSource, "<style.*?</style>", "", RegexOptions.Singleline)
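
A minimal sketch that combines the style and script patterns into one call, assuming the goal is to drop each block together with its contents. The names HtmlCleaner and RemoveScriptAndStyleBlocks are made up for illustration. RegexOptions.IgnoreCase, the \b word boundary, and the \s* before the closing > are additions not discussed in this thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HtmlCleaner
    ' Hypothetical helper: removes <script>...</script> and <style>...</style>
    ' blocks, including everything between the opening and closing tags.
    Public Function RemoveScriptAndStyleBlocks(ByVal strSource As String) As String
        ' Singleline lets "." match newlines, so blocks spanning several lines are caught.
        ' IgnoreCase (an assumption beyond the thread) also matches <SCRIPT> or <Style>.
        Dim opts As RegexOptions = RegexOptions.Singleline Or RegexOptions.IgnoreCase
        ' \1 refers back to whichever tag name opened the block.
        Return Regex.Replace(strSource, "<(script|style)\b.*?</\1\s*>", "", opts)
    End Function

    Sub Main()
        Dim html As String = "<p>Hello</p>" & Environment.NewLine
        html &= "<style type=""text/css"">" & Environment.NewLine
        html &= "p { color: red; }" & Environment.NewLine
        html &= "</style>" & Environment.NewLine
        html &= "<script type=""text/javascript"">alert('hi');</script>" & Environment.NewLine
        html &= "<p>World</p>"

        ' Only the two <p> elements (and the line breaks between them) should remain.
        Console.WriteLine(RemoveScriptAndStyleBlocks(html))
    End Sub
End Module
```

Run this before a tag-stripping function such as RemoveHTMLTags; otherwise the tags are gone and the CSS and JavaScript text is left behind. A regex is not an HTML parser, so unusual input (for example a literal "</script>" inside a JavaScript string) can end a match early.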
 

Author Comment

by:mcainc
ID: 17832270
ah great, that appears to work perfectly... thank you!

Question has a verified solution.
