Solved

Regular Expressions

Posted on 2003-03-26
Last Modified: 2010-04-07
Hey all, relatively simple question here. I've ripped the contents of a browser object into a text box and would like to remove all JavaScript from the text box. I basically just need a pattern to filter out anything enclosed between <script> ... </script>. Sounds easy, eh? I can't seem to get it to work, though.
Question by:Lothian
LVL 4

Expert Comment

by:JohnChapin
ID: 8212759
Lothian,
Try this code:

Dim i As Integer
Dim aString As String
Dim strArray1() As String
Dim strArray2() As String
Dim NewString As String
aString = "akdjfqhrfqpowwv ;,a<script>1234567890987654321</script> asdfsjdfd<script>1234567890987654321</script>"

' Split on the opening tag; element 0 is the text before the first <script>.
' Split is case-sensitive by default; pass vbTextCompare as the fourth
' argument to match <SCRIPT> as well.
strArray1 = Split(aString, "<script>")
If UBound(strArray1) = 0 Then
    NewString = aString ' no <script>s
    Debug.Print NewString
    Exit Sub
End If
NewString = strArray1(0)
For i = 1 To UBound(strArray1)
    ' Each piece should hold exactly one </script>:
    ' element 0 is the script body, element 1 is the text to keep
    strArray2 = Split(strArray1(i), "</script>")
    If UBound(strArray2) <> 1 Then
        Debug.Print "buggo" 'not good html
        Exit Sub
    End If
    NewString = NewString & strArray2(1)
Next i
Debug.Print NewString
0
 
LVL 6

Expert Comment

by:DominicCronin
ID: 8213254
Here's the solution in vbscript using a regexp
=======================================
Dim re ' As VBScript.RegExp
Set re = CreateObject("VBScript.RegExp")
re.pattern = "<script>.*?</script>"
re.global = true
Dim testString ' As String
testString = "We<script>Some stuff</script> have <script>Some more stuff</script>succeeded. <script>and some more</script> Indeed we have."

Dim result ' As String
Wscript.echo re.Replace(testString, vbNullstring)
=================================================
The key points here are the setting of global to true, and the use of a question mark to force a non-greedy match.
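To make the non-greedy point concrete, here is a small sketch in the same VBScript style. The first two patterns are the greedy and non-greedy forms of the pattern above. The third pattern and the IgnoreCase setting are assumptions added for illustration, in case the page contains script tags with attributes, upper-case tags, or scripts that span several lines.

```vbscript
Dim re, html
Set re = CreateObject("VBScript.RegExp")
re.Global = True        ' replace every match, not just the first
re.IgnoreCase = True    ' also match <SCRIPT> and <Script>

' Greedy: .* runs on to the LAST </script>, so the "b" in between is lost too.
re.Pattern = "<script>.*</script>"
WScript.Echo re.Replace("a<script>1</script>b<script>2</script>c", "")   ' ac

' Non-greedy: .*? stops at the FIRST </script> after each opening tag.
re.Pattern = "<script>.*?</script>"
WScript.Echo re.Replace("a<script>1</script>b<script>2</script>c", "")   ' abc

' Assumed variant for real pages: \b[^>]* allows attributes on the tag,
' and [\s\S] also matches line breaks inside the script body.
html = "a<script>1</script>b<SCRIPT type=""text/javascript"">" & vbCrLf & "2</SCRIPT>c"
re.Pattern = "<script\b[^>]*>[\s\S]*?</script\s*>"
WScript.Echo re.Replace(html, "")   ' abc
```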

Author Comment

by:Lothian
ID: 8213942
Thanks for the pattern, Dominic, but it's still not working. Maybe it's in my code somewhere. Here's where I'm trying to use it:

Private Sub cmdGrab_Click()
     txtDump.Text = RemoveHTML(webWindow.Document.body.innerHTML)
End Sub

Function RemoveHTML(strText)
    Dim RegEx
    Set RegEx = New RegExp
    RegEx.Global = True
    RegEx.Pattern = "<[^>]*>"
    strText = Replace(LCase(strText), "<br>", Chr(10))
    strText = Replace(LCase(strText), "&nbsp;", " ")
    RemoveHTML = RegEx.Replace(strText, vbNullString)
    RegEx.Pattern = "<[^>]*>"
End Function


I want to strip all tags, as well as any javascript.
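One possible reading of why RemoveHTML still leaves JavaScript behind: the pattern "<[^>]*>" matches only the tags themselves, so the text between <script> and </script> survives as ordinary text. The LCase calls also lowercase the whole string, not just the tags. A minimal sketch of the first effect, using a made-up input string:

```vbscript
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Global = True
re.Pattern = "<[^>]*>"   ' the tag-only pattern from RemoveHTML

' The tags are removed, but the script body between them is plain text
' as far as this pattern is concerned, so it stays in the result.
WScript.Echo re.Replace("Hi<script>alert(1)</script> there", "")   ' Hialert(1) there
```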
LVL 6

Accepted Solution

by:
DominicCronin earned 100 total points
ID: 8220360
Bah - Lothian - test the code before you say it doesn't work. The reason your code doesn't fly is that you're using replace rather than regexp.replace - two different things entirely.

Here's the freebie:

If you want a more generic version that handles all markup tags, try this:
============================
Dim re ' As VBScript.RegExp
Set re = CreateObject("VBScript.RegExp")
re.pattern = "<.*?>.*?</.*?>"
re.global = true
Dim testString ' As String
testString = "We<script>Some stuff</script> have <blah>Some more stuff</blah>succeeded.<script>and some more</script> Indeed we have."

Dim result ' As String
Wscript.echo re.Replace(testString, vbNullstring)
============================
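Note that the generic pattern above removes everything between any opening tag and the next closing tag, text included. That suits the test string, but on real markup such as <p>Hello</p> it would delete "Hello" as well. If the goal is to drop scripts entirely while keeping the visible text of other elements, a multi-pass approach is one option. The sketch below is an assumption, not code from this thread: the function name StripScriptsAndTags and the sample input are hypothetical.

```vbscript
' Hypothetical helper (not from the thread): strip script blocks first,
' then every remaining tag, keeping the text between ordinary tags.
Function StripScriptsAndTags(strText)
    Dim re, s
    s = strText            ' work on a copy; VBScript passes arguments ByRef
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True   ' avoids LCase, which would lowercase the visible text too

    ' Pass 1: turn <br> into line feeds before the tags disappear
    re.Pattern = "<br\s*/?>"
    s = re.Replace(s, vbLf)

    ' Pass 2: drop each script block, body included
    re.Pattern = "<script\b[^>]*>[\s\S]*?</script\s*>"
    s = re.Replace(s, "")

    ' Pass 3: drop the remaining tags but keep the text between them
    re.Pattern = "<[^>]*>"
    s = re.Replace(s, "")

    StripScriptsAndTags = Replace(s, "&nbsp;", " ")
End Function

' Prints "Hello" and "World !" on two lines; the script body is gone.
WScript.Echo StripScriptsAndTags("<P>Hello<BR>World&nbsp;!</P><script type=""text/javascript"">var x = 1;</script>")
```

The order matters: running the tag-only pass first would remove the <script> markers and leave the script body with nothing to identify it.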

Expert Comment

by:CleanupPing
ID: 8531542
Hi Lothian,
This old question (QID 20564163) needs to be finalized -- accept an answer, split points, or get a refund.  Please see http://www.cityofangels.com/Experts/Closing.htm for information and options.
LVL 6

Expert Comment

by:GPrentice00
ID: 9440972
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

 -->Accept DominicCronin's comment as Answer

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER

GPrentice00
Cleanup Volunteer