Solved

How to strip off tags from an Html string. (500 points)

Posted on 2004-08-28
Last Modified: 2013-12-23
Hello,
      I am developing an htmlSearch page for my web site. The page opens every HTML file in the site, compares its content with the search words, and finds the pages that match the search criteria. The problem is that, after reading the content of an HTML file, I do not know how to strip off the tags and get the text elements for searching. I would appreciate your help. By the way, I am developing my site in ASP.NET. Sorry, since there was no option for ASP.NET, I chose VisualInterDev.

                              Thanks a lot,
     
Question by:behzadmona

Expert Comment

by:ajaikumarr
ID: 11920045
Hi,

In classic ASP you can use the code below:
function StripAllHTML(byval str)
      'replaces html tags in a string with spaces
      if isNull(str) or trim(str) = "" then
            StripAllHTML = ""
            exit function
      end if
      'this regular expression matches a single opening or closing tag
      'whose body contains only the listed characters (letters, digits,
      'spaces, double quotes and some punctuation); a tag containing
      'anything else, such as ? or _ or an apostrophe, is left in place
      dim objRegExp
      set objRegExp = new RegExp
      objRegExp.pattern = "(\<[\/]?)([\,\:\;\%\-\/\.\\\dA-Z\="" #]*)(\>)"
      objRegExp.global = true
      objRegExp.ignorecase = true
      StripAllHTML = objRegExp.replace(str," ")
      set objRegExp = nothing
end function

For VB.NET you can use the following:
    Friend Function StripHTML(ByVal HTMLContent As String) As String
        'Removes tags by splitting on "<" and dropping everything up to the next ">"
        Dim strOutput As String = ""
        Try
            If Not HTMLContent Is Nothing AndAlso HTMLContent.Trim() <> "" Then
                Dim arysplit() As String = Microsoft.VisualBasic.Split(HTMLContent.Trim(), "<")
                'arysplit(0) is the text before the first "<", so it is kept unchanged
                For i As Integer = 1 To Microsoft.VisualBasic.UBound(arysplit)
                    Dim pos As Integer = Microsoft.VisualBasic.InStr(arysplit(i), ">")
                    If pos > 0 Then
                        'drop the tag itself and keep the text after ">"
                        arysplit(i) = Microsoft.VisualBasic.Mid(arysplit(i), pos + 1)
                    Else
                        'no closing ">", so this "<" was literal text; put it back
                        arysplit(i) = "<" & arysplit(i)
                    End If
                Next
                strOutput = Microsoft.VisualBasic.Join(arysplit, "").Trim()
                If strOutput = "<" Then strOutput = ""
            End If
        Catch ex As Exception
            'on any error return an empty string
            strOutput = ""
        End Try
        Return strOutput
    End Function


Bye
Ajai

Accepted Solution

by:
ajaikumarr earned 500 total points
ID: 11920069
Hi,

Here is another sample:
Function stripHTML(strHTML)
  Dim objRegExp, strOutput
  Set objRegExp = New RegExp
  objRegExp.IgnoreCase = True
  objRegExp.Global = True
  'match the shortest run from "<" to the next ">", including line feeds
  objRegExp.Pattern = "<(.|\n)+?>"
  strOutput = objRegExp.Replace(strHTML, "")
  'escape any stray angle brackets that were not part of a tag
  strOutput = Replace(strOutput, "<", "&lt;")
  strOutput = Replace(strOutput, ">", "&gt;")
  stripHTML = strOutput
  Set objRegExp = Nothing
End Function
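
Since the question targets ASP.NET, the same regular-expression idea might be sketched in VB.NET roughly as follows. This is only a minimal sketch, not a real HTML parser: the module name HtmlText and the function name StripTags are hypothetical, and it assumes reasonably well-formed markup (a ">" inside an attribute value would confuse it). Unlike the VBScript sample, it also discards the contents of script and style blocks and replaces tags with a space so that neighbouring words do not run together.

```vbnet
Imports System.Text.RegularExpressions

Public Module HtmlText
    ' Hypothetical helper: returns the visible text of an HTML string.
    Public Function StripTags(ByVal html As String) As String
        If html Is Nothing OrElse html.Trim() = "" Then Return ""

        ' Singleline lets "." match line breaks, so multi-line blocks are caught.
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline

        ' Remove <script> and <style> blocks together with their contents.
        Dim result As String = Regex.Replace(html, "<(script|style)\b[^>]*>.*?</\1\s*>", " ", opts)
        ' Remove HTML comments.
        result = Regex.Replace(result, "<!--.*?-->", " ", opts)
        ' Replace every remaining tag with a space.
        result = Regex.Replace(result, "<[^>]+>", " ", opts)
        ' Collapse runs of whitespace into a single space.
        result = Regex.Replace(result, "\s+", " ")

        Return result.Trim()
    End Function
End Module
```

For example, HtmlText.StripTags("<p>Hello <b>world</b></p>") gives "Hello world". Entities such as &amp;nbsp; are left encoded; in ASP.NET, System.Web.HttpUtility.HtmlDecode can be applied to the result if decoded text is needed.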

See this too
http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=2682&lngWId=10
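
To tie this back to the search page described in the question, the stripped text could then be checked against the search words. The following is a rough sketch under several assumptions: the names SiteSearch and FindMatchingPages, the "*.html" file mask, and the rule that every word must appear are all hypothetical choices for illustration. It inlines a simple tag-stripping regular expression so that it stands alone, and it needs .NET 2.0 or later for File.ReadAllText and List(Of String).

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Public Module SiteSearch
    ' Hypothetical helper: returns the paths of the .html files directly under
    ' rootFolder whose visible text contains every search word (case-insensitive).
    Public Function FindMatchingPages(ByVal rootFolder As String, ByVal searchWords() As String) As List(Of String)
        Dim matches As New List(Of String)()

        For Each filePath As String In Directory.GetFiles(rootFolder, "*.html")
            Dim html As String = File.ReadAllText(filePath)
            ' Strip the tags and lower-case the text once per file.
            Dim pageText As String = Regex.Replace(html, "<[^>]+>", " ").ToLower()

            Dim allFound As Boolean = True
            For Each word As String In searchWords
                If pageText.IndexOf(word.ToLower()) < 0 Then
                    allFound = False
                    Exit For
                End If
            Next

            If allFound Then matches.Add(filePath)
        Next

        Return matches
    End Function
End Module
```

Reading and stripping every file on each request will be slow on a large site, so caching the stripped text per file may be worth considering.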

Bye
Ajai


Expert Comment

by:ajaikumarr
ID: 12356503
Hi,

I have given relevant samples for this above.

Bye
Ajai