Solved

How to strip tags from an HTML string (500 points)

Posted on 2004-08-28
Last Modified: 2013-12-23
Hello,
I am developing an HTML search page for my web site. The page opens every HTML file in the site, compares its content with the search words, and lists the pages that match the search criteria. The problem is that when I read the content of an HTML file, I do not know how to strip the tags from it and get the text elements for searching. I would appreciate your help. By the way, I am developing the site in ASP.NET. Sorry, there was no option for ASP.NET, so I chose Visual InterDev.

Thanks a lot,
Question by:behzadmona
LVL 11

Expert Comment

by:ajaikumarr
ID: 11920045
Hi,

In classic ASP you can use the code below:
Function StripAllHTML(ByVal str)
      'removes html tags from a string, replaces them with spaces
      If IsNull(str) Or Trim(str) = "" Then
            StripAllHTML = ""
            Exit Function
      End If
      'this regular expression matches a single opening or closing tag made of
      'letters, digits and the listed punctuation; tags that contain any other
      'character (for example "?", "_" or a single quote) are left in place
      Dim objRegExp
      Set objRegExp = New RegExp
      objRegExp.Pattern = "(\<[\/]?)([\,\:\;\%\-\/\.\\\dA-Z\="" #]*)(\>)"
      objRegExp.Global = True
      objRegExp.IgnoreCase = True
      StripAllHTML = objRegExp.Replace(str, " ")
End Function

For VB.NET you can use the one below:
    Friend Function StripHTML(ByVal HTMLContent As String) As String
        ' Returns HTMLContent with everything between "<" and ">" removed.
        If HTMLContent Is Nothing OrElse HTMLContent.Trim() = "" Then Return ""
        ' Element 0 is the text before the first "<";
        ' every later element begins just after a "<".
        Dim arySplit As String() = HTMLContent.Trim().Split("<"c)
        For i As Integer = 1 To arySplit.Length - 1
            Dim closePos As Integer = arySplit(i).IndexOf(">"c)
            If closePos >= 0 Then
                ' Keep only the text that follows the tag's closing ">".
                arySplit(i) = arySplit(i).Substring(closePos + 1)
            Else
                ' No closing ">": the "<" was literal text, so put it back.
                arySplit(i) = "<" & arySplit(i)
            End If
        Next
        Return String.Join("", arySplit).Trim()
    End Function


Bye
Ajai
LVL 11

Accepted Solution

by:
ajaikumarr earned 500 total points
ID: 11920069
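Hi,

Here is another sample:
Function stripHTML(strHTML)
  ' Removes every tag, then escapes any stray "<" or ">" left in the text
  Dim objRegExp, strOutput
  Set objRegExp = New RegExp
  objRegExp.IgnoreCase = True
  objRegExp.Global = True
  ' Lazy match from "<" to the nearest ">", including tags that span lines
  objRegExp.Pattern = "<(.|\n)+?>"
  strOutput = objRegExp.Replace(strHTML, "")
  ' Encode leftover angle brackets so the result is safe to write into a page
  strOutput = Replace(strOutput, "<", "&lt;")
  strOutput = Replace(strOutput, ">", "&gt;")
  stripHTML = strOutput
  Set objRegExp = Nothing
End Function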
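
Since the question is about ASP.NET, the same regular-expression idea can be sketched in VB.NET with the System.Text.RegularExpressions.Regex class. This is only a minimal sketch and not part of the original answer: the module name HtmlText and the function name StripTags are made up for illustration. It also assumes ordinary markup; a regular expression is not a full HTML parser, so unusual input (for example a ">" inside an attribute value) can still slip through.

```vb
Imports System.Text.RegularExpressions

Public Module HtmlText
    ' Hypothetical helper (not from the thread): returns the visible text of an HTML string.
    Public Function StripTags(ByVal html As String) As String
        If String.IsNullOrEmpty(html) Then Return ""
        Dim opts As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        ' Drop script and style blocks first, since their contents are not page text.
        Dim text As String = Regex.Replace(html, "<(script|style)\b[^>]*>.*?</\1\s*>", " ", opts)
        ' Replace every remaining tag with a space so adjacent words do not run together.
        text = Regex.Replace(text, "<[^>]*>", " ", opts)
        ' Collapse the runs of whitespace left behind by the removed markup.
        Return Regex.Replace(text, "\s+", " ").Trim()
    End Function
End Module
```

For example, HtmlText.StripTags("<p>Hello <b>world</b></p>") returns "Hello world". Tags are replaced with a space rather than an empty string so that text from neighbouring elements, such as table cells, stays separated for searching.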

See this too
http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=2682&lngWId=10
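
For the search page described in the question, the stripped text can then be compared with the search words. The following is a rough VB.NET sketch under stated assumptions, not code from this thread: the module name SiteSearch, the function name FindPages, the "*.html" file pattern, and the rule that a page matches only when it contains every search word are all illustrative choices.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Public Module SiteSearch
    ' Hypothetical helper: returns the paths of .html files under rootFolder whose
    ' text (with tags removed) contains every search word, ignoring case.
    Public Function FindPages(ByVal rootFolder As String, ByVal searchWords As String()) As List(Of String)
        Dim matches As New List(Of String)
        For Each filePath As String In Directory.GetFiles(rootFolder, "*.html", SearchOption.AllDirectories)
            ' Remove the tags before comparing, so markup such as <table> never matches a word.
            Dim text As String = Regex.Replace(File.ReadAllText(filePath), "<[^>]*>", " ")
            Dim allFound As Boolean = True
            For Each word As String In searchWords
                If text.IndexOf(word, StringComparison.OrdinalIgnoreCase) < 0 Then
                    allFound = False
                    Exit For
                End If
            Next
            If allFound Then matches.Add(filePath)
        Next
        Return matches
    End Function
End Module
```

Reading every file on each request is simple but slow on a large site, so caching the stripped text or building an index may be worth considering.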

Bye
Ajai
LVL 11

Expert Comment

by:coopzz
ID: 12074872
0
 
LVL 11

Expert Comment

by:ajaikumarr
ID: 12356503
Hi,

I have already given relevant samples for this question.

Bye
Ajai