NewtoAllThis

asked on

Strip HTML tags from data in database rows

Hi,

I'm trying to do something like this, but I need help.

I have a table which contains adverts for a magazine, and one column contains the ad headings. I have also stored HTML tags around the ad headings in the table. Each advert is displayed on many pages on my site.

But I want to display an ad on some pages without the HTML tags. How do I strip the HTML tags from the database value for this page without changing the ad on every other page? All my code is ASP and JavaScript.

This is my code that outputs the adverts, where html tags are included.

r1 = Q("AdHead")

Response.Write("<td valign='top' width=400 rowspan=2>"+r1+"</td>");   //Header ok but tags included

But when I do this

r1 = Q("AdHead")
r1 = Replace(r1, " ","")

Response.Write("<td valign='top' width=400 rowspan=2>"+r1+"</td>");   //Header

I get this error

Microsoft JScript runtime error '800a138f'
Object expected

?
 

Hope this makes sense.
Thanks
markhoy

You need to use regular expressions to remove HTML tags.
(http://www.aspfaqs.com/aspfaqs/ShowCategory.asp?CatID=16).

http://www.4guysfromrolla.com/webtech/073000-1.shtml
http://www.4guysfromrolla.com/webtech/042501-1.shtml

Removing All HTML Tags from a String Using Regular Expressions


----------------------------------------------------------

If you need to remove all HTML tags from a string, or to display HTML on a Web page as the raw tags (like showing <B>bold</B> as opposed to bold), the script presented in this article will accomplish these tasks with ease!

To accomplish this, included in this article is the code to a function, ClearHTMLTags, which has the following definition:

function ClearHTMLTags(strHTML, intWorkFlow)  


Where strHTML is the string to be cleared of HTML tags and intWorkFlow determines how to clear the HTML tag... a value of 0 simply strips the HTML tags while a value of 1 displays the HTML tags as text in the document (like showing <B>bold</B> as opposed to bold).

The code for ClearHTMLTags is presented below... there is also a live demo to give the script a run. Note that the ClearHTMLTags function uses regular expressions to ease the hunt and removal of HTML tags...




--------------------------------------------------------------------------------


'[ClearHTMLTags]
     
'Coded by Jóhann Haukur Gunnarsson
'joi@innn.is
     
'  Purpose: This function clears all HTML tags from a
'           string using Regular Expressions.
'   Inputs: strHTML;
'            A string to be cleared of HTML TAGS
'     intWorkFlow;
'            An integer that if equals to 0 runs only the RegExp filter
'              .. 1 runs only the HTML source render filter
'              .. 2 runs both the RegExp and the HTML source render
'              .. >2 defaults to 0
'  Returns: A string that has been filtered by the function
     
function ClearHTMLTags(strHTML, intWorkFlow)
  'Variables used in the function
         
  dim regEx, strTagLess
         
  '---------------------------------------
  strTagless = strHTML
  'Move the string into a private variable
  'within the function
  '---------------------------------------

  'regEx initialization
  '---------------------------------------
  set regEx = New RegExp
  'Creates a regexp object          
  regEx.IgnoreCase = True
  'Ignore case when matching
  regEx.Global = True
  'Global applicability
  '---------------------------------------

  'Phase I
  '     "bye bye html tags"
  if intWorkFlow <> 1 then
    '---------------------------------------
    regEx.Pattern = "<[^>]*>"
    'this pattern matches any html tag
    strTagLess = regEx.Replace(strTagLess, "")
    'all html tags are stripped
    '---------------------------------------
  end if
         
  'Phase II
  '     "bye bye rogue leftovers"
  '     "or, I want to render the source"
  '     "as html."

  '---------------------------------------
  'We *might* still have rogue < and > 
  'let's be positive that those that remain
  'are changed into html characters
  '---------------------------------------    

  if intWorkFlow > 0 and intWorkFlow < 3 then
    regEx.Pattern = "[<]"
    'matches a single <
    strTagLess = regEx.Replace(strTagLess, "&lt;")

    regEx.Pattern = "[>]"
    'matches a single >
    strTagLess = regEx.Replace(strTagLess, "&gt;")
    '---------------------------------------
  end if
         
  'Clean up
  '---------------------------------------
  set regEx = nothing
  'Destroys the regExp object
  '---------------------------------------    
         
  '---------------------------------------
  ClearHTMLTags = strTagLess
  'The results are passed back
  '---------------------------------------
end function
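
For pages written in JScript rather than VBScript, the same workflow could be sketched roughly as below. This is a port written for illustration, not the original author's code; the name clearHtmlTags and the parameter name mode are assumptions. It uses var and plain regular expression literals so that it should also run in older script engines.

```javascript
// Sketch of the ClearHTMLTags workflow in JavaScript (illustrative port).
// mode 0: strip tags only
// mode 1: escape < and > only, so the source shows as text
// mode 2: strip tags, then escape any leftover < or >
// any other value behaves like 0, as the VBScript header comment describes
function clearHtmlTags(html, mode) {
  var text = String(html);
  if (mode !== 1) {
    // Phase I: remove anything that looks like a tag.
    text = text.replace(/<[^>]*>/g, "");
  }
  if (mode === 1 || mode === 2) {
    // Phase II: turn remaining angle brackets into entities.
    text = text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
  }
  return text;
}

console.log(clearHtmlTags("<B>bold</B> 1 > 0", 0)); // bold 1 > 0
console.log(clearHtmlTags("<B>bold</B>", 1));       // &lt;B&gt;bold&lt;/B&gt;
console.log(clearHtmlTags("<B>bold</B> 1 > 0", 2)); // bold 1 &gt; 0
```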

 



OR,


function stripHTMLTags(val)

 dim re

 set re = new RegExp
 re.pattern = "<[\w/]+[^<>]*>"
 re.global=true
 stripHTMLTags = re.replace(val,"")

end function

response.write stripHTMLTags("<html><body>This is <br>a <a href='http://www.yahoo.com'>test<a> <b>of</b> the emergency broadcast system.  a <> b <> c;  a > b and b<c </body></html>")

%>
NewtoAllThis

ASKER

I'm using JavaScript. I've tried converting the VB code to JavaScript, but I'm getting errors. Any JavaScript functions, please?


Cheers
ASKER CERTIFIED SOLUTION
markhoy

Here's a JavaScript function I created in your other post.

function Strip(str) {
 return str.replace(/<[\w/]+[^<>]*>/g, "");
}
I might have made a mistake in the regular expression, so try this simpler one.

function Strip(str) {
 return str.replace(/<[^>]*>/g, "");
}

Don't try parsing a whole HTML document with this, though, as it will break on certain things. It's probably OK for your needs.
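
To illustrate the difference between the two patterns and one way they can break, here is a small sketch. The names strip and stripStrict are made up for the comparison. The "Object expected" error in the question is consistent with calling the VBScript function Replace from JScript, where it does not exist; JScript strings have a replace method instead. The String(...) conversion is an assumption for the case where Q("AdHead") returns a field object rather than a plain string.

```javascript
// Simple pattern: removes everything from a "<" up to the next ">".
function strip(str) {
  return String(str).replace(/<[^>]*>/g, "");
}

// Stricter pattern: the "<" must be followed by a word character or "/",
// so a lone "<" or "<>" in ordinary text is left alone.
// The "/" is escaped inside the class to be safe in older engines.
function stripStrict(str) {
  return String(str).replace(/<[\w\/]+[^<>]*>/g, "");
}

console.log(strip("<b>Big Sale</b> <i>today</i>"));      // Big Sale today
console.log(strip("a <> b and b<c <b>bold</b>"));        // a  b and bbold
console.log(stripStrict("a <> b and b<c <b>bold</b>"));  // a <> b and b<c bold

// A ">" inside an attribute value ends the match too early:
console.log(strip('<img alt="a > b" src="x.gif">Heading'));
// prints:  b" src="x.gif">Heading

// In an ASP/JScript page this might be used as (assumed, untested here):
//   var r1 = strip(Q("AdHead"));
```

For short ad headings with simple tags around them, either pattern is likely to be enough.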
Don't store the HTML in the table. Reverse the logic and add the HTML on the pages where it's needed. It's a rule of thumb for me not to store formatting in the database, or in my COM objects. It might be a bit of extra typing in the beginning to update those pages, but it's worth it in the end, as your db data will be more flexible.
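
As a rough sketch of that approach, the heading is kept as plain text and each page decides whether to wrap it in markup when it writes it out. The helper names escapeHtml and renderHeading, and the choice of a <b> tag, are made up for illustration.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: the database holds plain text only,
// and formatting is added per page at output time.
function renderHeading(plainText, emphasize) {
  var safe = escapeHtml(plainText);
  return emphasize ? "<b>" + safe + "</b>" : safe;
}

console.log(renderHeading("Cars & Vans < 5 years old", true));
// <b>Cars &amp; Vans &lt; 5 years old</b>
console.log(renderHeading("Cars & Vans < 5 years old", false));
// Cars &amp; Vans &lt; 5 years old
```

With this layout no stripping is needed at all, since the pages that want plain text simply skip the wrapping.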



This question has been classified abandoned. I will make a recommendation to the
moderators on its resolution in a week or two. I appreciate any comments
that would help me to make a recommendation.

<note>
Unless it is clear to me that the question has been answered I will recommend delete.  It is possible that a Grade less than A will be given if no expert makes a case for an A grade. It is assumed that any participant not responding to this request is no longer interested in its final disposition.
</note>

If the user does not know how to close the question, the options are here:
https://www.experts-exchange.com/help/closing.jsp


Cd&

I made this function in ASP to strip HTML tags from a string, except for some tags.

Usage is StripHTMLTag("<p><b><i>Hello</i></b></p>","<b>,<i>")

Returns "<b><i>Hello</i></b>"

Function StripHTMLTag(aText,aValidTags)
      dim Result,Character,counter,isValid,item
      dim ValidTags
      
      ValidTags=split(aValidTags,",")
      
      Result=""
      counter=1
      
      'Use <= so the last character of the text is not dropped
      while counter<=len(aText)
            'Get next character
            Character=mid(aText,counter,1)
            
            'is character begin of tag?
            if Character="<" then
                  'Default the tag is NOT valid
                  isValid=False
                  for each item in ValidTags
                        'Functions to check if a tag is valid
                        if ucase(mid(aText,counter,len(item)))=ucase(mid(item,1,len(item))) then
                              isValid=True
                        elseif ucase(mid(aText,counter,len(item)-1))=ucase(mid(item,1,len(item)-1)) AND mid(aText,counter+len(item)-1,1)=" " then
                              isValid=True
                        elseif ucase(mid(aText,counter+2,len(item)-1))=ucase(mid(item,2,len(item)-1)) then
                              isValid=True
                        end if
                  next
                  if isValid then
                        'If the tag is valid copy the whole tag to the result
                        'The length check stops the loop if the tag is never closed
                        while counter<=len(aText) and mid(aText,counter,1)<>">"
                              Result=Result & mid(aText,counter,1)
                              counter=counter+1
                        wend
                  else
                        'If the tag is not valid skip the whole tag by counting until ">"
                        while counter<=len(aText) and mid(aText,counter,1)<>">"
                              counter=counter+1
                        wend
                        'Now the next character is ">" so move to the next
                        counter=counter+1
                  end if
            else
                  'Next character please...
                  Result = Result & mid(aText,counter,1)
                  counter=counter+1
            end if
      wend
      'All done, return the result
      StripHTMLTag=Result
End Function

Greetings, Arjan Bosboom
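
For a JScript page, a similar "strip everything except some tags" idea could be sketched with a regular expression and a callback. This is an illustrative variant rather than a port of the function above: the name stripExcept is made up, and it takes an array of tag names such as ["b", "i"] instead of the "<b>,<i>" string. It assumes a script engine that accepts a function as the second argument to replace.

```javascript
// Remove every tag whose name is not in the allowed list.
// allowed: array of lower-case tag names, e.g. ["b", "i"].
function stripExcept(html, allowed) {
  return String(html).replace(
    /<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g,
    function (tag, name) {
      for (var i = 0; i < allowed.length; i++) {
        if (allowed[i] === name.toLowerCase()) {
          return tag; // keep allowed tags unchanged
        }
      }
      return ""; // drop everything else
    }
  );
}

console.log(stripExcept("<p><b><i>Hello</i></b></p>", ["b", "i"]));
// <b><i>Hello</i></b>
console.log(stripExcept("Line<br>break <B>bold</B>", ["b"]));
// Linebreak <B>bold</B>
```

Note that allowed tags are kept with all of their attributes, so this is a formatting filter only and should not be relied on to make untrusted HTML safe.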
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

Accept markhoy's comment as answer

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

Programming_Gal
EE Cleanup Volunteer