Solved

ASP.net 2.0 Stripping HTML tags!

Posted on 2008-06-12
Last Modified: 2013-11-26
I'm inserting data into a SQL database. The problem is this: the data is spreadsheet data for registered users, so I don't want to wipe out the spreadsheet data itself, just strip everything else (the HTML tags) from it. Does anybody have cleaning code for this? Thanks, Chris.
Question by:jumpstart0321
2 Comments
 
LVL 8

Accepted Solution

by:
arhame earned 250 total points
ID: 21780064
Here is a function that strips all HTML from a string. Pass it the string from your database and it returns the text without the HTML.

http://www.4guysfromrolla.com/webtech/042501-1.shtml

 

Function stripHTML(strHTML)
'Strips the HTML tags from strHTML using split and join
 
  'Ensure that strHTML contains something
  If len(strHTML) = 0 then
    stripHTML = strHTML
    Exit Function
  End If
 
  dim arysplit, i, j, strOutput
 
  arysplit = split(strHTML, "<")
 
  'If there is text before the first <, leave it alone and start
  'iterating from the 2nd array position; otherwise start from the 1st
  if len(arysplit(0)) > 0 then j = 1 else j = 0
 
  'Loop through each instance of the array
  for i=j to ubound(arysplit)
     'Do we find a matching > sign?
     if instr(arysplit(i), ">") then
       'If so, snip out all the text between the start of the string
       'and the > sign
       arysplit(i) = mid(arysplit(i), instr(arysplit(i), ">") + 1)
     else
       'Ah, the < was nonmatching
       arysplit(i) = "<" & arysplit(i)
     end if
  next
 
  'Rejoin the array into a single string
  strOutput = join(arysplit, "")
  
  'Snip out the first <
  strOutput = mid(strOutput, 2-j)
  
  'Convert < and > to &lt; and &gt;
  strOutput = replace(strOutput,">","&gt;")
  strOutput = replace(strOutput,"<","&lt;")
 
  stripHTML = strOutput
End Function
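
Since the question is about ASP.NET 2.0 and the function above is written in VBScript style, here is a rough C# sketch of the same split-and-join idea. The class and method names (`HtmlStripper`, `StripHtml`) are made up for illustration and are not from the linked article. It shares the original's limitation: a literal `<` followed later by a `>` is treated as a tag.

```csharp
using System;
using System.Text;

public static class HtmlStripper
{
    // Hypothetical C# port of stripHTML: drops everything between a '<'
    // and the next '>', then HTML-encodes any angle brackets left over.
    public static string StripHtml(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        string[] parts = html.Split('<');

        // parts[0] is the text before the first '<' (empty if the string starts with one).
        StringBuilder output = new StringBuilder(parts[0]);

        for (int i = 1; i < parts.Length; i++)
        {
            int close = parts[i].IndexOf('>');
            if (close >= 0)
            {
                // A matching '>' exists: keep only the text after the tag.
                output.Append(parts[i].Substring(close + 1));
            }
            else
            {
                // No matching '>': the '<' was literal text, so put it back.
                output.Append('<').Append(parts[i]);
            }
        }

        // Encode stray angle brackets, as the original function does.
        return output.ToString().Replace(">", "&gt;").Replace("<", "&lt;");
    }
}
```

You would call it as `HtmlStripper.StripHtml(value)` on each field before saving it.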

 
LVL 7

Assisted Solution

by:alexpercsi
alexpercsi earned 250 total points
ID: 21780506
I think it would be best if you used Regular Expressions.

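Here's something I wrote on the fly; I hope it works.
using System.Text.RegularExpressions;
 
// Example value only; use the text you are about to save
string input = "<b>Q1</b> <i>1,250</i>";
// [^>]* stops at the first '>', so each tag is removed on its own
string output = Regex.Replace(input, "<[a-zA-Z][^>]*>", "");
output = Regex.Replace(output, "</[a-zA-Z][^>]*>", "");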
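
One thing to watch with tag-stripping patterns: `.*` is greedy, so a pattern like `<[a-zA-Z].*>` matches from the first `<` to the last `>` on the line and takes the text in between with it. The sketch below compares that with a per-tag pattern built on `[^>]*`. The class name `TagRegexDemo`, the method `StripTags` and the sample string are assumptions for illustration. It is not a full HTML parser, so comments and a `>` inside an attribute value are not handled.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRegexDemo
{
    // Matches one opening or closing tag; [^>]* cannot run past the first '>'.
    private static readonly Regex Tag = new Regex("</?[a-zA-Z][^>]*>");

    public static string StripTags(string input)
    {
        return Tag.Replace(input, "");
    }

    public static void Main()
    {
        string cell = "<b>Q1</b> <i>1,250</i>";

        // Greedy: a single match spans the whole string, so the data is lost too.
        Console.WriteLine(Regex.Replace(cell, "<[a-zA-Z].*>", "").Length); // prints 0

        // Per-tag: only the markup is removed.
        Console.WriteLine(StripTags(cell)); // prints Q1 1,250
    }
}
```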
