Solved

ASP.NET 2.0: Stripping HTML tags

Posted on 2008-06-12
Last Modified: 2013-11-26
I'm inputting data into a SQL database. The problem is this: the data is spreadsheet data for registered users, so I don't want to wipe out the spreadsheet data itself, just strip everything else (the HTML tags) from it. Does anybody have cleaning code for this? Thanks, Chris.
Question by:jumpstart0321

Accepted Solution

by:
arhame earned 250 total points
Here is a function that will strip all the HTML from a string. Pass the string from your database to this function and it will return it without the HTML.

http://www.4guysfromrolla.com/webtech/042501-1.shtml

 

Function stripHTML(strHTML)
  'Strips the HTML tags from strHTML using split and join

  'Ensure that strHTML contains something
  If len(strHTML) = 0 then
    stripHTML = strHTML
    Exit Function
  End If

  dim arysplit, i, j, strOutput

  arysplit = split(strHTML, "<")

  'Assuming strHTML is nonempty, we want to start iterating
  'from the 2nd array position
  if len(arysplit(0)) > 0 then j = 1 else j = 0

  'Loop through each element of the array
  for i=j to ubound(arysplit)
     'Do we find a matching > sign?
     if instr(arysplit(i), ">") then
       'If so, snip out all the text between the start of the string
       'and the > sign
       arysplit(i) = mid(arysplit(i), instr(arysplit(i), ">") + 1)
     else
       'Ah, the < was nonmatching
       arysplit(i) = "<" & arysplit(i)
     end if
  next

  'Rejoin the array into a single string
  strOutput = join(arysplit, "")

  'Snip out the first <
  strOutput = mid(strOutput, 2-j)

  'Convert < and > to &lt; and &gt;
  strOutput = replace(strOutput,">","&gt;")
  strOutput = replace(strOutput,"<","&lt;")

  stripHTML = strOutput
End Function
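
Since the question is about ASP.NET 2.0, the same split-and-join idea might look roughly like this in C#. This is only a sketch and not part of the linked article: the class name SplitJoinStripper and the method name StripHtml are made up for illustration, and it assumes a null or empty string should simply be returned unchanged.

```csharp
using System;
using System.Text;

// Hypothetical C# sketch of the split-and-join approach used by stripHTML.
public static class SplitJoinStripper
{
    public static string StripHtml(string html)
    {
        // Assumption: nothing to do for null or empty input.
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        // Every element after the first started with a '<' in the input.
        string[] parts = html.Split('<');
        StringBuilder output = new StringBuilder(html.Length);

        // parts[0] is the text before the first '<' and is kept as-is.
        output.Append(parts[0]);

        for (int i = 1; i < parts.Length; i++)
        {
            int close = parts[i].IndexOf('>');
            if (close >= 0)
            {
                // Drop everything up to and including the closing '>'.
                output.Append(parts[i].Substring(close + 1));
            }
            else
            {
                // Unmatched '<': put it back so it is treated as text.
                output.Append('<').Append(parts[i]);
            }
        }

        // Encode any leftover angle brackets, as the VBScript version does.
        return output.ToString().Replace(">", "&gt;").Replace("<", "&lt;");
    }
}
```

For example, SplitJoinStripper.StripHtml("<p>Hello <b>world</b></p>") returns "Hello world", and an input with an unmatched bracket such as "1 < 2" comes back as "1 &lt; 2".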


Assisted Solution

by:alexpercsi
alexpercsi earned 250 total points
I think it would be best if you used regular expressions.

Here's something I wrote on the fly; I hope it works.

using System.Text.RegularExpressions;

string input = "<b>example</b>"; // replace with the string you want to clean
// [^>]* stops each match at the tag's own closing '>', so the text between tags is kept
string output = Regex.Replace(input, "<[a-zA-Z][^>]*>", "");
output = Regex.Replace(output, "</[a-zA-Z][^>]*>", "");
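
One thing to watch with the regular expression approach is greediness: a pattern such as "<[a-zA-Z]{1}.*>" can match from the first "<" all the way to the last ">" on a line, taking the text between tags with it. The sketch below wraps the idea in a small helper that uses [^>]* instead, so each match ends at the tag's own ">". The class name HtmlStripper and the method name StripTags are hypothetical, and the pattern is deliberately simple: it does not try to handle ">" inside attribute values, comments, or script blocks.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; a sketch, not a full HTML sanitizer.
public static class HtmlStripper
{
    // Matches an opening or closing tag such as <td class="x"> or </td>.
    // [^>]* cannot cross a '>', so each match ends at the tag's own '>'.
    private static readonly Regex TagPattern =
        new Regex(@"</?[a-zA-Z][^>]*>", RegexOptions.Compiled);

    public static string StripTags(string input)
    {
        // Assumption: null or empty input is returned unchanged.
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        // Remove every tag and keep the text between them.
        return TagPattern.Replace(input, string.Empty);
    }
}
```

For example, HtmlStripper.StripTags("<td>42</td><td>Chris</td>") returns "42Chris", whereas the greedy pattern would remove that whole string.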
