Remove HTML Tags from ShortDescription Field in VB.net ASP.net for XML Document

Hello,

I have an asp.net page that runs a sql query and makes an xml document that is transformed by an xslt as an rss feed.  The ShortDescription field has html markup in it and I need to find a way to strip out the html tags from the ShortDescription field.  Can anyone show me what I need to add to my code below  to accomplish this (in as few lines of code as possible)?

<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Xsl" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<%@ Page Language="vb"%>
<script language="vb" runat="server">
Sub Page_Load()
Dim strConnect As String
strConnect = StoreFrontConfiguration.ConnectionString
Dim strSelect As String
strSelect = "SELECT dbo.Products.Name, dbo.Products.ShortDescription, dbo.Products.ImageSmallPath, dbo.Products.uid, dbo.Products.DateModified, dbo.ProductCategory.CategoryID, dbo.Categories.Name AS CategoryName FROM dbo.Products INNER JOIN dbo.ProductCategory ON dbo.Products.uid = dbo.ProductCategory.ProductID INNER JOIN dbo.InventoryInfo ON dbo.Products.uid = dbo.InventoryInfo.ProdID INNER JOIN dbo.Categories ON dbo.ProductCategory.CategoryID = dbo.Categories.uid WHERE ((dbo.Products.IsActive = 1) AND (dbo.ProductCategory.CategoryID = 3) AND (dbo.InventoryInfo.Tracked = 0)) OR ((dbo.ProductCategory.CategoryID =3) AND (dbo.Products.IsActive = 1) AND (dbo.InventoryInfo.DefaultQTY > 0) AND (dbo.InventoryInfo.Tracked = 1));" 
Dim objDataSet As New DataSet()
Dim objConnect As New OleDbConnection(strConnect)
Dim objDataAdapter As New OleDbDataAdapter(strSelect, objConnect)
objDataAdapter.Fill(objDataSet, "model")
Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" />

Open in new window


Thanks in Advance!
Shawn
smowerAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

smowerAuthor Commented:
I can't really see any changes to the code to strip out html tags out of shortdescription field?  Did you just do some formatting changes?
Hiran DesaiSolution ArchitectCommented:
I think
Server.HTMLEncode

Open in new window

will work fine if you don't want to remove.
Bob LearnedCommented:
...or, a CData element, which is not parsed by the XML parser.

XML CDATA
http://www.w3schools.com/xml/xml_cdata.asp
JavaScript Best Practices

Save hours in development time and avoid common mistakes by learning the best practices to use for JavaScript.

Jesus RodriguezIT ManagerCommented:
use the replace,substring and charindex fuctions combine to accomplish what you need on the SQL query. Some examples of that they can do

SELECT REPLACE('123456789','456','')
result:= 123789

DECLARE @FullName VarChar(25)
SET @FullName = 'www.java2s.com'
SELECT SUBSTRING(@FullName, 1,CHARINDEX('java2s', @FullName) - 1)

can you post an example of the Field with the html markup ??
smowerAuthor Commented:
Thank you.  Here is an example of a record from the ShortDescription field with html markup.

<b>Replaced by model <a href="http://www.competitiveedgeproducts.com/detail.aspx?ID=829">4428</a></b> - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st
Jesus RodriguezIT ManagerCommented:
Here is the complete code base on your Field Short description

DECLARE @FullName VarChar(max)
DECLARE @FIRSTPOS VARCHAR(3)
DECLARE @SECPOS VARCHAR(3)

SET @FullName = '<b>Replaced by model <a href="http://www.competitiveedgeproducts.com/detail.aspx?ID=829">4428</a></b> - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st '
SET @FIRSTPOS=CHARINDEX('<a href', @FullName)
SET @SECPOS=CHARINDEX('">', @FullName)

SELECT REPLACE(REPLACE(REPLACE(@FULLNAME,SUBSTRING(@FULLNAME,CHARINDEX('<a href', @FullName),CONVERT(INT,@SECPOS)-CONVERT(INT,@FIRSTPOS)+2),''),'<b>',''),'</a></b>','')

Open in new window


the result will be
Replaced by model 4428 - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st

Try an let me know
smowerAuthor Commented:
Thanks so can we not use regular expressions to replace with sql?  If not, can we go to the vb.net code and run a regular expression on a node of the xml document that is created to replace?   I was hoping we could do something dynamic like:

xml1.DocumentContent.ShortDescriptionNode = Regex.Replace(objDataSet.GetXml(), "<[^>]*>", "")

or
Regex.Replace(objDataSet.Tables.ShortDescription,"<[^>]*>", "")

however, I don't have the code formatted right so it is not working.  Do you know if this is a way of doing it and how it would be set up?

Thanks,
Shawn
Jesus RodriguezIT ManagerCommented:
After this line
objDataAdapter.Fill(objDataSet, "model")

you can go thru every datarow of the dataset and then modifie the shortdescription with the regular expression

For Each Rw as DataRow in objDataset.Tables("model").Rows
   RW.BeginEdit
   RW("SHortDescription")=Regex.Replace(RW("SHortDescription"),"<[^>]*>", "")
   RW.EndEdit()
   RW.AcceptChanges
Next

Open in new window


And later do the rest of the code

Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" />

Open in new window

Jesus RodriguezIT ManagerCommented:
Your final code will be something like this
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Xsl" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<%@ Page Language="vb"%>
<script language="vb" runat="server">
Sub Page_Load()
Dim strConnect As String
strConnect = StoreFrontConfiguration.ConnectionString
Dim strSelect As String
strSelect = "SELECT dbo.Products.Name, dbo.Products.ShortDescription, dbo.Products.ImageSmallPath, dbo.Products.uid, dbo.Products.DateModified, dbo.ProductCategory.CategoryID, dbo.Categories.Name AS CategoryName FROM dbo.Products INNER JOIN dbo.ProductCategory ON dbo.Products.uid = dbo.ProductCategory.ProductID INNER JOIN dbo.InventoryInfo ON dbo.Products.uid = dbo.InventoryInfo.ProdID INNER JOIN dbo.Categories ON dbo.ProductCategory.CategoryID = dbo.Categories.uid WHERE ((dbo.Products.IsActive = 1) AND (dbo.ProductCategory.CategoryID = 3) AND (dbo.InventoryInfo.Tracked = 0)) OR ((dbo.ProductCategory.CategoryID =3) AND (dbo.Products.IsActive = 1) AND (dbo.InventoryInfo.DefaultQTY > 0) AND (dbo.InventoryInfo.Tracked = 1));" 
Dim objDataSet As New DataSet()
Dim objConnect As New OleDbConnection(strConnect)
Dim objDataAdapter As New OleDbDataAdapter(strSelect, objConnect)
objDataAdapter.Fill(objDataSet, "model")
For Each Rw as DataRow in objDataset.Tables("model").Rows
   RW.BeginEdit
   RW("ShortDescription")=Regex.Replace(RW("ShortDescription"),"<[^>]*>", "")
   RW.EndEdit()
   RW.AcceptChanges
Next 
Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" /> 

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
smowerAuthor Commented:
Awesome! That worked!  Thank you so much!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Visual Basic.NET

From novice to tech pro — start learning today.