Solved

Remove HTML Tags from ShortDescription Field in VB.net ASP.net for XML Document

Posted on 2014-01-01
12
618 Views
Last Modified: 2014-01-02
Hello,

I have an asp.net page that runs a sql query and makes an xml document that is transformed by an xslt as an rss feed.  The ShortDescription field has html markup in it and I need to find a way to strip out the html tags from the ShortDescription field.  Can anyone show me what I need to add to my code below  to accomplish this (in as few lines of code as possible)?

<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Xsl" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<%@ Page Language="vb"%>
<script language="vb" runat="server">
Sub Page_Load()
Dim strConnect As String
strConnect = StoreFrontConfiguration.ConnectionString
Dim strSelect As String
strSelect = "SELECT dbo.Products.Name, dbo.Products.ShortDescription, dbo.Products.ImageSmallPath, dbo.Products.uid, dbo.Products.DateModified, dbo.ProductCategory.CategoryID, dbo.Categories.Name AS CategoryName FROM dbo.Products INNER JOIN dbo.ProductCategory ON dbo.Products.uid = dbo.ProductCategory.ProductID INNER JOIN dbo.InventoryInfo ON dbo.Products.uid = dbo.InventoryInfo.ProdID INNER JOIN dbo.Categories ON dbo.ProductCategory.CategoryID = dbo.Categories.uid WHERE ((dbo.Products.IsActive = 1) AND (dbo.ProductCategory.CategoryID = 3) AND (dbo.InventoryInfo.Tracked = 0)) OR ((dbo.ProductCategory.CategoryID =3) AND (dbo.Products.IsActive = 1) AND (dbo.InventoryInfo.DefaultQTY > 0) AND (dbo.InventoryInfo.Tracked = 1));" 
Dim objDataSet As New DataSet()
Dim objConnect As New OleDbConnection(strConnect)
Dim objDataAdapter As New OleDbDataAdapter(strSelect, objConnect)
objDataAdapter.Fill(objDataSet, "model")
Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" />

Open in new window


Thanks in Advance!
Shawn
0
Comment
Question by:smower
12 Comments
 

Author Comment

by:smower
Comment Utility
I can't really see any changes to the code to strip out html tags out of shortdescription field?  Did you just do some formatting changes?
0
 
LVL 4

Expert Comment

by:ItWorked
Comment Utility
I think
Server.HTMLEncode

Open in new window

will work fine if you don't want to remove.
0
 
LVL 96

Expert Comment

by:Bob Learned
Comment Utility
...or, a CData element, which is not parsed by the XML parser.

XML CDATA
http://www.w3schools.com/xml/xml_cdata.asp
0
 
LVL 13

Expert Comment

by:Jesus Rodriguez
Comment Utility
use the replace,substring and charindex fuctions combine to accomplish what you need on the SQL query. Some examples of that they can do

SELECT REPLACE('123456789','456','')
result:= 123789

DECLARE @FullName VarChar(25)
SET @FullName = 'www.java2s.com'
SELECT SUBSTRING(@FullName, 1,CHARINDEX('java2s', @FullName) - 1)

can you post an example of the Field with the html markup ??
0
 

Author Comment

by:smower
Comment Utility
Thank you.  Here is an example of a record from the ShortDescription field with html markup.

<b>Replaced by model <a href="http://www.competitiveedgeproducts.com/detail.aspx?ID=829">4428</a></b> - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st
0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 
LVL 13

Expert Comment

by:Jesus Rodriguez
Comment Utility
Here is the complete code base on your Field Short description

DECLARE @FullName VarChar(max)
DECLARE @FIRSTPOS VARCHAR(3)
DECLARE @SECPOS VARCHAR(3)

SET @FullName = '<b>Replaced by model <a href="http://www.competitiveedgeproducts.com/detail.aspx?ID=829">4428</a></b> - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st '
SET @FIRSTPOS=CHARINDEX('<a href', @FullName)
SET @SECPOS=CHARINDEX('">', @FullName)

SELECT REPLACE(REPLACE(REPLACE(@FULLNAME,SUBSTRING(@FULLNAME,CHARINDEX('<a href', @FullName),CONVERT(INT,@SECPOS)-CONVERT(INT,@FIRSTPOS)+2),''),'<b>',''),'</a></b>','')

Open in new window


the result will be
Replaced by model 4428 - New Lifetime Products 4 Pack 44428 White Granite 48 x 24 inch Adjustable Folding Tables comes with molded high-density polyethylene top and patented st

Try an let me know
0
 

Author Comment

by:smower
Comment Utility
Thanks so can we not use regular expressions to replace with sql?  If not, can we go to the vb.net code and run a regular expression on a node of the xml document that is created to replace?   I was hoping we could do something dynamic like:

xml1.DocumentContent.ShortDescriptionNode = Regex.Replace(objDataSet.GetXml(), "<[^>]*>", "")

or
Regex.Replace(objDataSet.Tables.ShortDescription,"<[^>]*>", "")

however, I don't have the code formatted right so it is not working.  Do you know if this is a way of doing it and how it would be set up?

Thanks,
Shawn
0
 
LVL 13

Expert Comment

by:Jesus Rodriguez
Comment Utility
After this line
objDataAdapter.Fill(objDataSet, "model")

you can go thru every datarow of the dataset and then modifie the shortdescription with the regular expression

For Each Rw as DataRow in objDataset.Tables("model").Rows
   RW.BeginEdit
   RW("SHortDescription")=Regex.Replace(RW("SHortDescription"),"<[^>]*>", "")
   RW.EndEdit()
   RW.AcceptChanges
Next

Open in new window


And later do the rest of the code

Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" />

Open in new window

0
 
LVL 13

Accepted Solution

by:
Jesus Rodriguez earned 500 total points
Comment Utility
Your final code will be something like this
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Xsl" %>
<%@ Import Namespace="System.Text.RegularExpressions" %>
<%@ Page Language="vb"%>
<script language="vb" runat="server">
Sub Page_Load()
Dim strConnect As String
strConnect = StoreFrontConfiguration.ConnectionString
Dim strSelect As String
strSelect = "SELECT dbo.Products.Name, dbo.Products.ShortDescription, dbo.Products.ImageSmallPath, dbo.Products.uid, dbo.Products.DateModified, dbo.ProductCategory.CategoryID, dbo.Categories.Name AS CategoryName FROM dbo.Products INNER JOIN dbo.ProductCategory ON dbo.Products.uid = dbo.ProductCategory.ProductID INNER JOIN dbo.InventoryInfo ON dbo.Products.uid = dbo.InventoryInfo.ProdID INNER JOIN dbo.Categories ON dbo.ProductCategory.CategoryID = dbo.Categories.uid WHERE ((dbo.Products.IsActive = 1) AND (dbo.ProductCategory.CategoryID = 3) AND (dbo.InventoryInfo.Tracked = 0)) OR ((dbo.ProductCategory.CategoryID =3) AND (dbo.Products.IsActive = 1) AND (dbo.InventoryInfo.DefaultQTY > 0) AND (dbo.InventoryInfo.Tracked = 1));" 
Dim objDataSet As New DataSet()
Dim objConnect As New OleDbConnection(strConnect)
Dim objDataAdapter As New OleDbDataAdapter(strSelect, objConnect)
objDataAdapter.Fill(objDataSet, "model")
For Each Rw as DataRow in objDataset.Tables("model").Rows
   RW.BeginEdit
   RW("ShortDescription")=Regex.Replace(RW("ShortDescription"),"<[^>]*>", "")
   RW.EndEdit()
   RW.AcceptChanges
Next 
Dim objDataView As New DataView(objDataSet.Tables("model"))
	  objDataSet.DataSetName = "models"
'	 objDataSet.WriteXml(Server.MapPath("models.xml"),XmlWriteMode.IgnoreSchema)
'response.write(objDataSet.GetXml())
Dim doc As XmlDocument = New XmlDocument()
       Dim trans As XslTransform = new XslTransform()
      trans.Load(Server.MapPath("./category.xslt"))
Xml1.DocumentContent = objDataSet.GetXml()
        xml1.Transform = trans
End Sub
</script>
<asp:Xml id="xml1" runat="server" /> 

Open in new window

0
 

Author Closing Comment

by:smower
Comment Utility
Awesome! That worked!  Thank you so much!
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Just a quick little trick I learned recently.  Now that I'm using jQuery with abandon in my asp.net applications, I have grown tired of the following syntax:      (CODE) I suppose it just offends my sense of decency to put inline VBScript on a…
Creating an analog clock UserControl seems fairly straight forward.  It is, after all, essentially just a circle with several lines in it!  Two common approaches for rendering an analog clock typically involve either manually calculating points with…
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now