Link to home
Start Free TrialLog in
Avatar of nhmedia
nhmedia

asked on

How to store html in SQL database and output to xml

Hi

I need to create an XML for a third party  so we can share some XML articles.
I want  to store each node in a data base field using SQL Server 2008
I will input the html with an html editor through a webform (asp.net)
I need the output to be well formed xml where special characters are properly escaped.
Do I need to encode the html when we save it?
What would be the best XML encoding to use on the feed?
We don't have a lot of experience of this so advice would be really helpful.
ASKER CERTIFIED SOLUTION
Avatar of cubaman_24
cubaman_24
Flag of Spain image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of lcohan
I suggest you choose the encoding based on your input chars and here's more detail about it:

http://www.w3schools.com/XML/xml_encoding.asp

As far as storing it and reading it from SQL I suggest you do it by using sql code like in the example below:


CREATE PROCEDURE dbo.my_XML_update
      @XMLData NTEXT,
      @myId SMALLINT
AS
SET NOCOUNT ON

-- Retrieve XML data
DECLARE @hocDoc INT

DECLARE @Comments NVARCHAR(2048)

EXEC SP_XML_PREPAREDOCUMENT @hocDoc OUTPUT, @XMLData

-- OPEN XML
-- Param1 is Pointer to DOC
-- Param2 is XML doc hiearchy
-- Param3 is Use Element Structure rather than Attributes


SELECT
      ContactId,
      ContactName
INTO
      #tmpContact
FROM
      OPENXML(@hocDoc, '/root/Contact', 2)
WITH (ContactId  INTEGER, ContactName NVARCHAR(50))

--this is how you populate the contact xml
SELECT Contact.ContactId,
            ISNULL(Contact.ContactName, '') AS ContactName
FROM #tmpContact Contact, Contact C
WHERE      Contact.ContactId = C.Id
ORDER BY      Contact.ContactName
FOR XML AUTO, ELEMENTS

-- Then you can do an INSERT
INSERT INTO Contact (Id, ContactName)
      SELECT ContactId,ContactName
      FROM #tmpContact

--or an UPDATE
UPDATE Contact SET ContactName = t.ContactName
FROM #tmpContact t
                  WHERE      contact.Id = #tmpContact.ContactId
            
-- Drop the temp Table
DROP TABLE #tmpContact

-- and MUST release XML doc to avoid memeory issues
EXEC SP_XML_REMOVEDOCUMENT @hocDoc
GO

Avatar of nhmedia
nhmedia

ASKER

Thanks, but I am not trying to store xml. I want to store text in varchar(max) fields, then output it in xml format. The text will contain html and other problem characters. These need to be escaped. I want to know if this is done when the html is outputted (I guess through the choice of encoding). Or when I save the data in my web page. Sorry if that is not clear...
Hello:
Is I said before, Tidy is capable of escaping and format html to produce well formed xml. Give it a try.
http://sourceforge.net/projects/tidynet/
You said "I want  to store each node in a data base field using SQL Server 2008" - right?
I believe my suggestion above it gives you just that by using SQL code where data is stored in a SQL table column (text, or varchar) like ContactName for instance. Good luck!
Avatar of nhmedia

ASKER

I found this worked