Convert TEXT field to XML field

I have a table with a TEXT field containing XML, and I want to convert the contents of that field to an XML field. I used CAST(tmp_text AS XML). The problem is that some of the rows do not contain valid XML, so I need to filter these out. I need something like an ISXML function, along the lines of ISDATE and ISNUMERIC. How can I do this?
Lexie asked:
 
Mark Wills (Topic Advisor) commented:
Well, there are five predefined replacement entities in the XML standard (&lt; &gt; &amp; &quot; &apos;), but a document could define a whole lot more...

You can check for  " < > & '    and can also check for '% & %' ....  BUT, by the sounds of it, whatever is generating the XML output is not doing the right thing with replacement characters, so you could possibly assume that any instance of & is not part of a standard entity...
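
If that assumption holds, one way to repair the data before casting could look like the sketch below. It escapes every ampersand and then undoes the double escaping for the five predefined entities. The sample value is hypothetical, and the approach assumes the data contains no numeric character references (such as &#38;), no custom entities, and no CDATA sections, all of which it would damage.

```sql
-- Hypothetical sample value; in practice this would be the TEXT column cast to varchar(max).
declare @s varchar(max)
set @s = '<accommodation name="Golf & Beach"><message>&gt;&gt; on request &lt;&lt;</message></accommodation>'

-- Step 1: escape every ampersand, including those that already start an entity.
set @s = replace(@s, '&', '&amp;')

-- Step 2: undo the double escaping for the five predefined entities.
set @s = replace(@s, '&amp;amp;',  '&amp;')
set @s = replace(@s, '&amp;lt;',   '&lt;')
set @s = replace(@s, '&amp;gt;',   '&gt;')
set @s = replace(@s, '&amp;quot;', '&quot;')
set @s = replace(@s, '&amp;apos;', '&apos;')

-- The bare ampersand is now escaped and the existing entities are unchanged,
-- so the cast no longer fails on this sample.
select cast(@s as xml) as fixed_xml
```

This only addresses stray ampersands. Any other well-formedness problem (unclosed tags, bad attribute quoting) would still make the cast fail.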

The problem with doing a validation check is that it needs to consider the entire document structure as well as the little bits. It can be very hard to do... First you really need a schema so you have a basis to compare against in terms of message structure, then there is the syntax check, etc.

For example, if you were to double-click on the XML file containing the above sample, Internet Explorer will likely be the viewer, and it is a program that is really at home with XML, but pounds to pence the error message will manifest itself elsewhere and you have to track back to find the real error. That best exemplifies the kind of challenge that can lie ahead. However, if the structure is pretty simple and extremely reliable in terms of tag content and hierarchy, then you can probably knock something up...

 
Mark Wills (Topic Advisor) commented:
Maybe check for the existence of < and > and </

You could create a function...


create function uXML (@input varchar(max))
returns bit
as
begin
  -- charindex takes the string to find first, then the string to search
  if charindex('<', @input) < 1 return 0
  if charindex('>', @input) < 1 return 0
  if charindex('</', @input) < 1 return 0
  return 1
end
go

declare @str varchar(2000)
set @str = 'wehwjc c f1234 < dhsa'

if dbo.uxml(@str) > 0 print 'True' else print 'false'
 
Lexie (Author) commented:
The tags are all right; the problem turns out to be an unescaped & character, like this in the TEXT field:
<accommodation name="Hotel Houda Golf & Beach Club"  cms-id="3871" >
 
I can filter on the & like this: tra_text NOT LIKE '%&%', but this would also filter out rows like these:
<message>&gt;&gt; Allotment released - Booking on request only &lt;&lt;</message>

So I am looking for a way to validate the XML on all aspects, not only some tags.
 
Lexie (Author) commented:
Too bad there is no function that returns a boolean like ISDATE and ISNUMERIC.
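
Short of a built-in, one workaround is to let the XML parser itself decide: attempt the cast row by row inside TRY...CATCH and keep only the rows that parse. The sketch below is a rough illustration, not a tested solution. The table name dbo.tmp_table and the key column id are hypothetical, and since TRY...CATCH cannot be used inside a user-defined function, this has to run as a batch or stored procedure.

```sql
-- Assumed (hypothetical) schema: dbo.tmp_table(id int, tmp_text text)
declare @id int, @txt varchar(max), @failed int
declare @good table (id int primary key, tmp_xml xml)   -- rows that parsed
set @failed = 0

declare c cursor local fast_forward for
    select id, cast(tmp_text as varchar(max)) from dbo.tmp_table

open c
fetch next from c into @id, @txt
while @@fetch_status = 0
begin
    begin try
        -- The cast raises an XML parsing error if @txt is not well-formed.
        insert into @good (id, tmp_xml) values (@id, cast(@txt as xml))
    end try
    begin catch
        -- Skip the offending row and count it.
        set @failed = @failed + 1
    end catch
    fetch next from c into @id, @txt
end
close c
deallocate c

select id, tmp_xml from @good
print 'Rows that failed to parse: ' + cast(@failed as varchar(10))
```

This checks well-formedness only. Validating against a schema would additionally need a typed XML column or variable bound to an XML schema collection.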
 
Mark Wills (Topic Advisor) commented:
Well, at least SQL Server 2008 offers lax validation for those optional elements in a schema. Small steps, but there is still no "isXML". I might try to write one... Let me know if you need anything more.

Cheers,
Mark Wills
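
For readers on later versions: SQL Server 2012 introduced TRY_CONVERT, which returns NULL instead of raising an error when a conversion fails, and that gets close to an "isXML" test. The sketch below assumes that version or later, and again uses the hypothetical names dbo.tmp_table and id.

```sql
-- Requires SQL Server 2012 or later; table and key column names are hypothetical.

-- Rows whose text parses as XML, already converted:
select t.id,
       try_convert(xml, cast(t.tmp_text as varchar(max))) as tmp_xml
from dbo.tmp_table t
where try_convert(xml, cast(t.tmp_text as varchar(max))) is not null

-- Rows that hold text but fail to parse, for inspection:
select t.id, t.tmp_text
from dbo.tmp_table t
where t.tmp_text is not null
  and try_convert(xml, cast(t.tmp_text as varchar(max))) is null
```

The intermediate cast to varchar(max) is a cautious choice rather than a known requirement, since TEXT is a deprecated type with restricted conversions.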
Question has a verified solution.
