• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1073
  • Last Modified:

How to handle malformed XML in EXTRACT called from SQL

I have a query EXTRACTing values from standalone UTF-8 xml in a varchar2 field e.g.

select (extract(xmltype(tr.txn_raw_data), '/FEE/FEEID/text()').GetStringVal()) as Foo

it falls over due to malformed XML in some rows. How do I handle this? I'm looking for some way of testing the contents of the column for wellformedness so that I can write something like:

select (case when >>isMalformed<<(...) then 'XXXX' else extract(....) end) as Foo

It has to be inlined code because I only have Select access to the DB.
0
monosodiumg
Asked:
monosodiumg
  • 3
  • 3
1 Solution
 
sdstuberCommented:
I don't think you can, extract requires the input to be well formed, that's one of it's assumptions.  You'll have to preprocess it somehow.

You could extract the parent node and then use some combination of substr, instr and other text functions to check the structure before processing the child, but that's not going to be efficient at all.

You could try querying twice, first for real, then a second dummy query if necessary,
If you get an error due to the doc being malformed then select 'xxxx' from dual instead of your normal "Foo", if Foo returns successfully, then just use whatever it returns.
0
 
monosodiumgAuthor Commented:
I understand extract require wf input, which is why I'm looking to test the input first and only pass it to extract if it is. That's the way i've structured the example.

Foo is the column alias not the source table. I do not understand what you are proposing. It seems to involve error trapping then rerunning the query. As I  stated, I can only operate within SQL, not PL/SQL.

Isn't there any function I can call on a string that will tell me if that string is well-formed xml or not?

The instr and substr approach is very much the last resort.
0
 
sdstuberCommented:
I'm not aware of any stand-alone function that will take text and check for well-formed xml content.

You don't need to compile a procedure, you can use an anonymous block of pl/sql in sql*plus, or if you are using vb, c#, perl, java, etc.  whatever language you use should be able to capture an exception and issue a second query.

There are pl/sql procedures to check for well-formed (and/or dtd/xsd validation as well) but those won't be usable if you can't use pl/sql at all.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
monosodiumgAuthor Commented:
Used a replace to turn the naked ampersands into the proper character entities.
0
 
sdstuberCommented:
oh, I didn't know you were looking for something that simple.  From the question explanation, it seemed like you were looking for something more open ended.
0
 
monosodiumgAuthor Commented:
I was looking for something smarter. My solution deals with ampersands only and not at all intelligently. I'd give my answer a D grade if there were one. Your attempts would have been relevant if I were working in some host language but all I can do is SELECT and there's no error handing within that (that I know of). Nor is there any function that will tell me if a supplied varchar2 is well-formed XML or not.
0

Featured Post

NEW Veeam Backup for Microsoft Office 365 1.5

With Office 365, it’s your data and your responsibility to protect it. NEW Veeam Backup for Microsoft Office 365 eliminates the risk of losing access to your Office 365 data.

  • 3
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now