We help IT Professionals succeed at work.

How to handle malformed XML in EXTRACT called from SQL

monosodiumg
monosodiumg asked
on
1,118 Views
Last Modified: 2013-12-19
I have a query EXTRACTing values from standalone UTF-8 xml in a varchar2 field e.g.

select (extract(xmltype(tr.txn_raw_data), '/FEE/FEEID/text()').GetStringVal()) as Foo

it falls over due to malformed XML in some rows. How do I handle this? I'm looking for some way of testing the contents of the column for wellformedness so that I can write something like:

select (case when >>isMalformed<<(...) then 'XXXX' else extract(....) end) as Foo

It has to be inlined code because I only have Select access to the DB.
Comment
Watch Question

Sean StuberDatabase Developer & Administrator
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2012

Commented:
I don't think you can, extract requires the input to be well formed, that's one of it's assumptions.  You'll have to preprocess it somehow.

You could extract the parent node and then use some combination of substr, instr and other text functions to check the structure before processing the child, but that's not going to be efficient at all.

You could try querying twice, first for real, then a second dummy query if necessary,
If you get an error due to the doc being malformed then select 'xxxx' from dual instead of your normal "Foo", if Foo returns successfully, then just use whatever it returns.

Author

Commented:
I understand extract require wf input, which is why I'm looking to test the input first and only pass it to extract if it is. That's the way i've structured the example.

Foo is the column alias not the source table. I do not understand what you are proposing. It seems to involve error trapping then rerunning the query. As I  stated, I can only operate within SQL, not PL/SQL.

Isn't there any function I can call on a string that will tell me if that string is well-formed xml or not?

The instr and substr approach is very much the last resort.
Database Developer & Administrator
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2012
Commented:
Unlock this solution and get a sample of our free trial.
(No credit card required)
UNLOCK SOLUTION

Author

Commented:
Used a replace to turn the naked ampersands into the proper character entities.
Sean StuberDatabase Developer & Administrator
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2012

Commented:
oh, I didn't know you were looking for something that simple.  From the question explanation, it seemed like you were looking for something more open ended.

Author

Commented:
I was looking for something smarter. My solution deals with ampersands only and not at all intelligently. I'd give my answer a D grade if there were one. Your attempts would have been relevant if I were working in some host language but all I can do is SELECT and there's no error handing within that (that I know of). Nor is there any function that will tell me if a supplied varchar2 is well-formed XML or not.
Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a sample view!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.