
XML validation using libxml2 library

Hi,

I am using the libxml2 library to parse an XML file on my box.
Everything seems to work fine with libxml2 except for validation of the XML file passed.

Basically when you have a valid XML file as below all APIs work fine.

<Identity>
    <name>Jinu</name>
    <age>23</age>
</Identity>

But when the XML is corrupted as below, the xmlParseFile() API just crashes.

<Iden>tity>
    <name>Jinu</name>
    <age>23</age>
</Identity>

Is there a way to validate the XML file either using libxml2 or otherwise?

Hope the libxml2 library is famous enough in this part of the world.. :)

-Jinu
Asked by jinumjoy

Julian Hansen Commented:
Um, sounds like you are looking for an AI solution.

If the file is corrupt, unless you know something about the corruption there is not much you can do to fix it. In this case you are asking libxml2 to know that a > is in the wrong place and that it must remove it.

Machine intelligence is not quite there - the best you can do is find out where the corruption occurred so you can fix it manually - there must be tools out there that can do that for you. I usually use IE or Excel (;) it gives you a line number), let these report the error, and then go and fix it manually.

jinumjoy (Author) Commented:
Hi Julian,

libxml2 is basically an XML parsing library distributed with most *nix flavours. So what I am describing needs no AI!!!! That's its job.. BTW I only want some routine that returns a failure status on a corrupt XML file, so that I can take some action.

-Jinu
baboo_ Commented:
Hi Julian:

As far as I know, xmlParseFile *does* know if the xml file is well-formed or not.  According to the API, it returns NULL if the file is not well-formed.  Is it xmlParseFile that's crashing?  Or is it the code you wrote that uses xmlParseFile?  My apologies if I'm stating the obvious, but a simple check like this:

xmlDocPtr doc;
doc = xmlParseFile("test.xml");

if (doc == NULL)
       printf("error: could not parse file\n");

Should take care of it...  

Here's the thing:  The interface of xmlParseFile states that it returns an xmlDocPtr if the doc is well-formed, and NULL if it's not.  If the xmlParseFile method itself is crashing, then that's a problem with the implementation, and you should probably try to find an updated distribution (or at least email the creators so they're aware).

Hope this helps!

baboo_
baboo_ Commented:
Hi Jinu, I meant.  oops.  Sorry...  Well, 'hi' to you, too Julian...

baboo_
Julian Hansen Commented:
and hullo to you too baboo_ ;)

Just to clarify - I thought Jinu was asking for a way to get libxml2 to parse the file, extract any valid data, and just bypass any corruption. The point I was trying to make was that if the file is corrupt, then until you have fixed the corruption the library will not read the file for you, i.e. it won't make decisions about how to deal with the corruption and continue processing ... just in case anybody hadn't realised that already ....

I also thought throwing my two year old up in the air and catching him was a good idea and now I have a back problem.

You see, this is what happens when one's brain decides to override what one's eyes are seeing and starts making up its own reality.

Sorry Jinu - misread the question - I will keep very quiet now because now that I have read your question properly I don't actually have an answer for you - but luckily it looks like baboo_ does ... ;)
jinumjoy (Author) Commented:
Hi baboo,

Found the problem. My piece of code is running under xinetd. The xmlParseFile() API has no problem: it does return NULL when corrupt XML is passed to it. But it also prints an error message on the console. Printing on the console does not make sense when you are a service under xinetd (stdin and stdout are mapped to the socket descriptor when running under xinetd). I have to find a way to disable this console error logging in libxml2. I would prefer that the API give me control to handle errors based on the return value.

Does anyone know how to disable the console error reporting in libxml2?

-Jinu
baboo_ Commented:
Hmm...  I found this in the API.  You could try this, but it's not 100% clear if xmlGenericErrorFunc will catch parsing errors.  If it did work, all you'd need to do would be to write your own xmlGenericErrorFunc with the proper function header, and then "register it" as the default.  Well, anyways, look here for the appropriate parts of the API:

http://www.xmlsoft.org/html/libxml-xmlerror.html#initGenericErrorDefaultFunc
http://www.xmlsoft.org/html/libxml-xmlerror.html#xmlGenericErrorFunc

baboo_
baboo_ Commented:
a-HA!!  Indeed, it is the case that using initGenericErrorDefaultFunc does this...  Here's someone else with the same problem who solved it this way:

http://mail.gnome.org/archives/xml/2004-May/msg00202.html

baboo_
jinumjoy (Author) Commented:
Hi baboo,

Have you tried using the function????? Adding the error handler causes a segmentation fault.
This is what my error handler looks like.

void myerrorhandler(void *cts, const char *msg, ...)
{
    return;
}


Registration is done as follows...
initGenericErrorDefaultFunc((xmlGenericErrorFunc *)&myerrorhandler);

Anything wrong with what I am doing???

-Jinu
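
One possible explanation for the segfault, offered as a guess rather than a confirmed diagnosis: the (xmlGenericErrorFunc *) cast suggests that initGenericErrorDefaultFunc() expects the address of a variable that holds a handler, while &myerrorhandler is the address of the function itself. If so, the library would read the function's machine code as if it were a pointer. The sketch below does not call libxml2 at all; handler_fn, install_handler and register_correctly are hypothetical stand-ins used only to show the difference.

```c
#include <stddef.h>

/* Same shape as the handler in the post above. */
typedef void (*handler_fn)(void *ctx, const char *msg, ...);

static handler_fn current_handler = NULL;

/* Hypothetical stand-in for an init function that takes a POINTER TO a
 * handler variable and copies the handler out of it. */
static void install_handler(handler_fn *slot)
{
    if (slot != NULL)
        current_handler = *slot;   /* reads a function pointer from *slot */
}

static void myerrorhandler(void *ctx, const char *msg, ...)
{
    (void)ctx;   /* deliberately ignore everything */
    (void)msg;
}

/* Returns 1 if the handler ended up installed as intended. */
static int register_correctly(void)
{
    handler_fn h = myerrorhandler;   /* a real variable holding the address */
    install_handler(&h);             /* pass the variable's address, no cast */
    return current_handler == myerrorhandler;
}
```

With this shape, writing install_handler((handler_fn *)&myerrorhandler) would compile because of the cast, but *slot would then be whatever bytes sit at the start of the function, which is the kind of mistake that tends to crash only when the handler is first invoked.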
jinumjoy (Author) Commented:
hey baboo,

Thanks, I got it working. Used xmlSetGenericErrorFunc() to register the error handler.

Thanks,
-Jinu

Question has a verified solution.
