Xerces C++ API - Parser problems

Posted on 2003-11-04
Last Modified: 2013-11-19

Hello everybody,

I have a problem with the parser and I am not sure whether it is a bug or I am doing something wrong.

The XML Schema defines a field like this:

<xsd:element name="some-field" type="xsd:positiveInteger" />

In the XML file that I need to parse, this field is sometimes present and sometimes absent.
When the field is completely absent, the parser raises an error. However, when the element is present but empty, like this:
<image-number/>
the parser doesn't raise an error. It parses the message as if everything were fine. This causes me problems because, according to the schema, I expect a value there.

Now, in the schema (as can be seen) there is no provision for the field to be nillable.

For that, the element should have been defined like this:
<xsd:element name="some-field" type="xsd:positiveInteger" nillable="true"/>

Is this a bug, or is that the way the parser is supposed to behave? I would have expected the parser to catch the case where the value of the field is not present.

Any help would be very much appreciated.
Question by:Mensana

Accepted Solution

savalou earned 250 total points
The schema says that you need to have a "some-field" element.  If you could omit it, it would have a minOccurs="0" attribute.  

So when you have the element, even if its content is empty, the parser is happy.

If you have a nillable="true" attribute in the schema definition, then you can explicitly set the xsi:nil attribute on the element to specify that it has a null value.  Look here for more info on that:

So I don't think the parser is misbehaving, sorry to say.

Author Comment


I understand what you're saying, and I'll give you the points for your effort in replying. What I don't understand is the point of declaring an element as "nillable". Why would one even bother saying that the "some-field" element is nillable when you can always put it like this:


instead of


and you'll achieve the same thing without having the parser complain about "nil-ness" that doesn't conform to the XML Schema.

Thanks again,

Expert Comment

Sometimes there's a difference between a blank and a null value.  A null could mean that the value was never assigned, whereas a blank could mean that blank is the actual value.  I wish I had a good example for you but I can't think of anything meaningful.  But I've encountered the situation, and in an IT world it makes sense to be able to distinguish between the two.
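
One way to picture the distinction, as a rough C++17 sketch rather than anything Xerces-specific: a middle-name field where "never supplied" and "supplied, and it is blank" are different facts. The Person struct and describeMiddleName() are made-up names for illustration; std::nullopt plays the role that xsi:nil="true" would play in the XML, and the empty string plays the role of an empty element.

```cpp
#include <optional>
#include <string>

// Hypothetical record type, for illustration only.
struct Person {
    std::string firstName;
    // std::nullopt -> the value was never supplied (what xsi:nil="true" would express)
    // ""           -> the value was supplied and it is blank (an empty element)
    std::optional<std::string> middleName;
};

// Turns the three possible states into text so the difference is visible.
std::string describeMiddleName(const Person& p) {
    if (!p.middleName.has_value())
        return "unknown";       // null: nobody recorded a value
    if (p.middleName->empty())
        return "none";          // blank: recorded, and there is no middle name
    return *p.middleName;       // an ordinary value
}
```

With this, Person{"Ann", std::nullopt} is described as "unknown" while Person{"Bob", std::string()} is described as "none". For a numeric type such as xsd:positiveInteger there is no valid blank value at all, which is why nil is the only way to say "no value" there.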

Author Comment


What you say makes sense. My problem is that now I have to check every element to see whether it has a value or not. Before, I relied on the parser to find errors in the XML message, but it appears that this is not reliable at all.
To understand what I am saying, here is a short example:

// before I had it like this

// get the document
DOMDocument *pXMLDomDocument = pDOMParser->getDocument();

// get the root of the document
DOMElement *pRoot = pXMLDomDocument->getDocumentElement();

// ... navigate to the node I wanted to read
DOMNode *pNode = MyFunction2FindANode( pRoot, "Label_of_Node" );

// get its value
DOMText *pTextNode = MyFunction2FindAValue( pNode );

// extract the value and do something with it
CString strElementValue = pTextNode->getData();

I would only check to see whether the pTextNode pointer is NULL for those elements that were "nillable":

// ... same as above

// get its value
DOMText *pTextNode = MyFunction2FindAValue( pNode );

// extract the value and do something with it
CString strElementValue = "";
if( pTextNode )
   strElementValue = pTextNode->getData();

Now I need to do it everywhere because the parser will not trigger an exception. Since I would have to do this check in a lot of places (which will considerably slow down the process), I thought there might be another solution.

Anyway, here are the points for you. Thanks again for taking the time to answer my question.

Expert Comment

I take it back.  I wrote without testing.  The parser should make a noise.  Are you sure you've got validation on?  I know more about the Java parser than C++, but I did a test using the SAXPrint executable that ships with Xerces C++ v2.3 and when I have an element with empty content but which is declared
   <xsd:element name="zip"    type="xsd:positiveInteger"/>
in the schema, it says:

  Message: Datatype error: Type:InvalidDatatypeValueException, Message:Value '' does not match regular expression facet '[+\-]?[0-9]+'.

Maybe you can run your instance document through this and see what happens?
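
To see why an empty element cannot pass, here is a rough standalone C++ sketch of the lexical check that the message above points at. This is not Xerces code: looksLikePositiveInteger() is a hypothetical helper, the pattern is an equivalent spelling of the one in the error message, and the function assumes surrounding whitespace has already been stripped.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for the datatype check; not part of Xerces.
// Assumes leading/trailing whitespace has already been removed from `text`.
bool looksLikePositiveInteger(const std::string& text) {
    // Optional sign followed by one or more digits, as in the reported facet.
    static const std::regex integerPattern("[+-]?[0-9]+");
    if (!std::regex_match(text, integerPattern))
        return false;   // the empty string fails here: there is no digit at all
    if (text[0] == '-')
        return false;   // negative numbers (and "-0") are not positive
    // Anything other than '+' and '0' must be a digit 1-9, so the value is > 0.
    return text.find_first_not_of("+0") != std::string::npos;
}
```

Under these assumptions "42" and "+7" are accepted, while "", "0", "-3" and "12a" are rejected, so a validating parser that really applies the schema has no way to accept an empty element of this type.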

Author Comment

OK, your message is a "heads-up" one for me. Here is the function that creates the parser:

XercesDOMParser *CreateValidatingParser( const XMLCh *schema )
{
    XercesDOMParser *parser = new XercesDOMParser;
    parser->setValidationScheme( XercesDOMParser::Val_Always );
    parser->setDoNamespaces( true );
    parser->setDoSchema( true );
    parser->setValidationSchemaFullChecking( false );
    parser->setExternalNoNamespaceSchemaLocation( schema );
    return parser;
}

The only thing that could affect the behavior would be to set the full checking:

parser->setValidationSchemaFullChecking( true );

The documentation says:

This method allows the user to turn full Schema constraint checking on/off.

Only takes effect if Schema validation is enabled. If turned off, partial constraint checking is done.

Full schema constraint checking includes those checking that may be time-consuming or memory intensive. Currently, particle unique attribution constraint checking and particle derivation restriction checking are controlled by this option.

The parser's default state is: false.

Do you suppose this would help? I am going to try it.
Thanks again.

Author Comment

Hey, I tried setting the Full Checking flag and it didn't help. The parser went ahead and processed the whole file without errors. I am still searching...

Expert Comment

In C++ you should install a custom ErrorHandler class on the parser (see the DOMCount sample),
or, if you are not interested in what the error is, you can simply call the
getErrorCount() method of the XercesDOMParser class after parsing (if there were errors,
it should return a value greater than 0).
Hope this helps.
Sorry for my bEd Anglish :)
