Xerces: Ignoring blank characters in XML Source

Hi,
I want to ignore whitespace while parsing an XML source.

Example:
<NODE12>
<NODE21/>
<NODE22/>
</NODE12>

The node value of the text node under NODE12 is the ASCII characters 10-32-32-32-32-32-32
(essentially blank characters). How can I configure my parser to ignore this whitespace and return an empty string for the node value or, better still, a NULL text node?

I don't know if this is relevant, but I also require

<NODE
12>
<NODE21/>
<NODE22/>
</NODE12>

to be an illegal XML Fragment, even if I am ignoring blank spaces.

Further, if I try fetching the text of <NODE></NODE>, there is no node text. I want <NODE></NODE> to have a node text which is an empty string. At the same time <NODE/> should have a NULL node text.

The following are the details:

OS: HP 11
Parser: Xerces DOM Parser v2.2.0
Compilation: Xerces compiled on 64 bit, with aCC
Other Compiler Opts: -AA -mt

I am a novice at C++ - Xerces programming and am using Xerces to manipulate XML Sources. Any help will be greatly appreciated.

Thanks and Regards,
Praveen

BigRat

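Whitespace is normally preserved: an XML parser must pass all text, including whitespace, through to the application, and xml:space only signals how the application should treat it.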

There is a SAX method isIgnorableWhiteSpace() which you can apply to any text found to determine whether the text is white space.
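If you end up writing the check yourself, a minimal sketch might look like the following. It is plain C++ with no Xerces dependency, and the name isWhitespaceOnly is made up for illustration, not a Xerces call. It treats a string as blank when every character is one of the four XML whitespace characters, which covers the 10-32-32-32-32-32-32 sequence from the question.

```cpp
#include <string>

// Hypothetical helper, not part of Xerces: returns true when `text` is empty
// or consists only of the four XML whitespace characters
// (space 0x20, tab 0x09, line feed 0x0A, carriage return 0x0D).
template <typename Ch>
bool isWhitespaceOnly(const std::basic_string<Ch>& text)
{
    for (typename std::basic_string<Ch>::size_type i = 0; i < text.size(); ++i) {
        const Ch c = text[i];
        if (c != Ch(0x20) && c != Ch(0x09) && c != Ch(0x0A) && c != Ch(0x0D)) {
            return false;  // found a character that is real content
        }
    }
    return true;  // nothing but whitespace (or nothing at all)
}
```

Called as isWhitespaceOnly(std::string("\n      ")), it returns true, so the caller can then treat that text node as empty or skip it. Note that an empty string also counts as whitespace-only here; if <NODE></NODE> and <NODE/> have to be told apart, that distinction needs to be handled separately.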

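Lastly, the second fragment you gave IS illegal irrespective of what happens to the whitespace: the line break ends the element name at NODE, so the start tag no longer matches </NODE12>, and a bare 12 is not a valid attribute.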

HTH

pballs

ASKER

In that case, I will implement a method similar to isIgnorableWhiteSpace() to do the check. Would that be the only, or the most elegant, solution?

Thanks and Regards..
P;

ASKER CERTIFIED SOLUTION

BigRat

pballs

ASKER

Thanks HTH. That's great help.