Solved

Validating XML file by using xerces-C

Posted on 2006-04-28
Last Modified: 2013-11-19
Hi,

1. I would like to do a simple check of my XML input file for XML syntax errors, e.g. a missing closing tag. An example XML input file is below:

<?xml version="1.0" encoding="utf-8" ?>
<group name=A>
  <members>
    < member name="ha"   age="22"/>
    < member name="he"   age="25"/>  
  </members>
</group>


I tried to turn on validation with the code below. However, it does not seem to work: even when I purposely remove </member>, no error message is reported. What is the correct way to do it?


      parser_.setValidationScheme( xercesc::XercesDOMParser::Val_Always ) ;
      parser_.setDoNamespaces( true ) ;
      parser_.setDoSchema( true ) ;
      parser_.setLoadExternalDTD( true ) ;

2. I would like to validate my input file to make sure the user didn't provide extra information, even when the input file has correct XML syntax.

Desired XML input file:
<?xml version="1.0" encoding="utf-8" ?>
<group name=A>
  <members>
    < member name="ha"   age="22"/>
    < member name="he"   age="25"/>  
  </members>
</group>

If the user provides something like the following, I need to flag an error:
<?xml version="1.0" encoding="utf-8" ?>
<group name=A>
  <members>
    < member name="ha"   age="22"  gender="female"/>  => extra information that is not needed, flag error
    < member name="he"   age="25"  gender="male"/>  
  </members>
</group>
 

How should I code Xerces-C to do this validation?


thanks
 
Question by:pupuboo

Accepted Solution

by:
Geert Bormans earned 1200 total points
ID: 16568802
Hi pupuboo,
> parser_.setLoadExternalDTD( true ) ;

for validation you need a DTD or a schema
you don't seem to have one

here is a simple DTD for your needs

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT  group  (members)+ >
<!ATTLIST group
     name  CDATA #REQUIRED  >
<!ELEMENT  members (member)+ >
<!ELEMENT  member EMPTY >
<!ATTLIST member
     name  CDATA #REQUIRED  
     age  CDATA  #REQUIRED  >

if you save that in a file (e.g. members.dtd), then you need to reference it like this in the XML file

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE group SYSTEM "members.dtd">
<group name="A">
  <members>
    <member name="ha"   age="22"/>
    <member name="he"   age="25"/>
  </members>
</group>

now have this in your code
     parser_.setValidationScheme( xercesc::XercesDOMParser::Val_Always ) ;
     parser_.setLoadExternalDTD( true ) ;

some observations on your XML

you can't have a space in front of the element name in a start-tag (as you have in < member ...>)
attribute values need to be in quotes (name=A is not allowed)

cheers

Geert
 

Author Comment

by:pupuboo
ID: 16601282
Hi, thanks for the reply. It helps.
And I found that I need to add an ErrorHandler class so that the errors are reported.

However, I have one more question:

What if I need to restrict the attribute values? For example, the member name can only be "ha", "he", or "ho", and the member age can only be between 18 and 30.

How should I write the DTD?

thanks
pupuboo

Expert Comment

by:Geert Bormans
ID: 16609096
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT  group  (members)+ >
<!ATTLIST group
     name  CDATA #REQUIRED  >
<!ELEMENT  members (member)+ >
<!ELEMENT  member EMPTY >
<!ATTLIST member
     name  (ha | he | ho )  #REQUIRED  
     age  (18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 ) #REQUIRED  >

the age is a bit clumsy, but DTDs don't have a notion of datatypes...
so you have to list the possible string values

it is different for XML Schema, but that adds complexity.
If you can live with the clumsiness of this DTD,
leave it as it is

cheers

Geert
 

Author Comment

by:pupuboo
ID: 16632761
Hi,

Sorry, it makes me think of another question.

What if the conditions have to be paired as below?

for name = "ha", age must be between 16 and 18
for name = "he", age must be between 18 and 20
for name = "ho", age must be between "twenty-one" and "twenty-two" (must be in words instead of digits)

Would it be easier using a DTD or using XML Schema?


thanks :)

Expert Comment

by:Geert Bormans
ID: 16634290
Hi,

this is a so-called co-occurrence constraint.
You can't express that using a DTD, and you can't express it in W3C XML Schema 1.0 either

this is actually a constraint that requires a Schematron schema
for that you need an extra validation step (in another layer)

you can also check this constraint in XSLT:
validate with the DTD first... and then use an XSLT to check the remaining constraints
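
as an alternative to Schematron or XSLT, the same second-layer check can be done in application code once the DTD validation has passed. A minimal C++ sketch, assuming the name and age attribute values have already been read from the DOM as strings, and that the DTD declares age as CDATA so that "twenty-one" gets through the first step. isAllowedMember and parseSmallInt are made-up names:

```cpp
#include <string>

// Hypothetical helper: converts a string of one to three decimal digits to an int.
static bool parseSmallInt(const std::string& text, int& value) {
    if (text.empty() || text.size() > 3) return false;
    value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    return true;
}

// Returns true when the name/age pair satisfies the pairing from the question:
//   "ha" -> 16..18, "he" -> 18..20, "ho" -> "twenty-one" or "twenty-two".
bool isAllowedMember(const std::string& name, const std::string& age) {
    int years = 0;
    if (name == "ha") return parseSmallInt(age, years) && years >= 16 && years <= 18;
    if (name == "he") return parseSmallInt(age, years) && years >= 18 && years <= 20;
    if (name == "ho") return age == "twenty-one" || age == "twenty-two";
    return false;  // any other name is rejected
}
```

call it for every member element after parsing and flag an error whenever it returns false.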

cheers

Geert
 

Author Comment

by:pupuboo
ID: 16661286
thanks for the answer :)