Problem converting DTD to XML schema

Hi gurus,

I am having a hard time converting the following DTD definitions to XML Schema. First, the DTD (only the portion I have problems with):

<!ELEMENT OtherServer (URL | MSISDN | (URL, MSISDN))>
<!ELEMENT URL (#PCDATA)>
<!ELEMENT MSISDN (#PCDATA)>

Below is what XMLSpy or XMLWriter gives me when I use them to convert the DTD to XML Schema (only the portion I have problems with):

<xsd:element name="OtherServer">
  <xsd:complexType>
    <xsd:choice>
      <xsd:element ref="URL"/>
      <xsd:element ref="MSISDN"/>
      <xsd:sequence>
        <xsd:element ref="URL"/>
        <xsd:element ref="MSISDN"/>
      </xsd:sequence>
    </xsd:choice>
  </xsd:complexType>
</xsd:element>
<xsd:element name="MSISDN" type="xsd:string"/>
<xsd:element name="URL" type="xsd:string"/>

This looks simple enough. However, in actual usage it fails. My test code looks like this:

import java.io.StringReader;
import org.jdom.Document;
import org.jdom.input.SAXBuilder;

// Validating Xerces parser with XML Schema validation switched on
SAXBuilder xmlParser = new SAXBuilder("org.apache.xerces.parsers.SAXParser", true);
xmlParser.setFeature("http://apache.org/xml/features/validation/schema", true);
xmlParser.setProperty("http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation", "WV-VERDISC.xsd");
// xmlStr holds the XML data
StringReader rdr = new StringReader(xmlStr);
Document xmlDoc = xmlParser.build(rdr);

Sample XML (only the portion I have problems with):

   <OtherServer>
      <URL>http://www.blah.com/XMLTest</URL>
      <MSISDN>+1234567890</MSISDN>
   </OtherServer>
   <OtherServer>
      <URL>http://www.blah.com/XMLTest</URL>
   </OtherServer>
   <OtherServer>
      <MSISDN>+1234567890</MSISDN>
   </OtherServer>

So, according to the DTD, all three OtherServer elements above are valid. With the converted XML Schema, however, the parser reports OtherServer as NOT valid whenever it has both URL and MSISDN child elements. After much trial and error, I am wondering whether this DTD definition can be converted to XML Schema at all. I tried group declarations and that didn't work either. I am using JDOM 1.0 and Xerces 2.6.2. Any clues are greatly appreciated! TIA!
Asked by Andyfungkl

DitmarBehn commented:
Hi,

It seems that xsd:choice is not the right way to handle this situation. A choice is really an either/or (XOR), without the possibility of having several or all of the child nodes present.

Try it with the following xsd snippet and see if it fits your need:

<xs:sequence>
  <xs:element name="OtherServer" maxOccurs="unbounded">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="URL" type="xs:anyURI" minOccurs="0"/>
        <xs:element name="MSISDN" type="xs:int" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:sequence>

Regards, Ditmar
rdcpro commented:
Hmmm... I don't see anything at all wrong with the OP's schema. It validated fine for me, but I'm not using JDOM or Xerces. Essentially it says a choice of either A, or B, or both.

The problem with using an xs:sequence is that it allows this:

<OtherServer>
</OtherServer>

which the DTD does not allow. If an empty OtherServer is acceptable, then the xs:sequence approach works.

I'm wondering, too, if there is a need to constrain the order of the URL and MSISDN elements.  If you don't want to constrain the order of the child elements of OtherServer, then use the xs:all declaration:

  <xs:element name="OtherServer" maxOccurs="unbounded">
    <xs:complexType>
      <xs:all>
        <xs:element name="URL" type="xs:anyURI" minOccurs="0"/>
        <xs:element name="MSISDN" type="xs:int" minOccurs="0"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

But again, this allows an empty OtherServer element.
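
The point about empty elements can be checked with a minimal sketch using the JDK's built-in `javax.xml.validation` API instead of JDOM. The class name `EmptyOtherServerCheck` and the inline schema string are assumptions made for illustration; the content model mirrors the all-optional xs:sequence suggestion above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original thread.
class EmptyOtherServerCheck {
    // Both children optional, as in the xs:sequence suggestion.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='OtherServer'><xs:complexType><xs:sequence>"
      + "<xs:element name='URL' type='xs:anyURI' minOccurs='0'/>"
      + "<xs:element name='MSISDN' type='xs:int' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the document validates against XSD, false on any schema or validation error.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("empty element valid: " + isValid("<OtherServer></OtherServer>"));
    }
}
```

If the empty element must be rejected, as the DTD requires, a content model with at least one mandatory particle is needed.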

Regards,
Mike Sharp
Andyfungkl (author) commented:
Hi Mike,

Just curious, which XML parser are you using? I suspect it is a problem with JDOM + Xerces too, since I don't see anything wrong either.

Regards,
 Andy

rdcpro commented:
I did the validation using XML Spy, Ver 5, Release 4.  While I have found obscure validation bugs in XML Spy, this seems like a pretty common content model to me.  In fact, it's weird that JDOM and Xerces would have problems with it.  I wonder if something else might be the problem...

Regards,
Mike Sharp
Geert Bormans (Information Architect) commented:
Hi there,

just to add my two cents.

The problem is that, basically, both the schema and the DTD are invalid.
Old SGML parsers that were turned into XML parsers would spot that the DTD content model is ambiguous.
To explain this simply: when the parser sees an opening <URL> element, it cannot tell whether it is in the <URL> branch or the <URL><MSISDN> branch without looking ahead.
If you build your parser as a finite state machine, it would need lookahead, which you don't want because it is slow.

The problem with DTDs is that parsers are not obliged to report this... most modern DTD validating parsers DON'T.

If you rewrite your DTD like this:
<!ELEMENT OtherServer ( MSISDN | (URL, MSISDN?))>
it is no longer ambiguous, and it is equivalent to what you mean.
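
The equivalence claim can be sanity-checked with a small sketch that treats the two content models as regular expressions over child-element sequences. The encoding (`U` for URL, `M` for MSISDN) and the class name `ContentModelEquivalence` are assumptions made for illustration only; this is a bounded enumeration, which is sufficient here because neither model contains repetition.

```java
import java.util.regex.Pattern;

// Hypothetical helper class: U stands for a URL child, M for an MSISDN child.
class ContentModelEquivalence {
    // (URL | MSISDN | (URL, MSISDN))
    static final Pattern ORIGINAL = Pattern.compile("U|M|UM");
    // (MSISDN | (URL, MSISDN?))
    static final Pattern REWRITTEN = Pattern.compile("M|UM?");

    // Compares both models on every child sequence of length 0..maxLen.
    static boolean sameLanguageUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder children = new StringBuilder();
                for (int i = 0; i < len; i++) {
                    children.append(((bits >> i) & 1) == 0 ? 'U' : 'M');
                }
                String s = children.toString();
                if (ORIGINAL.matcher(s).matches() != REWRITTEN.matcher(s).matches()) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("models agree: " + sameLanguageUpTo(4));
    }
}
```

Both models accept exactly the sequences URL, MSISDN, and URL followed by MSISDN; only the rewritten one lets a parser pick a branch from the first child alone.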

The schema as originally posted is a faithful translation of the DTD snippet that was also posted.
On <xs:element>, the minOccurs attribute defaults to 1.
So in my mind the originally posted schema does not allow <OtherServer></OtherServer>
(please note that XML Spy doesn't always do the right thing with schema validation; Xerces usually does).

If I parse the original snippet, my parser says (correctly) that it violates the Unique Particle Attribution constraint
(http://www.w3.org/TR/xmlschema-1/#cos-nonambig).
Schema validating parsers HAVE to report this. This is why you never found the problem with the DTD, only with the schema.
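
To reproduce this kind of report without JDOM, here is a minimal sketch that only compiles the originally posted schema with the JDK's `javax.xml.validation.SchemaFactory`. The class name `UpaCheck` is hypothetical, and the sketch assumes the JDK's bundled schema processor, which is expected to reject the ambiguous content model at load time. With a plain Xerces parser configured as in the question, this check is tied to the `http://apache.org/xml/features/validation/schema-full-checking` feature, which is off by default; that may be why the original setup produced a validation failure rather than a schema error.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original thread.
class UpaCheck {
    // The schema as generated from the ambiguous DTD content model.
    static final String AMBIGUOUS_XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='OtherServer'><xsd:complexType><xsd:choice>"
      + "<xsd:element ref='URL'/><xsd:element ref='MSISDN'/>"
      + "<xsd:sequence><xsd:element ref='URL'/><xsd:element ref='MSISDN'/></xsd:sequence>"
      + "</xsd:choice></xsd:complexType></xsd:element>"
      + "<xsd:element name='MSISDN' type='xsd:string'/>"
      + "<xsd:element name='URL' type='xsd:string'/>"
      + "</xsd:schema>";

    // Returns the schema compilation error message, or null if the schema compiles.
    static String compileError(String xsd) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            return null;
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println("schema error: " + compileError(AMBIGUOUS_XSD));
    }
}
```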

My non-ambiguous DTD alternative translates into the following schema:

  <xs:element name="OtherServer">
    <xs:complexType>
      <xs:choice>
        <xs:element ref="MSISDN"/>
        <xs:sequence>
          <xs:element ref="URL"/>
          <xs:element minOccurs="0" ref="MSISDN"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="URL" type="xs:string"/>
  <xs:element name="MSISDN" type="xs:string"/>

This schema works with your examples.
I guess this closes the subject :-)
Have a nice evening

Gertone
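
Gertone's revised schema can be exercised against the three sample elements from the question with a short sketch based on the JDK's `javax.xml.validation` API. The class name `FixedSchemaCheck` and the inline strings are assumptions for illustration; the schema content itself is copied from the post above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original thread.
class FixedSchemaCheck {
    // choice( MSISDN | sequence( URL, MSISDN? ) )
    static final String FIXED_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='OtherServer'><xs:complexType><xs:choice>"
      + "<xs:element ref='MSISDN'/>"
      + "<xs:sequence><xs:element ref='URL'/><xs:element minOccurs='0' ref='MSISDN'/></xs:sequence>"
      + "</xs:choice></xs:complexType></xs:element>"
      + "<xs:element name='URL' type='xs:string'/>"
      + "<xs:element name='MSISDN' type='xs:string'/>"
      + "</xs:schema>";

    static final String URL = "<URL>http://www.blah.com/XMLTest</URL>";
    static final String MSISDN = "<MSISDN>+1234567890</MSISDN>";

    // Wraps the given children in an OtherServer element and validates the result.
    static boolean isValid(String children) {
        String xml = "<OtherServer>" + children + "</OtherServer>";
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(FIXED_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("URL + MSISDN: " + isValid(URL + MSISDN));
        System.out.println("URL only:     " + isValid(URL));
        System.out.println("MSISDN only:  " + isValid(MSISDN));
        System.out.println("empty:        " + isValid(""));
    }
}
```

Unlike the all-optional xs:sequence and xs:all variants, this content model is expected to reject an empty OtherServer, which matches the DTD.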
Geert Bormans (Information Architect) commented:
Well, if it helps you to make a recommendation...
I am convinced that the question was answered correctly and completely :-)
I don't care about the points, but I would hate the solution being removed from the archive.

To quote my parser on the DTD:
Markup Error (0004) on line 2 in file Markup Stream:
A content model must not be ambiguous.
For the declared element "OtherServer", the element "URL" is ambiguous
in the content model.

Gertone
rdcpro commented:
I agree.  Points to Gertone.

Regards,
Mike Sharp
Andyfungkl (author) commented:
Sorry for not checking Experts Exchange for a while. Anyway, Gertone really nailed the problem. Thanks to both of you, Gertone and rdcpro!