Solved

XSD - One or the other or both of them...

Posted on 2004-04-07
Last Modified: 2013-11-19
Well, I'm designing this XML schema and ran into something that seems all backwards once solved.

I have two elements; let us call them "EarningsUSD" and "EarningsEURO". Below are four snippets from the XML instance file: the first three must all validate against my schema definition, and the fourth must not.

SNIPPET 1:
    <someVar>123</someVar>
    <EarningsUSD>300</EarningsUSD>
    <anotherVar>Kung Fu Is</anotherVar>

SNIPPET 2:
    <someVar>123</someVar>
    <EarningsEURO>300</EarningsEURO>
    <anotherVar>Kung Fu Is</anotherVar>

SNIPPET 3:
    <someVar>123</someVar>
    <EarningsUSD>300</EarningsUSD>
    <EarningsEURO>300</EarningsEURO>
    <anotherVar>Kung Fu Is</anotherVar>

SNIPPET 4 (must NOT validate):
    <someVar>123</someVar>
    <anotherVar>Kung Fu Is</anotherVar>


I solved the problem by doing:
   <xs:element name="someVar"/>
   <xs:choice>
      <xs:sequence>
          <xs:element name="EarningsUSD"/>
          <xs:element name="EarningsEURO"/>
      </xs:sequence>
      <xs:sequence>
          <xs:element name="EarningsUSD"/>
      </xs:sequence>
      <xs:sequence>
          <xs:element name="EarningsEURO"/>
      </xs:sequence>
   </xs:choice>
   <xs:element name="anotherVar"/>

Now that leaves me with two occurrences of both EarningsUSD and EarningsEURO, meaning two places to maintain if a definition changes.

I would have thought that something like the snippet below would do the same trick but it doesn't:
  <xs:choice minOccurs="1" maxOccurs="2">
      <xs:element name="EarningsUSD" minOccurs="0" maxOccurs="1"/>
      <xs:element name="EarningsEURO" minOccurs="0" maxOccurs="1"/>
  </xs:choice>

So if anybody knows a better (simpler) way than the sequences-in-choice method to do "Select one, the other or both, but never none of them", then please please P-L-E-A-S-E HEEEEEEELP

/GoJoe
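
A note on the second attempt, offered as a sketch rather than a tested answer. With minOccurs="0" on both elements, the choice can be satisfied without emitting any element, so snippet 4 validates; and maxOccurs="2" lets the same element be chosen twice. A somewhat more compact content model keeps the fixed order (USD before EURO) and moves the shared definition into a named type, so there is one place to maintain. The type name EarningsAmount and the xs:decimal base are assumptions for illustration, not part of the original schema.

```xml
<!-- Hypothetical shared type: the single place to maintain the definition -->
<xs:simpleType name="EarningsAmount">
   <xs:restriction base="xs:decimal"/>
</xs:simpleType>

<!-- Inside the parent's xs:sequence, between someVar and anotherVar -->
<xs:choice>
   <xs:sequence>
      <!-- USD, optionally followed by EURO: covers snippets 1 and 3 -->
      <xs:element name="EarningsUSD" type="EarningsAmount"/>
      <xs:element name="EarningsEURO" type="EarningsAmount" minOccurs="0"/>
   </xs:sequence>
   <!-- EURO alone: covers snippet 2 -->
   <xs:element name="EarningsEURO" type="EarningsAmount"/>
</xs:choice>
```

Each branch of the choice begins with a different element, so a validator can pick the branch from the first element it sees. The three-sequence version has two branches that both begin with EarningsUSD, which strict processors may reject under the Unique Particle Attribution rule. EarningsEURO still appears twice here, but both occurrences share one type.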
Question by:gojoe
14 Comments
 
LVL 15

Expert Comment

by:dualsoul
ID: 10773420
No, the only method I know is complete enumeration of all possible cases, like you did.

You are lucky to have only 2 elements with 3 cases :))
0
 
LVL 3

Author Comment

by:gojoe
ID: 10773590
Lucky? I feel like kicking somebody hard, very hard, exactly where it hurts the most. Those somebodies might be the culprits who did not think this was important enough to be part of the "XML Schema Recommendation".

If others know about a solution to my problem, I'll promptly withdraw the harsh remarks but till then - shame on those %¤#¤%¤% culprits.

And BTW I've got it a little worse than the first post implies, the same problem spreads out through the whole damn schema definition.


/GoJoe


0
 
LVL 2

Expert Comment

by:jeffqyzt
ID: 10775939
You have to define a sequence for the parent of this node that has an unbounded choice for all of the "at least one" options.  See below (I arbitrarily called the parent "root".)  By being part of the sequence, it will require at least one, and by being unbounded, it will allow (n) of them to occur.  What it will *not* disqualify are repetitions of the same element (i.e. if you had two elements named "EarningsUSD" it would still pass validation.)

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
   <xs:element name="EarningsEURO" />
   <xs:element name="EarningsUSD" />
   <xs:element name="anotherVar"/>
   <xs:complexType name="CurrencyParent">
      <xs:sequence>
         <xs:element ref="someVar"/>
         <xs:choice maxOccurs="unbounded">
            <xs:element ref="EarningsUSD"/>
            <xs:element ref="EarningsEURO"/>
         </xs:choice>
         <xs:element ref="anotherVar"/>
      </xs:sequence>
   </xs:complexType>
   <xs:element name="root" type="CurrencyParent"/>
   <xs:element name="someVar" />
</xs:schema>

Does this do what you need?
0
 
LVL 26

Expert Comment

by:rdcpro
ID: 10776558
To gojoe, I would say that if your schema definition is changing all that often, updating the schema will be the least of your worries.  The larger problem will be updating the applications that depend on it.  

If the design pattern you're describing exists in multiple places in the schema, and they may all be modified as in the case that you're intending to add currencies in the future to the CurrencyParent type, you might be interested in this design pattern for Schemas:

http://www.xfront.com/eXtreme-eXtensibility.html

 
Regards,
Mike Sharp
0
 
LVL 15

Expert Comment

by:dualsoul
ID: 10777854
jeffqyzt,

It's an interesting idea, but this snippet would be valid, and I think it shouldn't be:

.......................................................
       <someVar>123</someVar>
      <EarningsEURO>300</EarningsEURO>
      <EarningsEURO>300</EarningsEURO>
      <EarningsEURO>300</EarningsEURO>
      <anotherVar>Kung Fu Is</anotherVar>
.......................................................

0
 
LVL 2

Expert Comment

by:jeffqyzt
ID: 10783024
True (and I pointed that out in the prefatory comment to my post). However, it does meet the criteria of the Asker's "Select one, the other or both, but never none of them".  It's not perfect, but it may meet their needs, and it is somewhat simpler than the more fully functional method they began with (especially as the number of choices begins to rise above 2; imagine the case above, but now throw in EarningsAUS, EarningsJAP, EarningsSAR, etc. :-)
0
 
LVL 26

Expert Comment

by:rdcpro
ID: 10785795
The problem the OP is running into isn't really a problem with Schema; it's a problem with his data model and how his grammar represents it.  This seems to me to be a classic case of putting data in a tag name.  Data, or metadata, belongs elsewhere (i.e. in an attribute or in element content).

There is a difference between EarningsAUS and EarningsEURO, but not because they're fundamentally different types of content.  I would say the difference should be captured in metadata.  The difficulty that the OP is encountering is typical of the situation where you use the tag name to capture what *should* be metadata.  As the vocabulary grows, maintenance increases, and it becomes more difficult to validate.  For example, how would you (in the general case) get all of the earnings for a given currency?  You'd have to either enumerate every possible tag name, or you'd have to do some sort of string function on the tag name.  Not a good situation.

The data model should be:

<Earnings currency="Euro">

This is far easier to validate and to maintain, because now the allowed values for currency are an enumeration for a single attributeType.  This could be in a component schema, so that adding additional currencies involves adding a single line to the enumeration.

I say that "Euro", "USD", etc are metadata, and shouldn't be part of the vocabulary of the XML.

Regards,
Mike Sharp
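
To make the suggested data model concrete, here is how the asker's third snippet might look after such a refactoring. This is only a sketch: the attribute values USD and EURO are assumed spellings taken from the original element names.

```xml
<someVar>123</someVar>
<!-- One element name; the currency moves from the tag name into an attribute -->
<Earnings currency="USD">300</Earnings>
<Earnings currency="EURO">300</Earnings>
<anotherVar>Kung Fu Is</anotherVar>
```

With this shape, selecting all earnings for one currency is a plain attribute test such as Earnings[@currency='EURO'], with no string handling on tag names.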
0
 
LVL 2

Expert Comment

by:jeffqyzt
ID: 10786250
Agreed, wholeheartedly.  However, the poster seems to be working backward from a fixed XML structure, as opposed to being able to dictate the structure/design of the XML.  By all means, if the structure of the source XML can change, then it should.  

0
 
LVL 26

Expert Comment

by:rdcpro
ID: 10786452
Well, sometimes a bad design needs to be refactored...while they still can.  The problem is only going to get worse.  If the source XML cannot be changed (i.e. a vendor with a mouth, but no ears), then another possibility is to transform the XML into something that can be worked with.  That still involves a bit more maintenance, but not as bad as the alternative.

My experience is that when I hear "This XML cannot be changed", what's happening is simple stonewalling.  As you pointed out, maintaining the existing data model becomes geometrically more difficult as new currencies are added.

Regards,
Mike Sharp
0
 
LVL 15

Expert Comment

by:dualsoul
ID: 10798306
>and I pointed out in the prefatory comment to my post.
Oh...I missed it.  Sorry.


> Well, sometimes a bad design needs to be refactored...while they still can
It always should be refactored; this is what refactoring is for: to improve the design.

I think if the author needs to build a Schema, then the XML structure is not fixed. Otherwise, why would he want the Schema? :) Or maybe it can be converted via XSLT to another, more suitable format. And this may solve the problem.
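
As a rough illustration of the XSLT route, the XSLT 1.0 sketch below copies a document unchanged except that any element named Earnings plus a suffix becomes an Earnings element with a currency attribute. It assumes the source elements are in no namespace and that the suffix is the currency code exactly as it should appear in the attribute; neither assumption comes from the thread.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that has no more specific rule -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- EarningsUSD, EarningsEURO, ... : "Earnings" is 8 characters,
       so require a longer name to leave a plain Earnings element alone -->
  <xsl:template match="*[starts-with(local-name(), 'Earnings')
                         and string-length(local-name()) &gt; 8]">
    <Earnings currency="{substring-after(local-name(), 'Earnings')}">
      <xsl:apply-templates select="node()"/>
    </Earnings>
  </xsl:template>

</xsl:stylesheet>
```

Applied to snippet 3, this turns EarningsUSD and EarningsEURO into two Earnings elements whose currency attributes are USD and EURO, which a simpler schema can then validate.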
0
 
LVL 26

Expert Comment

by:rdcpro
ID: 10801210
A good point!  Unless the goal is to simply validate existing XML.  But if so, I concur with the suggestion of transforming it to another format with XSLT.

Regards,
Mike Sharp
0
 
LVL 3

Author Comment

by:gojoe
ID: 10821370
Thanks for all the suggestions. After careful consideration, here is some feedback:

jeffqyzt:
As pointed out by dualsoul, your suggestion allows multiple occurrences of the same element, which it should not.

Furthermore, your schema has a serious drawback in the fact that the following would validate:
<EarningsEURO>123</EarningsEURO>

Placing any element but the desired documentElement at root level is a definite no-no.


Mike (rdcpro):
I see where you are going with attributes, but it doesn't solve the problem of "one, the other or both but never none of them".

The problem is that one of the two elements, when it occurs, must occur with one, two or three sibling elements in a series of three elements (same problem as the original one).

I didn't post it in the original post since, as stated, the problem is identical, but what I really need is:
<xs:element name="someVar"/>
<xs:choice>
  <xs:element name="EarningsEURO"/>
  <xs:sequence>
     <xs:element name="EarningsUSD"/>
     <xs:choice>
         <xs:element name="sib1ToUSD"/>
         <xs:element name="sib2ToUSD"/>
         <xs:element name="sib3ToUSD"/>
     </xs:choice>
  </xs:sequence>
</xs:choice>

Never mind the names of the elements, they're bogus names, but the fact remains that this is not possible in any easy way with XML Schema.

You also suggest using "eXtreme eXtensibility", but we really need to restrict the XML, and secondly, moving outside XML Schema is not an option since all our incoming XML is validated in XMLMediator, and this does not support anything outside of the standards.

It's not that the design of the schema is going to change on a monthly basis or anything in that neighborhood. I simply wanted to know if I had overlooked some clever feature in XML Schema which would allow me to do what I wanted in an easier way.

No more caring about this problem now. I need to leave the PC and enter the outside world; the weather is so nice (and I'm lucky enough to still be on Easter holiday :). So if you're working or having bad weather near you, cheer up, and remember: somewhere the weather is SO GOOOOD...


Regards,
/GoJoe
0
 
LVL 26

Expert Comment

by:rdcpro
ID: 10826211
Well, my suggestion to refactor the XML, and put the metadata in attribute content allows you to specify identity constraints, which are essentially the part that says you can't have "more than one".  So the enumeration provides the allowable set, and the identity constraint prohibits duplicates.  See:

http://www.w3.org/TR/xmlschema-1/#cIdentity-constraint_Definitions

For an excellent tutorial, go to Roger Costello's site:

http://xfront.com

and download his Schema Tutorial.  The section you want is in PPT deck #2, beginning with slide 65.  

The idea is that you specify a uniqueness scope (<someVar>) by putting the uniqueness constraint in the element definition you want to constrain.  For example:


<xs:element name="someVar">
  <xs:complexType>
    <xs:choice>
      <xs:element name="Earnings" minOccurs="1" maxOccurs="unbounded"><!-- This must be unique within the scope of someVar -->
        <xs:complexType>
          <xs:sequence>
            <xs:element name="EarningsUSD"/>
            <xs:choice>
              <xs:element name="sib1ToUSD"/>
              <xs:element name="sib2ToUSD"/>
              <xs:element name="sib3ToUSD"/>
            </xs:choice>
          </xs:sequence>
          <!-- an attribute declaration takes either ref or name, not both; here: name plus a type -->
          <xs:attribute name="currency" type="currencyType" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:choice>
  </xs:complexType>
  <xs:unique name="UNIQ"><!-- this says that the element Earnings must be uniquely identified by its currency attribute -->
    <xs:selector xpath="prefix:Earnings"/><!-- drop "prefix:" if Earnings is not namespace-qualified -->
    <xs:field xpath="@currency"/>
  </xs:unique>
</xs:element>

With a global simple type (perhaps maintained in a separate file, since this is the one that will change):

<xs:simpleType name="currencyType">
    <xs:restriction base="xs:string">
        <xs:enumeration value="USD"/>
        <xs:enumeration value="EURO"/>
        <xs:enumeration value="YEN"/>
    </xs:restriction>
</xs:simpleType>

Now, the above may not be *exactly* correct, as I just typed it in--I haven't had time to set up a test case--but this should be enough to get you on your way.  I don't think you could do this if the scope of the element names is potentially unlimited.  Using this approach, all you need to do is add new currencies to the enumeration, and *most importantly* it wouldn't break existing applications.

Regards,
Mike Sharp
0
 
LVL 26

Accepted Solution

by:
rdcpro earned 500 total points
ID: 10826260
Oh, and by the way... "those %¤#¤%¤% culprits" did think about this!  ;^)

About the only thing you can't do is specify "if this element contains an <earnings currency="EURO">, then that other element must contain <earnings currency="foo">".   This type of co-occurrence constraint can be specified in Schematron, though.

Regards,
Mike Sharp
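
For readers curious what a Schematron rule looks like, here is a minimal sketch that expresses the asker's original "one, the other or both, but never none, and no duplicates" requirement as assertions. It uses the ISO Schematron namespace and assumes the parent element is called root (the arbitrary name used earlier in the thread) and is in no namespace; both are assumptions, not details from the asker's schema.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- "root" is a hypothetical parent element name -->
    <sch:rule context="root">
      <!-- never none of them -->
      <sch:assert test="EarningsUSD or EarningsEURO">
        At least one of EarningsUSD or EarningsEURO must be present.
      </sch:assert>
      <!-- no repeated element -->
      <sch:assert test="count(EarningsUSD) &lt;= 1 and count(EarningsEURO) &lt;= 1">
        EarningsUSD and EarningsEURO may each appear at most once.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Rules like these are typically run alongside an XML Schema rather than in place of it: the schema checks structure and types, and the assertions cover constraints across elements.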
0