SAbboushi asked:

How could there be confusion in a schema document re: schema language vs. author elements?

Can someone please help me see what I'm missing: the reason cited for prefixing schema elements makes no sense to me:
Each of the elements in the schema has a prefix xsd: which is associated with the XML Schema namespace through the declaration, xmlns:xsd="http://www.w3.org/2001/XMLSchema", that appears in the schema element. The prefix xsd: is used by convention to denote the XML Schema namespace, although any prefix can be used. The same prefix, and hence the same association, also appears on the names of built-in simple types, e.g. xsd:string. The purpose of the association is to identify the elements and simple types as belonging to the vocabulary of the XML Schema language rather than the vocabulary of the schema author.

I am unable to conceive of a valid schema document that can use non-XML Schema language vocabulary for element names and attribute names. e.g. <elemment naem="xyz" type="xsd:string"/>
And I don't believe prefixing them will make the declaration any more valid:
<myNamespace:elemment naem="xyz" type="xsd:string"/>

Another way to look at this is that element names used within an XML Schema document strike me as being syntactical in nature: my schema is invalid if I use an element named "elemment" or an attribute named "naem"

The only scenario I can devise to (almost) make sense of the W3 statement above is if an author were to create an XML document instance (i.e. not a Schema document) with a root element <schema>... which would seem to me a dubious practice...

I'm reminded of what Geert Bormans recently said to me about a related post:
[It] does not add real information, so it is just syntactic overhead

Doesn't that apply here?  Are there scenarios whereby one can use non-Schema component element names as element names within an XML Schema document?

Gertone (Geert Bormans)

This is actually a very interesting question

Let us make one assumption first. Although the Namespaces recommendation allows the default namespace to be redeclared (by having a different xmlns="..." at different levels), this is not a sensible thing to do. It is recommended that only one default namespace is used in an XML document, preferably bound at the top level and well chosen.

"Syntactic overhead" is right for the string value of the prefix (we don't care what it is, but we recommend that you don't choose a confusing prefix). But often the prefix cannot be abandoned without breaking the above assumption. So basically, you would often need a prefix, but an XML processor should not care what the actual prefix looks like. (This takes away some of the strictness of "syntactic overhead".)

A validator does not really use the schema for schemas in the validation process; it simply knows the rules for a valid schema. But in order to know that it is processing a schema, the elements need to be bound to the correct namespace.

But in line with my statement about "syntactic overhead"... we don't necessarily need a prefix for that binding.

So, let us find out when it would be recommended to use the xsd: or xs: prefix (both are used actually)

If you had a schema with no target namespace and no annotations of a certain type, you can drop the prefix and many will do so (I would in some cases)
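
As a minimal sketch of that case (the element name "note" is a made-up example, not from the original post), the XML Schema namespace can be bound as the default namespace, so that no prefix is needed on the schema elements or on the built-in type:

```xml
<!-- No targetNamespace; the default namespace is the XML Schema namespace. -->
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <!-- "element" and the QName "string" both resolve to the XML Schema
       namespace through the default namespace binding. -->
  <element name="note" type="string"/>
</schema>
```

Note the limit of this layout: an author-defined type in such a schema is in no namespace, so an unprefixed reference like type="my-type" would be looked up in the XML Schema namespace and would not be found. That is the situation in which prefixing the schema elements stops being optional.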

If you have annotations: a construct in a schema can have an xs:annotation. Inside that annotation you can have an xs:documentation, and inside that documentation element, elements from any namespace are allowed.
It is very common for documentation in a schema to be written in the HTML or DocBook namespace. You can then use an XSLT to derive DocBook or HTML documentation from the schema itself (a bit like Javadoc; mainly the larger enterprises work that way, and I have done so on many occasions). Editors such as Oxygen have the documentation generation processes built in.
So this answers the question
Are there scenarios whereby one can use non-Schema component element names as element names within an XML Schema document?
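
A small sketch of what that can look like, assuming XHTML is used for the documentation (the "price" element and the "h" prefix are illustrative choices):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:h="http://www.w3.org/1999/xhtml">
  <xs:element name="price" type="xs:decimal">
    <xs:annotation>
      <xs:documentation>
        <!-- These elements are not in the XML Schema namespace;
             the namespace binding is what tells them apart. -->
        <h:p>The <h:em>net</h:em> price, excluding tax.</h:p>
      </xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
```

Here the schema document legitimately contains element names from two vocabularies, and only the namespace bindings say which is which.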

Now, if you have a targetNamespace ("you" being the "schema author"):
<xs:simpleType name="my-type">...
Note that my-type is then declared in the schema's target namespace.
If you then want to reference it:
<xs:element name="my-element" type="my-type">
this will only work if the targetNamespace of the schema is also the default namespace of the schema document.
If not, you need to bind a prefix to the targetNamespace and reference the type with that prefix:
<xs:element name="my-element" type="ns:my-type">
though you would still have
<xs:simpleType name="my-type">...
(You must never qualify the value of @name, simply because each construct declared in a schema automatically lands in the targetNamespace.)
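
Put together as a complete document, the prefixed variant might look like the sketch below. The namespace URI http://example.com/ns and the string restriction are placeholders chosen for illustration only:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ns="http://example.com/ns"
           targetNamespace="http://example.com/ns">

  <!-- @name is never qualified: the type lands in the targetNamespace. -->
  <xs:simpleType name="my-type">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <!-- @type is a QName, so the reference needs the prefix bound
       to the targetNamespace. -->
  <xs:element name="my-element" type="ns:my-type"/>
</xs:schema>
```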

Now you have a choice to make, prefix the schema namespace, prefix the targetNamespace or both
I almost always prefix the schema namespace with xs:; sometimes I prefix the targetNamespace (if I have a bigger schema composed from multiple namespaces), and sometimes I make the targetNamespace the default.
But all of that is a choice
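
To make the choice concrete, here are two sketches of the same schema (the URI http://example.com/ns and the names are placeholders). In the first, only the schema namespace is prefixed and the targetNamespace is the default:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://example.com/ns"
           targetNamespace="http://example.com/ns">
  <xs:simpleType name="my-type">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <!-- Unprefixed "my-type" resolves to the default (target) namespace. -->
  <xs:element name="my-element" type="my-type"/>
</xs:schema>
```

In the second, only the targetNamespace is prefixed and the schema namespace is the default:

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:ns="http://example.com/ns"
        targetNamespace="http://example.com/ns">
  <simpleType name="my-type">
    <!-- Unprefixed "string" resolves to the XML Schema namespace. -->
    <restriction base="string"/>
  </simpleType>
  <element name="my-element" type="ns:my-type"/>
</schema>
```

Both describe the same components; they differ only in which vocabulary gets the prefix.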

I do quite a bit of advanced schema work
Over 80% of my schemata have a xs: prefix for the schema namespace
And the ones I publish to a wider audience... 99%,
simply because there is a difference between XML subtlety and common practice
(a lot of schema users would not recognize an XML schema if it did not have the xs: or xsd: prefix.
I think in one of my earlier texts I must have mentioned the difference between technical sense and human expectations. You would not believe how hard it is in my XML or schema classes to make people understand that the use of "xs:" does not necessarily mean something is in the schema namespace.)
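
A deliberately perverse sketch of that last point (do not do this in practice; the "foo" prefix and the URI http://example.com/not-xml-schema are invented for the illustration):

```xml
<foo:schema xmlns:foo="http://www.w3.org/2001/XMLSchema"
            xmlns:xs="http://example.com/not-xml-schema">
  <!-- foo:element IS an XML Schema element declaration. -->
  <foo:element name="note" type="foo:string">
    <foo:annotation>
      <foo:documentation>
        <!-- xs:element is NOT in the XML Schema namespace here. -->
        <xs:element>Just documentation markup.</xs:element>
      </foo:documentation>
    </foo:annotation>
  </foo:element>
</foo:schema>
```

A namespace-aware processor looks only at the namespace name a prefix is bound to, never at the prefix string itself.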
SAbboushi (ASKER)

Thanks - your posts teach me new things!

Maybe you could comment on my question re: whether the reason cited for prefixing schema elements has merit:
The purpose of the association is to identify the elements and simple types as belonging to the vocabulary of the XML Schema language rather than the vocabulary of the schema author.

The example you gave, where element names within a schema could be non-XML Schema component names (or actually COULD be XML Schema component names, but are not validated as part of the schema syntax), is within the XML Schema documentation element. It doesn't seem to me that an XML schema processor would see the content of such a documentation element as containing actual XML Schema elements (I see the content of a documentation element as being a value as opposed to an XML Schema element).

My point is that I believe there is a fixed vocabulary within the XML Schema namespace resulting in a specific syntax for authoring a schema document.  It seems to me that this precludes the use of any element names other than e.g. element, attribute, sequence, complexType, etc... which are not declared within that fixed vocabulary... and I believe in order for the schema document to be valid, these names can ONLY come from the XML Schema namespace...

So within a valid schema document, <element ....> or <attribute ...> or <sequence ...> etc... (with the exception of being within an annotation as you pointed out) must always be referring to the XML Schema element named "element", "attribute", etc... isn't that so?  I cannot conceive of a case where it could belong to a different namespace and still be a valid schema document, hence the redundancy of the prefix in xs:element... unless I'm missing something.

But maybe I'm still misunderstanding something, because here's another link which seems to argue the same point:
When developing schemas that are not associated with a target namespace, you should always explicitly qualify schema elements (like xs:element) to keep them from being confused with global declarations for your application.

So please someone, show me a schema where one could be confused as suggested by these statements...?  By confused, I mean that a knowledgeable person such as Geert would have disambiguation problems with the schema document, not a noob re: the difference between technical sense and human expectations.

(you must never qualify the value of @name, simply because each construct declared in a schema, automatically lands in the targetNamespace
Thanks - that wasn't clear to me before.  As a point of clarification, I'm unclear on whether local element and attribute declarations land in the targetNamespace.  My suspicion is they do not?

Bottom line: it seems to me that only datatypes other than the XML Schema built-in datatypes (and not elements) need to be disambiguated from the XML Schema namespace

er... maybe you could clarify for me what you mean...? re:
the use of "xs:" does not necessarily means it is in the schema namespace
About the comment part on
the purpose of the association
In a sense that is correct. It is to make clear which vocabulary a component belongs to. I tried to show when exactly that is an issue and when it is not.
About documentation being a value as opposed to a schema component:
xs:documentation is a schema component. You can consider its content as a value, but it is still XML, so an XML processor needs to know that this value is to be considered in a different namespace.
The schema working group made an effort to make a schema a document that complies with XML rules. XML elements can have semantics if they are associated with a namespace that implies the semantics. There is a mechanism for associating a namespace: a binding to a prefix or a binding to the default namespace. So that mechanism is used.
I think I see what you are having issues with. Your point is that using the binding mechanism is overhead, because if you tell a validator that something is a schema, the validator should not care about the namespace at all. But that simply is not how XML works.
A browser is taught to still understand h1 even when the elements are not bound to a namespace (and you might know all the quirks that result from that).
But for XML it was agreed that the binding would always be explicit. Applications other than a validator could operate on a schema and would find it useful to know which elements are the schema elements. Examples are an XSLT that transforms a parameterized schema into one that can be used for validation, or a process that pulls documentation out of a schema.
XML has a strict nature, more related to the strictness of a programming language than to the flexibility of HTML.
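
As a rough sketch of the second kind of application (a process that pulls documentation out of a schema), the XSLT 1.0 stylesheet below lists every named element declaration with its documentation text. It is an illustration only, not a production documentation generator:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Matches on the namespace bound to xs here; the prefix used in
         the schema document being processed does not matter. -->
    <xsl:for-each select="//xs:element[@name]">
      <xsl:value-of select="@name"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="normalize-space(xs:annotation/xs:documentation)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet can only find the declarations because they are bound to the XML Schema namespace. An element named "element" in any other namespace, for example inside a documentation block, is ignored.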
ASKER CERTIFIED SOLUTION
Gertone (Geert Bormans)

[This solution is only available to members.]

Thanks again for all the time and effort you've put into your responses.  With Regards
Samir