PeterJanousek

LibXml2 doesn't recognize invalid XML

Why does LibXml2 accept as valid a standalone XML document containing an attribute that is declared in an external DTD and whose normalized value differs from the value that would be produced in the absence of that declaration?

Here is the example XML:
<?xml version='1.0' standalone='yes'?>
<!DOCTYPE attributes SYSTEM "../valid/sa.dtd" [
]>

<attributes
token = "b"
nmtoken = " this-gets-normalized "
/>


and here is the DTD:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT attributes EMPTY>
<!ATTLIST attributes
token (a|b|c) "a"
nmtoken NMTOKEN #IMPLIED
>


This XML and DTD originally come from the W3C Conformance Test Suite. The XML declares that it is standalone. According to the XML specification (2.9 Standalone Document Declaration):

Validity constraint: Standalone Document Declaration

The standalone document declaration must have the value "no" if any external markup declarations contain declarations of:

attributes with tokenized types, where the attribute appears in the document with a value such that normalization will produce a different value from that which would be produced in the absence of the declaration

According to the statement above, the XML I posted must have standalone='no', because there is an external declaration of the nmtoken attribute, which is of type NMTOKEN. Because of the NMTOKEN type, the normalized value of the nmtoken attribute has its leading and trailing space characters removed, so the normalized value would be "this-gets-normalized". Without the nmtoken declaration in the external DTD, however, the normalized value would be " this-gets-normalized ", because an undeclared attribute is treated as CDATA and so leading and trailing spaces are not removed. These two normalized values are not the same, so the XML document must have standalone='no' in its XML declaration. Because it does not, the XML should be considered invalid.
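To make the comparison concrete, here is a minimal C++ sketch of the two normalization rules from section 3.3.3 of the XML specification. It is only an illustration, not how LibXml2 implements it: the function names normalize_cdata, normalize_tokenized and violates_standalone are hypothetical, and the sketch ignores character references, entity references and line-ending handling.

```cpp
#include <string>

// Normalization applied to every attribute, declared or not: each literal
// tab, line feed or carriage return is replaced by a single space.
// (Simplification: references and prior line-ending normalization are ignored.)
std::string normalize_cdata(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        out += (c == '\t' || c == '\n' || c == '\r') ? ' ' : c;
    }
    return out;
}

// Additional step for attributes declared with a non-CDATA type such as
// NMTOKEN: drop leading and trailing spaces and collapse runs of spaces
// into a single space.
std::string normalize_tokenized(const std::string& raw) {
    const std::string cdata = normalize_cdata(raw);
    std::string out;
    bool pending_space = false;
    for (char c : cdata) {
        if (c == ' ') {
            // Remember a space only if some token text precedes it.
            pending_space = !out.empty();
        } else {
            if (pending_space) out += ' ';
            pending_space = false;
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: true when an externally declared tokenized attribute
// normalizes differently than it would without its declaration, which is the
// case the validity constraint forbids for standalone='yes'.
bool violates_standalone(const std::string& raw) {
    return normalize_tokenized(raw) != normalize_cdata(raw);
}
```

For the raw value " this-gets-normalized " from the example, normalize_cdata keeps both surrounding spaces while normalize_tokenized removes them, so violates_standalone returns true. For token = "b" the two results are identical and the constraint is not triggered.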

The problem is that LibXml2 validates the XML without any error. I'm using LibXml2 version 2.9.1 on Windows with C++.

Have I misunderstood something, or is there a bug in LibXml2?
ASKER CERTIFIED SOLUTION
Gertone (Geert Bormans)