LibXml2 doesn't recognize invalid XML

Why does LibXml2 accept as valid a standalone XML document containing attributes that are declared in an external DTD and whose normalized value differs from the value that would be produced in the absence of their declaration?

Here is the example XML:
<?xml version='1.0' standalone='yes'?>
<!DOCTYPE attributes SYSTEM "../valid/sa.dtd" [

token = "b"
nmtoken = " this-gets-normalized "


and here is the DTD:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT attributes EMPTY>
<!ATTLIST attributes
token (a|b|c) "a"


This XML and DTD originally come from the W3C Conformance Test Suite. The XML declares itself standalone. According to the XML specification (2.9 Standalone Document Declaration):

Validity constraint: Standalone Document Declaration

The standalone document declaration must have the value "no" if any external markup declarations contain declarations of:

attributes with tokenized types, where the attribute appears in the document with a value such that normalization will produce a different value from that which would be produced in the absence of the declaration

According to the statement above, the XML I posted must have standalone='no', because the external DTD declares the nmtoken attribute with type NMTOKEN. Because of that type, the normalized value of the nmtoken attribute has its leading and trailing spaces removed, giving "this-gets-normalized". Without the declaration in the external DTD, the attribute would be treated as CDATA, no leading or trailing spaces would be removed, and the normalized value would be " this-gets-normalized ". These two normalized values are not the same, so the document must have standalone='no' in its XML declaration. Since it declares standalone='yes' instead, the XML should be considered invalid.
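To make the difference concrete, here is a minimal C++ sketch of the two normalization outcomes described above. It is not libxml2 code: the function names (normalizeAsCdata, normalizeAsTokenized, violatesStandalone) are hypothetical, and the sketch assumes the raw attribute value contains no character or entity references, which real normalization would also have to expand.

```cpp
#include <string>

// CDATA-style normalization (what an undeclared attribute gets):
// every whitespace character becomes a single space, nothing is trimmed.
// Assumption: the value contains no character or entity references.
std::string normalizeAsCdata(const std::string& value) {
    std::string out = value;
    for (char& c : out) {
        if (c == '\t' || c == '\n' || c == '\r') c = ' ';
    }
    return out;
}

// Normalization for tokenized types such as NMTOKEN: start from the CDATA
// result, drop leading and trailing spaces, and collapse runs of spaces.
std::string normalizeAsTokenized(const std::string& value) {
    const std::string cdata = normalizeAsCdata(value);
    std::string out;
    bool pendingSpace = false;
    for (char c : cdata) {
        if (c == ' ') {
            // A space only counts once a non-space character has been seen.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace) out += ' ';
            pendingSpace = false;
            out += c;
        }
    }
    return out;
}

// True when an externally declared tokenized attribute would normalize
// differently than it would without its declaration, i.e. the case in
// which the document may not claim standalone='yes'.
bool violatesStandalone(const std::string& rawValue) {
    return normalizeAsCdata(rawValue) != normalizeAsTokenized(rawValue);
}
```

For the value from the example, violatesStandalone(" this-gets-normalized ") returns true, while a value such as "b" normalizes identically either way and returns false.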

The problem is that libxml2 validates the XML without any error. I'm using LibXml2 version 2.9.1 on Windows with C++.

Did I misunderstand something, or is there a bug in LibXml2?
Geert Bormans (Information Architect) commented:
I just tested with libxml, and you are right: this is a bug in libxml.
By the way, libxml is not the only parser that gets this wrong.

In short, standalone='yes' tells the parser
"check whether leaving out the DTD would change the document one way or another".
Some parsers understand it as
"don't bother looking at the DTD at all".

You have entered a grey area of DTD validation.
Note that msxml does this right,
so if you are working in C++ and this feature is important, you could use msxml instead.
Question has a verified solution.
