transform to document, single byte

zc2
I use MSXML from C++ code to transform XML with XSLT into XHTML. To do that, I create an IXSLProcessor from an IXSLTemplate and then call its transform() method, passing an empty IXMLDOMDocument object as the output. (I want the output written into an IXMLDOMDocument because I need to do some additional manipulation of its nodes.)
That works fine if the XSLT has an <xsl:output method="xml" encoding="UTF-8"/> instruction.
The C++ code can be compiled in two versions: single-byte encoding (windows-1252) and Unicode. For the single-byte case the XSLT has <xsl:output method="xml" encoding="windows-1252"/>.
If the XML or XSLT contains a non-ASCII character (code greater than 127), the transform() method fails: it either returns E_FAIL or leaves the output document empty.
Is it possible to set up the output document to accept a specific single-byte encoding (windows-1252)?
ste5an, Senior Developer

Commented:
It's hard to tell without code, and it's not clear what your use case is.

If I had to guess, it has to do with encoding detection on the input. So check the encoding of the input file: whether it has a BOM, whether the XML itself declares an encoding, and whether the declared encoding is the correct one.

So take your existing input file, convert it to UTF-8, add a BOM and the encoding attribute. Test it.
Do the same for your code page: convert the file, add the appropriate encoding attribute, and save it using that encoding. Test it.

E.g. use Notepad++ for manipulating the encoding of the file and storing it accordingly.
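
The checks described above (is there a BOM, does the file declare an encoding) can also be done programmatically rather than in an editor. Below is a minimal C++ sketch, assuming the file content has already been read into a std::string and uses an ASCII-compatible encoding (so not UTF-16). The names EncodingHint and inspect_xml_prefix are hypothetical, made up for this illustration; they are not part of MSXML.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for this illustration (not an MSXML type).
struct EncodingHint {
    bool has_utf8_bom;     // true if the data starts with EF BB BF
    std::string declared;  // value of encoding="..." in the XML declaration, "" if absent
};

// Looks only at the start of the data: an optional UTF-8 BOM followed by an
// optional <?xml ... ?> declaration. Assumes an ASCII-compatible encoding.
EncodingHint inspect_xml_prefix(const std::string& bytes) {
    EncodingHint hint{false, ""};
    std::size_t pos = 0;

    if (bytes.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        hint.has_utf8_bom = true;
        pos = 3;
    }

    // No XML declaration at all: nothing is declared.
    if (bytes.compare(pos, 5, "<?xml") != 0) return hint;

    const std::size_t end = bytes.find("?>", pos);
    if (end == std::string::npos) return hint;

    // Find the encoding pseudo-attribute inside the declaration.
    const std::size_t enc = bytes.find("encoding", pos);
    if (enc == std::string::npos || enc >= end) return hint;

    // The value may be quoted with either " or '.
    const std::size_t open = bytes.find_first_of("\"'", enc);
    if (open == std::string::npos || open >= end) return hint;
    const std::size_t close = bytes.find(bytes[open], open + 1);
    if (close == std::string::npos || close >= end) return hint;

    hint.declared = bytes.substr(open + 1, close - open - 1);
    return hint;
}
```

A result with no BOM and an empty declared value means the file gives a parser nothing to go on, so it will be read as UTF-8 by default.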

Author

Commented:
All input files are single-byte: no UTF-8, no BOMs, etc.
XML:
<inset include="8y46bc"/>


XSL:
<?xml version="1.0" encoding="windows-1252"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="windows-1252" omit-xml-declaration="yes"/>

<xsl:template match="/inset">
    <!-- template body -->
</xsl:template>
</xsl:stylesheet>


Eduard Ghergu, Architect - Coder - Mentor

Commented:
Hi,

<inset include="8y46bc"/>
Is this the entire content of the XML file?

ste5an, Senior Developer

Commented:
If I had to guess, it has to do with encoding detection on the input.
<inset include="8y46bc"/>


When the file does not declare an encoding through the encoding attribute of the XML declaration, a single-byte file is indistinguishable from a UTF-8 file. You need to test this, as I already wrote:

Change your input file to:
<?xml version="1.0" encoding="windows-1252"?>
<inset include="8y46bc"/>


Author

Commented:
This is all the content of the xml file?
In the test sample I am working with - yes. In production there would be data.
You need to test this, as I already wrote:
I will try that.
I think the problem is not with the input but with the output. Since the XSL has the omit-xml-declaration="yes" attribute, the output carries no declaration, so UTF-8 is assumed.
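
To illustrate why that assumption breaks on non-ASCII data: under the XML 1.0 rules, a byte stream with no BOM and no encoding declaration is read as UTF-8, and a lone windows-1252 byte above 127 is not a well-formed UTF-8 sequence. The C++ sketch below is a generic UTF-8 well-formedness check, not MSXML code; is_valid_utf8 is a hypothetical name used only here. Whether MSXML follows exactly this path when the processor output is a DOM document is an assumption, though it is consistent with the behaviour reported in this thread.

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is well-formed UTF-8. Rejects stray continuation
// bytes, truncated sequences, overlong forms, surrogates, and values above U+10FFFF.
bool is_valid_utf8(const std::string& bytes) {
    static const unsigned long min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    const std::size_t n = bytes.size();
    std::size_t i = 0;

    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        unsigned long cp = 0;   // decoded code point

        if (c < 0x80)                { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;  // continuation byte in lead position, or invalid lead byte

        if (extra > n - i - 1) return false;  // sequence runs past the end

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(bytes[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min_cp[extra]) return false;               // overlong encoding
        if (cp > 0x10FFFF) return false;                    // outside Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     // surrogate

        i += extra + 1;
    }
    return true;
}
```

For example, the word "café" stored as windows-1252 ends in the single byte 0xE9. That byte looks like the start of a three-byte UTF-8 sequence with nothing after it, so the check fails, whereas the UTF-8 form (0xC3 0xA9) passes.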
ste5an, Senior Developer

Commented:
Well, what about posting a concise and complete example? If it's a Visual Studio project, attaching it would help. Include your test harness.
Eduard Ghergu, Architect - Coder - Mentor

Commented:
Hi,

Please, check this: https://code-examples.net/en/q/282df

Author

Commented:
Please, check this
I don't see how that is relevant.
My question is simple: is it possible to tell the MSXML document object to expect single-byte input when loading, rather than UTF-8 (other than by adding the XML declaration)?

Author

Commented:
ste5an,
As I expected, adding the XML declaration to the input XML file does not change a thing.
Only removing the omit-xml-declaration="yes" attribute from the XSLT makes the output DOMDocument load it correctly.
But the problem is that I don't want the XML declaration in the output.
Eduard Ghergu, Architect - Coder - Mentor

Commented:
Hi,

The load method will do just what its name says. What you can do is modify the XML document properties to fit your scenario.

Author

Commented:
modify the XML document properties
That's exactly my question. How do I do that?
OK, since no better way was suggested, I've removed the omit-xml-declaration="yes" attribute. Now I have to deal with the XML declaration in the output file somehow.
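
One possible way to deal with the declaration is to remove it from the serialized text after the document has been saved to a string. The C++ sketch below is only an illustration under that assumption; strip_xml_declaration is a hypothetical helper, not an MSXML function. It works on the raw bytes, so it behaves the same for windows-1252 and UTF-8 output.

```cpp
#include <cstddef>
#include <string>

// Removes a leading <?xml ... ?> declaration (plus an optional UTF-8 BOM in
// front of it and the line break after it). Input without a declaration is
// returned unchanged.
std::string strip_xml_declaration(const std::string& xml) {
    std::size_t start = 0;
    if (xml.compare(0, 3, "\xEF\xBB\xBF") == 0) start = 3;  // skip UTF-8 BOM

    // The declaration, if present, must be the very first thing in the document.
    if (xml.compare(start, 5, "<?xml") != 0) return xml;
    if (start + 5 >= xml.size()) return xml;

    // "<?xml" must be followed by whitespace; otherwise it is some other
    // processing instruction such as <?xml-stylesheet ...?>.
    const char next = xml[start + 5];
    if (next != ' ' && next != '\t' && next != '\r' && next != '\n') return xml;

    const std::size_t end = xml.find("?>", start + 5);
    if (end == std::string::npos) return xml;

    // Drop the line break(s) that directly follow the declaration.
    std::size_t rest = end + 2;
    while (rest < xml.size() && (xml[rest] == '\r' || xml[rest] == '\n')) ++rest;

    return xml.substr(rest);
}
```

Note that once the declaration is gone, whatever reads the file later has to learn the windows-1252 encoding by some other means, which is the same ambiguity discussed earlier in the thread. An alternative that may work inside MSXML is to remove the declaration node from the DOM before saving, since the declaration is normally exposed there as a processing-instruction node named xml; that route is not tested here.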