XSL Changing encoding on XML!!

Hi all,
I applied a little XSL to an XML file (using VBScript) but it changed the encoding from UTF-4 to UTF-16.
How can I prevent this, or is there any way I can change it back on the fly afterwards?

What I'm doing:-
Creating an XML parser using
Set xDoc = CreateObject("Msxml2.DOMDocument.3.0")

Then I applied the following XSL using:
FormattedXML = xDoc.transformNode(stylesheet)

XSL stylesheet attached.

Many thanks in advance for anyone who wants to help me :)
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:strip-space elements="*" />
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>


paddykoolAsked:

manjunathubCommented:
Instead of <?xml version="1.0"?>
put this in its place: <?xml version="1.0" encoding="utf-8"?>
0
Gertone (Geert Bormans)Information ArchitectCommented:
you can specify the encoding you want in the xsl:output element
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
msxml mostly defaults to UTF-16

I have never heard of UTF-4, maybe you mean UCS-4, you should use encoding="UTF-32" then

Only the encodings UTF-8 and UTF-16 have to be supported by processors,
so it is by the courtesy of the processor that other encodings are supported.
For that reason it is wise to pick a more modern transformer, e.g. Msxml2.DOMDocument.4.0 or Msxml2.DOMDocument.6.0 (make sure it is installed).

And if this script is running in the browser, it all depends on how the transformation result from the XSLT is serialised.
A lot depends on how the post-processing is organised; sometimes the serialisation settings are simply ignored.

0
Gertone (Geert Bormans)Information ArchitectCommented:
The advice of manjunathub is incorrect:
that only has an impact on the encoding of the stylesheet itself,
and does not affect the encoding of the output of the transformation.

On top of that, both declarations are completely equivalent, since UTF-8 is the default encoding of any XML document that has no encoding specified in its XML declaration.
0

paddykoolAuthor Commented:
Hi Gertone & manjunathub

Firstly, yes utf-4 is a typo. I meant utf-8

Ok, manjunathub's method didn't work.

I then tried adding  encoding to the xsl:output
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
But it still produces XML with the top line being:-
<?xml version="1.0" encoding="UTF-16"?>

I'm completely new to XML so all help is appreciated here.

Is there any MSXML function I can use to remove / fudge the encoding value?!?
0
manjunathubCommented:
0
paddykoolAuthor Commented:
Nuts!!!
I can't reach that at work :(
0
Gertone (Geert Bormans)Information ArchitectCommented:
Don't worry, the post essentially says the same as my suggestion a bit earlier: "change the encoding in the xsl:output".

Now, if that doesn't help, you will need to tell us more about how you do the transform.
Where is the VBScript?
In IIS/ASP?
Which version?

Note that ASP tends to change the encoding unless you output the transform to a stream rather than a buffer,
but I am pretty ASP-unaware.
0
paddykoolAuthor Commented:
Cheers for getting back to me
Ok, I do the transform in a .vbs file which uses the WSH (I think). Basically I have a testing tool (HP QTP) and I call a load of VBS functions stored in .vbs files from it.
I'm using the testing tool to drive an app and compare the XML produced. I need to format the XML using the XSL before I do the compare, but applying the XSL gives an error of something like - "UTF-8 encoding found in XML but header has UTF-16" (sorry for not being specific)
0
Hans LangerCommented:
You can hide the XML declaration from the result, with this:

<xsl:output omit-xml-declaration="yes" encoding="UTF-8"/>

so, the result will not show the "xml tag" --->  <?xml version="1.0"?>
0

paddykoolAuthor Commented:
Ok, I'll give it a go.
One question, does it matter the order I put the attributes after xsl:output?!?
0
Hans LangerCommented:
No, it should not matter.

Unless you need to get the attribute by its index position.
0
paddykoolAuthor Commented:
Hi all again
Ok, omit-xml-declaration="yes" does strip out the processing instruction node <?xml version . . . .?>
but this is really a little hack, I kinda want to keep it in and produce encoding UTF-8.

Thanks for all your help so far on this.

Anyone got any idea why <xsl:output method="xml" indent="yes" encoding="UTF-8"/> does not produce XML with encoding="UTF-8"?> at the top?!?
0
Gertone (Geert Bormans)Information ArchitectCommented:
As I said before... sometimes these settings are ignored.
Maybe I should explain what I meant with that, or why that is
xsl:output tells the XSLT processor how to serialise the output tree into a unicode text document
But some applications simply take the output tree directly, skipping the serialisation phase.
That means that they have no possibility of keeping the XSLT serialisation settings and just use their own
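
With MSXML this is a likely explanation for what you see: transformNode hands the result back as a string, and strings in VBScript are UTF-16, so the declaration says UTF-16 regardless of the encoding on xsl:output. A minimal sketch of one way around it, assuming a standalone .vbs run under WSH, is to let MSXML serialise into a binary stream with transformNodeToObject instead. The file names below are hypothetical, and I have not run this inside QTP (where WScript is not available):

```vbscript
Option Explicit

' Hypothetical file names, adjust to your own paths.
Const SOURCE_XML = "input.xml"
Const STYLESHEET = "format.xsl"     ' contains <xsl:output ... encoding="UTF-8"/>
Const RESULT_XML = "formatted.xml"

Const adTypeBinary = 1              ' ADODB.Stream: binary data
Const adSaveCreateOverWrite = 2     ' ADODB.Stream: overwrite an existing file

Dim xDoc, xsl, stream

' Load the source document synchronously so errors show up immediately.
Set xDoc = CreateObject("Msxml2.DOMDocument.3.0")
xDoc.async = False
If Not xDoc.load(SOURCE_XML) Then
    WScript.Echo "Could not load source: " & xDoc.parseError.reason
    WScript.Quit 1
End If

' Load the stylesheet the same way.
Set xsl = CreateObject("Msxml2.DOMDocument.3.0")
xsl.async = False
If Not xsl.load(STYLESHEET) Then
    WScript.Echo "Could not load stylesheet: " & xsl.parseError.reason
    WScript.Quit 1
End If

' Serialise into a binary stream rather than a VBScript string,
' so the processor writes bytes according to xsl:output.
Set stream = CreateObject("ADODB.Stream")
stream.Type = adTypeBinary
stream.Open
xDoc.transformNodeToObject xsl, stream
stream.SaveToFile RESULT_XML, adSaveCreateOverWrite
stream.Close
```

If that works in your setup, the saved file should carry the encoding named in xsl:output, and you can load it again for the compare instead of using the string from transformNode.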

If you would have a pipeline of XSLT in XProc for instance, all serialisation will be ignored up to the very end

Another note I need to make is that specifying UTF-8 as the encoding, or dropping the xml declaration
by setting omit-xml-declaration to yes is essentially the same.
If no declaration is given, the encoding is UTF-8.
I am happy though that you did that test, because you see the encoding disappear, it shows the xsl:output is taken into account,
at least that counts as something

If omitting the declaration solves the problem, please do so.
No one should care that the declaration is absent if you stick to default settings
and it is the final step in the process that controls that anyway

If that does not solve it, we have to carefully split your process into step-by-step actions and see where it goes wrong.

Your message strikes me
"UTF-8 encoding found in XML but header has UTF-16"
That seems to imply that the encoding is UTF-16, despite the declaration saying that it is UTF-8
And that you have another process that feels blocked by that
It sounds as if harm is done prior to the XSLTs
So we might need more detail on the exact process
0
paddykoolAuthor Commented:
Thank you very much for the above comment and apologies for the delay in response.

I'm going to go with "omit-xml-declaration" as this doesn't merit any more investigation (according to my boss!!).
I can live with excluding the top line.

Many thanks
0
Gertone (Geert Bormans)Information ArchitectCommented:
welcome
0