YRKS

XSLT + Generic exception: Invalid character in the given encoding

Hi

I have an XML file that I transform into an intermediate file using the Xsl.XslCompiledTransform.Transform method. I am using ISO-8859-1 encoding.

The file I get after the transform is in the format below:
"<ROOT><hfield1>aa</hfield1><hfield2>0.00</hfield2><hfield3 /><hfield4>TEST016876</hfield4><hfield5 /><hfield6>bbb</hfield6><hfield7 /><Body1 />                                       
                                        <Body2>:TEST016876</Body2>                                       
                                        <Body3>testing - call type</Body3>                                       
                                        <Body4 />                                       
                                        <Body5>107/01/10 13:47:56</Body5>                                       
                                        <Body6>1</Body6>                                       
                                        <Body7>:DOWN</Body7>                                       
                                        <Body8>01/10/07 13:47:56 hdsup xxx||Info Notify to:cvfrgtdccccccccccccc  Resolution Time= 01/10/07 17:48:00</Body8>                           
                                             
                                             
                                             
                                             
                                             
                                             
                   
       
       
       
       
       
       
      </ROOT>"

All of the empty spaces show up as square characters.

I need to transform the above file with a second XSLT. If I take this message and remove the line breaks and blank spaces, my second transform runs fine. If I run it on the unchanged message, I get the message "Generic exception: Invalid character in the given encoding. Line 1 position 155."
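Since the padding shows up as squares rather than blanks, it may help to look at the actual code points near the reported position before trying to strip anything. The following is a minimal sketch, not part of the original code: `DescribeAround` and the sample string are hypothetical, and it assumes the reported position counts characters from 1, which for ISO-8859-1 coincides with the byte offset.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module CharInspector
    ' Returns one line per character in a window around a 1-based position,
    ' showing the position and the UTF-16 code unit in hex.
    Function DescribeAround(ByVal text As String, ByVal position As Integer, ByVal radius As Integer) As String
        Dim first As Integer = Math.Max(0, position - 1 - radius)
        Dim last As Integer = Math.Min(text.Length - 1, position - 1 + radius)
        Dim sb As New StringBuilder()
        For i As Integer = first To last
            sb.AppendLine(String.Format("{0}: U+{1:X4}", i + 1, AscW(text(i))))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Hypothetical sample: a non-breaking space (U+00A0) sits between two tags.
        Dim sample As String = "<a />" & ChrW(&HA0) & "<b />"
        Console.Write(DescribeAround(sample, 6, 1))
    End Sub
End Module
```

Calling it with the intermediate string and position 155 would show whether the "spaces" are ordinary U+0020 characters or something else.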

Could someone help, please?

Thanks
YRKS
Bob Learned

Hmmm...I wonder if you could just set OutputSettings.CheckCharacters = False?

Bob
YRKS

ASKER

        Dim buffer As Byte() = Encoding.GetEncoding("ISO-8859-1").GetBytes(umlText)
        Using input As New MemoryStream(buffer)
            Dim reader As New XmlTextReader(input)
            Dim transform As New Xsl.XslCompiledTransform

            Dim xslSettings As New Xsl.XsltSettings()
            xslSettings.EnableScript = True
            transform.Load(xslFileName, xslSettings, New XmlUrlResolver())
            Using output As New MemoryStream()
                Dim writer As New XmlTextWriter(output, Encoding.GetEncoding("ISO-8859-1"))
                transform.Transform(reader, writer)
                Return Encoding.GetEncoding("iso-8859-1").GetString(output.ToArray())
            End Using
        End Using

I am not sure how to associate the following two lines with the XmlTextWriter,
            Dim writerSettings As New XmlWriterSettings
            writerSettings.CheckCharacters = False
i.e. how to pass writerSettings to the writer in the above code.

The XmlTextWriter does not take an XmlWriterSettings as a parameter; it takes an argument list as a second parameter.
Thanks for all your help.
YRKS
transform.OutputSettings.CheckCharacters = False.

Bob
YRKS

ASKER

With
transform.OutputSettings.CheckCharacters = False
I get "XmlWriterSettings.CheckCharacters property is read only and cannot be set."
That's funny, because this works for me with my WinForms 2.0 application:

    Dim transform As New XslCompiledTransform()
    transform.OutputSettings.CheckCharacters = False

Bob
YRKS

ASKER

The project where I am testing this is also a Windows Forms application created in VS 2005.

I must be doing something differently from you. I noticed that you did not need the line
etTransform.OutputSettings.CheckCharacters = True
to get it working on your machine. Are you just importing the namespaces, or are you also adding a reference to some DLL?
I am just importing the namespaces. Thanks for all your help.
YRKS
I only added an Imports statement at the top of the module.

Bob
YRKS

ASKER

Dim transform As New XslCompiledTransform()
    transform.OutputSettings.CheckCharacters = False

When I run the first line, Dim transform As New XslCompiledTransform(), I can see my transform object being created. If I look at its OutputSettings property, the value is Nothing and the type is System.Xml.XmlWriterSettings.
The other property I see on the transform object is TemporaryFiles.
If I try to run the next statement, I get "NullReferenceException was unhandled: Object reference not set to an instance of an object."

If I omit this statement and go on to transform.Load(..), I can see the OutputSettings values filled in appropriately.
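Both symptoms are consistent with how XslCompiledTransform exposes its settings: OutputSettings is Nothing until Load has run, and the instance it returns afterwards is read-only. A common way around this is to clone the settings after Load and edit the copy. The sketch below assumes that approach; `TransformRelaxed` and the inline identity stylesheet are hypothetical and only there to make the example self-contained.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Xsl

Module OutputSettingsSketch
    ' Transforms XML text with an already loaded stylesheet, using an editable
    ' copy of the stylesheet's output settings with character checking off.
    Function TransformRelaxed(ByVal transform As XslCompiledTransform, ByVal inputXml As String) As String
        ' Clone() returns a writable copy; the original instance stays read-only.
        Dim settings As XmlWriterSettings = transform.OutputSettings.Clone()
        settings.CheckCharacters = False
        Dim sb As New StringBuilder()
        Using reader As XmlReader = XmlReader.Create(New StringReader(inputXml))
            Using writer As XmlWriter = XmlWriter.Create(sb, settings)
                transform.Transform(reader, writer)
            End Using
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        ' Hypothetical identity stylesheet, used only for demonstration.
        Dim xslt As String = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
            "<xsl:output method='xml' omit-xml-declaration='yes'/>" & _
            "<xsl:template match='@*|node()'><xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy></xsl:template>" & _
            "</xsl:stylesheet>"
        Dim transform As New XslCompiledTransform()
        Using xsltReader As XmlReader = XmlReader.Create(New StringReader(xslt))
            transform.Load(xsltReader) ' OutputSettings is populated from here on
        End Using
        Console.WriteLine(TransformRelaxed(transform, "<ROOT><a>1</a></ROOT>"))
    End Sub
End Module
```

Note that turning CheckCharacters off only silences the writer's validation; it does not remove whatever character is causing the trouble.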
Thanks in advance for all your help.
YRKS
YRKS

ASKER

Just a quick question: could you run the WinForms 2.0 application? It compiles fine; it just gives that error at runtime. I even tried creating a new XmlWriterSettings object, assigning False to its CheckCharacters property, and then assigning this object to transform.OutputSettings (transform.OutputSettings = MyXMlWriter). I could not get it to work.
YRKS
Try this (untested):


' Requires Imports System.Xml and Imports System.Xml.Xsl.
Private Sub TransformXml(ByVal xsltFile As String, ByVal inputFile As String, ByVal outputFile As String)
    Dim transform As New XslCompiledTransform()
    transform.Load(xsltFile)
    Dim settings As New XmlWriterSettings()
    settings.CheckCharacters = False
    ' Dispose the writer so the output is flushed and the file is closed.
    Using writer As XmlWriter = XmlWriter.Create(outputFile, settings)
        Using reader As XmlReader = XmlReader.Create(inputFile)
            transform.Transform(reader, Nothing, writer, New XmlUrlResolver())
        End Using
    End Using
End Sub

YRKS

ASKER

It does work, but the underlying problem is still not solved.

I still get the original error, except that instead of line 1 position 155 I now get line 1 position 181.

I tried using Replace on the string after the first transform to replace various (in fact, all) of the ControlChars, and I still get the same error at the same line and position.
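The ControlChars class only names a handful of characters (NullChar, Tab, Cr, Lf and a few others), so a Replace over those constants leaves most other code points untouched. If string cleanup is used at all, one option is to filter against the XML 1.0 character ranges instead. This is a rough sketch under that assumption; `IsAllowedInXml` and `StripInvalidXmlChars` are hypothetical helpers, and the filter works on UTF-16 code units, so it does not detect unpaired surrogates.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module XmlCharFilter
    ' True when the UTF-16 code unit falls inside the XML 1.0 Char ranges:
    ' tab, line feed, carriage return, or U+0020 through U+FFFD.
    ' Surrogates are let through unchecked.
    Function IsAllowedInXml(ByVal c As Char) As Boolean
        Dim code As Integer = AscW(c)
        Return code = 9 OrElse code = 10 OrElse code = 13 OrElse _
               (code >= 32 AndAlso code <= 65533)
    End Function

    ' Returns a copy of the text without the characters rejected above.
    Function StripInvalidXmlChars(ByVal text As String) As String
        Dim sb As New StringBuilder(text.Length)
        For Each c As Char In text
            If IsAllowedInXml(c) Then sb.Append(c)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Hypothetical sample containing U+0000 and U+001F between two tags.
        Dim dirty As String = "<Body1 />" & ChrW(0) & ChrW(31) & "<Body2>x</Body2>"
        Console.WriteLine(StripInvalidXmlChars(dirty))
    End Sub
End Module
```

Characters such as a non-breaking space (U+00A0) are legal XML and pass this filter, so if the error persists after filtering, the cause is more likely an encoding mismatch than an illegal character.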

"<ROOT><hfield1>aa</hfield1><hfield2>0.00</hfield2><hfield3 /><hfield4>TEST016876</hfield4><hfield5 /><hfield6>bbb</hfield6><hfield7 /><Body1 />                                      
                                        <Body2>:TEST016876</Body2>                                      
                                        <Body3>testing - call type</Body3>                                      
                                        <Body4 />                                      
                                        <Body5>107/01/10 13:47:56</Body5>                                      
                                        <Body6>1</Body6>                                      
                                        <Body7>:DOWN</Body7>                                      
                                        <Body8>01/10/07 13:47:56 hdsup xxx||Info Notify to:cvfrgtdccccccccccccc  Resolution Time= 01/10/07 17:48:00</Body8>  

There is some character after the </body1> tag where the error occurs.
These lines don't seem to do the trick:

 Dim settings As New XmlWriterSettings()
    settings.CheckCharacters = False
    Dim writer As XmlWriter = XmlWriter.Create(outputFile, settings)

Thanks a lot for being patient and helping out.
YRKS
Actually, that problem is on the transformation side, and you should be able to do whitespace stripping:

3.4 Whitespace Stripping
http://www.w3.org/TR/xslt#strip

Bob
YRKS

ASKER

Tried <xsl:strip-space elements="*"/>
in both the first and second xslt.
Also tried
            Dim reader As New XmlTextReader(input)
            reader.WhitespaceHandling = WhitespaceHandling.None
YRKS

ASKER

Input string is

pmo,01/10/07 13:47:57,,TEST016876,,HD,,^:TEST016876^testing ^^107/01/10 13:47:56^1^:DOWN^01/10/07 13:47:56 hdsup xxx||Info Notify to:  Resolution Time= 01/10/07 17:48:00|

Transformed string after first transform
"<ROOT><hfield1>pmo</hfield1><hfield2>0.00</hfield2><hfield3></hfield3><hfield4>TEST016876</hfield4><hfield5></hfield5><hfield6>HD</hfield6><hfield7></hfield7><Body1></Body1>                                                       <Body2>:TEST016876</Body2>                                                       <Body3>testing </Body3>                                                       <Body4></Body4>                                                       <Body5>107/01/10 13:47:56</Body5>                                                       <Body6>1</Body6>                                                       <Body7>:DOWN</Body7>                                                       <Body8>01/10/07 13:47:56 hdsup xxx||Info Notify to:  Resolution Time= 01/10/07 17:48:00|</Body8>                                                                                                                                                                                                                                                                                                                           </ROOT>"

The transform.xslt file somehow produces these spaces and characters. If I manually remove all the spaces between </Body1> and <Body2>, and so on, and then run my second transform, it all works fine, but I cannot get rid of the spaces in between any other way.

Thanks in advance for all your help. I am also attaching transform.xslt so that you can have a look at it.


The transform.xslt code is here:
<xsl:stylesheet version ="1.0" xmlns:xsl ="http://www.w3.org/1999/XSL/Transform" >
      <xsl:output method = "xml" encoding="iso-8859-1" indent="no" omit-xml-declaration="yes" />
      <xsl:strip-space elements="*"/>
      <xsl:decimal-format name="dollars" decimal-separator="." grouping-separator="," minus-sign="-" zero-digit="0" digit="#" NaN="0.00"/>
      <xsl:template match ="/" >
      <ROOT>
            <xsl:call-template name="csvtoxml" >
                  <xsl:with-param name="StringToTransform" select="/ROOT" />
                  <xsl:with-param name="FieldNum">1</xsl:with-param>
            </xsl:call-template>
      </ROOT>
      </xsl:template>
      <xsl:template name ="csvtoxml" >
            <xsl:param name ="StringToTransform" />
            <xsl:param name ="FieldNum" />
            <xsl:choose>
                  <xsl:when test ="contains($StringToTransform,',')" >
                        <xsl:element name="hfield{$FieldNum}">
                              <xsl:choose>
                                    <xsl:when test="$FieldNum = 2">
                                          <xsl:value-of select ="format-number(substring-before($StringToTransform,','),'#####0.00;-#####0.00','dollars')" />
                                    </xsl:when>
                                    <xsl:otherwise>
                                          <xsl:value-of select ="substring-before($StringToTransform,',')" />
                                    </xsl:otherwise>
                              </xsl:choose>
                        </xsl:element>
                        <xsl:choose>
                              <xsl:when test ="$FieldNum &lt; 7">
                                    <xsl:call-template name ="csvtoxml" >
                                          <xsl:with-param name ="StringToTransform" select ="substring-after($StringToTransform,',')" />
                                          <xsl:with-param name ="FieldNum" select="$FieldNum + 1"/>
                                    </xsl:call-template>
                              </xsl:when>
                              <xsl:otherwise>
                                    <xsl:call-template name="SplitChar">
                                          <xsl:with-param name="str" select="substring-after($StringToTransform,',')" />
                                          <xsl:with-param name="num" select="1"/>
                                    </xsl:call-template>
                              </xsl:otherwise>
                        </xsl:choose>
                  </xsl:when>
            </xsl:choose> 
      </xsl:template>
      <xsl:template name="SplitChar">
            <xsl:param name="num" />
            <xsl:param name="str" />
            <xsl:choose>
                  <xsl:when test="contains($str, '^')">
                        <xsl:call-template name="SplitChar">
                              <xsl:with-param name="str" select="substring-before($str, '^')" />
                              <xsl:with-param name="num" select="$num"/>
                        </xsl:call-template>                                       
                <xsl:call-template name="SplitChar">
                 <xsl:with-param name="str" select="substring-after($str, '^')" />
                 <xsl:with-param name="num" select="$num + 1"/>
                 </xsl:call-template>                           
           </xsl:when>
           <xsl:otherwise>
                 <xsl:element name="{concat('Body', $num)}">
                   <xsl:value-of select="$str" />
                  </xsl:element>
           </xsl:otherwise>
       </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
YRKS
I am getting a "Data at the root level is invalid. Line 1, position 1." for that text fragment that isn't XML.

Bob
YRKS

ASKER

Sorry, I forgot to mention that you need to add a <Root></Root> tag to the input string.

So the string becomes
<Root>pmo,01/10/07 13:47:57,,TEST016876,,HD,,^:TEST016876^testing ^^107/01/10 13:47:56^1^:DOWN^01/10/07 13:47:56 hdsup xxx||Info Notify to:  Resolution Time= 01/10/07 17:48:00|</Root>

This is all I got in the output:

<ROOT />

Bob
YRKS

ASKER


Dim strTransformedMsg As String
   
        strTransformedMsg = TransformString("transform.xslt", sMqAsXml)


Public Shared Function TransformString(ByVal xslFileName As String, ByVal umlText As String) As String
        Dim buffer As Byte() = Encoding.GetEncoding("ISO-8859-1").GetBytes(umlText)  'UTF8.GetBytes(umlText)  '

        Using input As New MemoryStream(buffer)
            Dim reader As New XmlTextReader(input)
            Dim transform As New Xsl.XslCompiledTransform
            Dim xslSettings As New Xsl.XsltSettings()
           
            xslSettings.EnableScript = True
            transform.Load(xslFileName, xslSettings, New XmlUrlResolver())

            Using output As New MemoryStream()
                Dim writer As New XmlTextWriter(output, Encoding.GetEncoding("ISO-8859-1")) 'Encoding.UTF8) '
                transform.Transform(reader, writer)
                Return Encoding.GetEncoding("iso-8859-1").GetString(output.ToArray()) 'UTF8.GetString(output.ToArray)  '
            End Using
        End Using
    End Function

Also, for the input string, use <ROOT></ROOT>.
I remember getting that error because the element name is case sensitive.
Thanks
YRKS
It was the <ROOT> problem.  Now I get this:

<ROOT><hfield1>pmo</hfield1><hfield2>0.00</hfield2><hfield3 /><hfield4>TEST016876</hfield4><hfield5 /><hfield6>HD</hfield6><hfield7 /><Body1 /><Body2>:TEST016876</Body2><Body3>testing </Body3><Body4 /><Body5>107/01/10 13:47:56</Body5><Body6>1</Body6><Body7>:DOWN</Body7><Body8>01/10/07 13:47:56 hdsup xxx||Info Notify to:  Resolution Time= 01/10/07 17:48:00|</Body8></ROOT>
I don't see any problem with that.

Bob
YRKS

ASKER

Yes, there is no problem with this piece of code, but does the XML you get back have empty spaces between the tags, like </body1>                   <body2>ssssss</body2>                         <body3> sssssssssssss</body3>?

I have to take this XML and transform it into something else.

I cannot remove these empty spaces at all.
Can you just copy the output XML and paste it here?

Thanks
YRKS
YRKS

ASKER

I came across a post on Experts Exchange yesterday where they had the same problem. They were trying to remove the empty spaces between the tags and finally just ended up doing string manipulation.

I do not want to do string manipulation, so my workaround would be to write this intermediate XML to a file, save it, and then transform that, which works fine.

I was trying to avoid writing a file and saving it to disk.
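One possible explanation, not confirmed anywhere in this thread, is an encoding mismatch rather than the whitespace itself: the stylesheet omits the XML declaration, the intermediate result is turned into ISO-8859-1 bytes, and an XmlTextReader over a byte stream without a declaration assumes UTF-8. Any byte above 0x7F would then be rejected. Keeping the intermediate result as a String and reading it through a StringReader avoids both the disk file and the byte round trip. The sketch below assumes that approach; `TransformText` and `RunBoth` are hypothetical names, and it uses the plain Load(String) overload rather than the XsltSettings with EnableScript from the posted code.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Xsl

Module ChainedTransforms
    ' Runs one loaded stylesheet over XML text and returns the result as text.
    ' No byte encoding is involved, so the reader never has to guess one.
    Function TransformText(ByVal transform As XslCompiledTransform, ByVal xmlText As String) As String
        Dim sb As New StringBuilder()
        Using reader As XmlReader = XmlReader.Create(New StringReader(xmlText))
            ' Passing OutputSettings lets the writer honor the stylesheet's xsl:output element.
            Using writer As XmlWriter = XmlWriter.Create(sb, transform.OutputSettings)
                transform.Transform(reader, writer)
            End Using
        End Using
        Return sb.ToString()
    End Function

    ' firstXslt and secondXslt are hypothetical paths to the two stylesheets.
    Function RunBoth(ByVal firstXslt As String, ByVal secondXslt As String, ByVal inputXml As String) As String
        Dim first As New XslCompiledTransform()
        first.Load(firstXslt)
        Dim second As New XslCompiledTransform()
        second.Load(secondXslt)
        ' The intermediate document stays in memory as a String.
        Dim intermediate As String = TransformText(first, inputXml)
        Return TransformText(second, intermediate)
    End Function
End Module
```

The module has no Sub Main, so it is meant to be compiled into an existing project and called as, for example, RunBoth("transform.xslt", secondStylesheetPath, sMqAsXml), where secondStylesheetPath is a placeholder.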
YRKS
ASKER CERTIFIED SOLUTION
Bob Learned
YRKS

ASKER

Surprisingly, when I debug and then copy the value to Notepad, I can see the spaces. Anyway, thanks for all your help. In case you come across something, please put it in the discussion of this post; I will be checking regularly.
YRKS

ASKER

Thanks for all your help.
YRKS
I don't see spaces in the debugger for this line either:

Dim result As String = Encoding.GetEncoding("iso-8859-1").GetString(output.ToArray())

Bob
YRKS

ASKER

You know what, I will just go ahead and create my transform.xslt again and see whether I somehow introduced the empty spaces while writing it. I will check and let you know.
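Before retyping the stylesheet, it may be quicker to scan the existing file for characters that look like blanks but are not XML whitespace (a non-breaking space, for example). Such characters are not stripped as whitespace-only text, so they would be copied to the output on every template call. This is only a guess at the cause, and the sketch below is a hypothetical helper: `FindSuspectChars` is an invented name, and reading the file as ISO-8859-1 is an assumption about how it was saved.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module StylesheetScan
    ' Lists every character that is neither printable ASCII nor tab/CR/LF,
    ' together with its 1-based line and column.
    Function FindSuspectChars(ByVal text As String) As String
        Dim sb As New StringBuilder()
        Dim line As Integer = 1
        Dim column As Integer = 0
        For Each c As Char In text
            If c = ControlChars.Lf Then
                line += 1
                column = 0
            Else
                column += 1
                Dim code As Integer = AscW(c)
                Dim ordinary As Boolean = (code >= 32 AndAlso code <= 126) _
                    OrElse c = ControlChars.Tab OrElse c = ControlChars.Cr
                If Not ordinary Then
                    sb.AppendLine(String.Format("line {0}, column {1}: U+{2:X4}", line, column, code))
                End If
            End If
        Next
        Return sb.ToString()
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: StylesheetScan <path to .xslt file>")
            Return
        End If
        ' Assumption: the stylesheet was saved as ISO-8859-1.
        Dim text As String = File.ReadAllText(args(0), Encoding.GetEncoding("iso-8859-1"))
        Console.Write(FindSuspectChars(text))
    End Sub
End Module
```

If the scan reports nothing, the stylesheet text itself is clean and the padding must be coming from somewhere else.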

I really appreciate all your time and help.
YRKS
I took the .xslt text that you posted, and the input text with the <ROOT> wrapper, and it worked.  I have Windows XP SP2 with 2005 Professional SP1 (if that makes a difference).

Bob