replacing and parsing xml message

tesmc (ASKER)

I have the following xml input

<CommandRS>
 <Response>MI - COLLECT OPTIONAL FEES
&lt;X&gt; X IF SAME
                            &lt;CAD&gt;
&lt; &gt; NAMEID&lt;1.1  &gt; &lt;  &gt;
FEE CODE&lt;438&gt;  ANCILLARY SEAT FEE      FEE AMT&lt;   15.00&gt;
TKT NBR&lt;838574006&gt;CPN&lt;01&gt;FLT 1141 DTE 07AUG BRDOFF MIAYYZ
 TAXES &lt;       &gt; &lt;   &gt; &lt;       &gt; &lt;   &gt; &lt;       &gt; &lt;   &gt;
&lt; &gt; NAMEID&lt;     &gt;                                  SEG NBR&lt;  &gt;
 </Response>
</CommandRS>


I want to replace &lt; with "<" and &gt; with ">", so that it looks like this:

<CommandRS>
 <Response>MI - COLLECT OPTIONAL FEES
<X> X IF SAME
                            <CAD>
< > NAMEID<1.1  > <  >
FEE CODE<438>  ANCILLARY SEAT FEE      FEE AMT<   15.00>
TKT NBR<838574006>CPN<01>FLT 1141 DTE 07AUG BRDOFF MIAYYZ
 TAXES <       > <   > <       > <   > <       > <   >
< > NAMEID<     >                                  SEG NBR<  >
 </Response>
</CommandRS>

I then want to remove any free text that is not inside the <> (while preserving any spaces found within the <>), so that the final output is:

<CommandRS>
 <Response>
<X><CAD>< ><1.1  ><  ><438> <   15.00><838574006><01><       ><   ><       ><   ><       ><   >< > <     > <  >
 </Response>
</CommandRS>


I'm feeding this input XML into an *.xsl stylesheet.
Please assist.

Gertone (Geert Bormans)

The easiest way to do this is to use the infamous disable-output-escaping (d-o-e).

Note that d-o-e is an optional feature of the XSLT spec,
so processors may legitimately choose not to implement it
(the XSLT processor in the Firefox browser does not support it, for instance).

This is what you should do:
<xsl:value-of select="Response" disable-output-escaping="yes"/>
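
For context, here is a minimal sketch of how that line might sit in a complete XSLT1 stylesheet. The identity template and the overall layout are assumptions for illustration and were not posted in this thread; only the value-of with d-o-e comes from the answer above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- Assumed identity template: copies everything not handled below -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Response: write the text content with output escaping disabled,
         so each &lt; and &gt; is serialised as a literal < and > -->
    <xsl:template match="Response">
        <xsl:copy>
            <xsl:value-of select="." disable-output-escaping="yes"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Keep in mind that the serialised Response content is then no longer well-formed XML (a run like '< >' is not a tag), so this only makes sense if whatever reads the output treats it as text.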
By the way, can you put XSLT questions in the XSLT zone, or give an indication in the title? It was only by coincidence that I did not skip this question.

tesmc (ASKER)

I can never find that group when I'm submitting my question. That's why I always just place them in CSS/XML.

Gertone (Geert Bormans)

Ah OK, did the d-o-e trick work?

tesmc (ASKER)

Yes, the disable-output-escaping worked.

Gertone (Geert Bormans)

Ah, great.

Use it sparingly, because it is a bad serialisation trick,
and it could tempt you into building trees in an unconventional way.

tesmc (ASKER)

And how can I resolve the second question:

"I then want to remove any free text that is not inside the <> (while maintaining the spaces within the <> if any found). So that the final output is
<CommandRS>
 <Response>
<X><CAD>< ><1.1  ><  ><438> <   15.00><838574006><01><       ><   ><       ><   ><       ><   >< > <     > <  >
 </Response>
</CommandRS>
"

Gertone (Geert Bormans)

Ah, you basically only want the text inside the '<...>'?

Which XSLT processor are you using?
You would need either a recursive substring algorithm in XSLT1
or some regex handling in XSLT2.
The second one is a whole lot easier, so it would help if we could use XSLT2 (Saxon 9, that is).

tesmc (ASKER)

I believe XSLT2.

tesmc (ASKER)

Yes, the text inside <..>

Gertone (Geert Bormans)

For XSLT2, this will do the trick:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <!-- Keep only the bracketed runs of the Response text; all other text is dropped -->
    <xsl:template match="Response">
        <xsl:copy>
            <!-- Once the attribute value is unescaped, the regex reads (<[^>]*>) -->
            <xsl:analyze-string select="." regex="(&lt;[^&gt;]*&gt;)">
                <xsl:matching-substring>
                    <!-- d-o-e writes the match with literal angle brackets -->
                    <xsl:value-of disable-output-escaping="yes" select="regex-group(1)"/>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

tesmc (ASKER)

Gertone:

That's giving me the following warning:
"xsl:analyze-string cannot be a child of the xsl:copy element."

Gertone (Geert Bormans)

Well, the code was tested.
I have never seen that warning before, but essentially I believe it means you don't have an XSLT2 processor.
Let me check.

ASKER CERTIFIED SOLUTION
Gertone (Geert Bormans)

This solution is only available to members.

tesmc (ASKER)

I'm using Visual Studio 2008, where I have the .xsl file and the XML as input. That's how I'm debugging it.

Gertone (Geert Bormans)

Ah, Visual Studio uses msxml, and that only supports XSLT1.
The most recent XSLT I posted will do what you need.

I tested the XSLT2 stylesheet with some XSLT1 processors to see what feedback they gave,
and I got various messages. I did not test msxml, but the XSLT1 solution I gave is generic.

tesmc (ASKER)

Is it the case that your snippet will not work with msxml?
You mentioned earlier that in that case there was a recursive method option.

Gertone (Geert Bormans)

I am not sure you have seen all my posts.

#a39264302 is an XSLT2 solution to your problem, which will not work with msxml.

#a39266170 is the recursive XSLT1 solution, which is generic and will work in every XSLT1 processor that supports d-o-e (including msxml).
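
The certified post itself is not shown in this thread, so what follows is only a sketch of what a recursive XSLT1 approach can look like, not the actual code from #a39266170. The template name keep-bracketed and its text parameter are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:template match="Response">
        <xsl:copy>
            <xsl:call-template name="keep-bracketed">
                <xsl:with-param name="text" select="string(.)"/>
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>

    <!-- Hypothetical helper: emits each bracketed run and drops the free text -->
    <xsl:template name="keep-bracketed">
        <xsl:param name="text"/>
        <!-- Go on only while there is still a '<' followed somewhere by a '>' -->
        <xsl:if test="contains(substring-after($text, '&lt;'), '&gt;')">
            <xsl:variable name="rest" select="substring-after($text, '&lt;')"/>
            <!-- d-o-e only on the two delimiters -->
            <xsl:value-of select="'&lt;'" disable-output-escaping="yes"/>
            <!-- The content between the delimiters is written normally -->
            <xsl:value-of select="substring-before($rest, '&gt;')"/>
            <xsl:value-of select="'&gt;'" disable-output-escaping="yes"/>
            <!-- Recurse on whatever follows the closing '>' -->
            <xsl:call-template name="keep-bracketed">
                <xsl:with-param name="text" select="substring-after($rest, '&gt;')"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>

</xsl:stylesheet>
```

Each call writes one '<...>' run and then recurses on the remainder, so the free text between the runs is never written. It still depends on the processor honouring d-o-e.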

tesmc (ASKER)

Gertone,
that last code worked perfectly: it replaced the &gt; and &lt; entities and parsed the string.

May I ask you to explain what this code does? Why is
disable-output-escaping applied to "&lt;" and not to the entire node, as in
<xsl:value-of select="Response" disable-output-escaping="yes"/>?

tesmc (ASKER)

So the output returns
<Response><X><CAD>< ><1.1  ><  ><438><   15.00><838574006><01><       ><   ><       ><   ><       ><   >< ><     ><  ></Response>

which is great. How can I end the parsing with an <X>, such that the output is
<Response><X><CAD>< ><1.1  ><  ><438><   15.00><838574006><01><       ><   ><       ><   ><       ><   >< ><     ><  ><X></Response>

tesmc (ASKER)

Sorry for asking so many questions; I'm just trying to familiarize myself with this.
Now that the output gets parsed successfully, why am I unable to add an attribute? For example:


      <xsl:attribute name="Version">
        <xsl:text>XML1.0.1</xsl:text>
      </xsl:attribute>

        
I get "An item of type 'Attribute' cannot be constructed within a node of type 'Root'."

I want the response to read
<CommandRQ Version="2003A.TsabreXML1.0.1">
 <Request>
<X><CAD>< ><1.1  ><  ><438> <   15.00><838574006><01><       ><   ><       ><   ><       ><   >< > <     > <  ><X>
 </Request>
</CommandRQ>

tesmc (ASKER)

Thank you.

Gertone (Geert Bormans)

Welcome,

it seems I missed a couple of follow-ups over the weekend.

Your last question: I assume you were trying to add the attribute to the document root, not the root element. I think I showed a good way in your other question.
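
To illustrate the difference: xsl:attribute has to be instantiated while an element is being built. Below is a minimal sketch, assuming the CommandRQ and Request names and the Version value from the desired output above. The match pattern and the rest of the layout are assumptions, and this is not the code from the other question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!-- Match the root element, then build the new root element -->
    <xsl:template match="/CommandRS">
        <CommandRQ>
            <!-- The attribute is created inside CommandRQ, before any child nodes -->
            <xsl:attribute name="Version">
                <xsl:text>2003A.TsabreXML1.0.1</xsl:text>
            </xsl:attribute>
            <Request>
                <!-- Plain copy of the text; the bracket handling is left out here -->
                <xsl:value-of select="Response"/>
            </Request>
        </CommandRQ>
    </xsl:template>

</xsl:stylesheet>
```

If the same xsl:attribute sits directly in a template matching "/", with no literal result element, xsl:element or xsl:copy around it, there is no element to attach it to, which is what the 'Root' in the error message points at.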

Your second I did not get. Do you want to add an <X> after the parsing, or do you want the parsing to end when it hits an <X>?

Your first: the d-o-e disables the escaping, forcing a &lt; in the output to become a real <,
but it is only necessary on the '<' and '>', not on what is inside.