Solved

tokenize() within tokenize() not working right ...

Posted on 2007-03-18
Last Modified: 2013-11-18
Hi, I have xml like this (list of author names broken apart by "^"):
<AU>Graziano, Nicole^McGuire, Michael^Adams, Craig^Roberson, Alan^Jiang, Hua^Blute, Nicole</AU>

That I need to turn into this:
        <contrib-group>
           <contrib contrib-type="author">
             <name>
               <surname>Graziano</surname>
               <given-names>Nicole</given-names>
             </name>
             <name>
               <surname>McGuire</surname>
               <given-names>Michael</given-names>
             </name>
             <name>
               <surname>Adams</surname>
               <given-names>Craig</given-names>
             </name>
             <name>
               <surname>Roberson</surname>
               <given-names>Alan</given-names>
             </name>
             <name>
               <surname>Jiang</surname>
               <given-names>Hua</given-names>
             </name>
             <name>
               <surname>Blute</surname>
               <given-names>Nicole</given-names>
             </name>
           </contrib>
        </contrib-group>

I have tried nesting one tokenize() inside another. The outer tokenize() splits the authors correctly, but the inner one, which should split each name at the comma, does not behave as I expected:

 <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                <xsl:for-each select="tokenize(.,',')">
                                    <name>
                                        <surname>
                                            <xsl:value-of select=".[1]"/>
                                        </surname>
                                        <given-names>
                                            <xsl:value-of select=".[2]"/>
                                        </given-names>
                                    </name>
                                </xsl:for-each>
                            </xsl:when>
                        </xsl:choose>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>

The output I get with the above XSLT is:
<contrib-group><contrib contrib-type="author"><name><surname>Graziano</surname><given-names/></name><name><surname> Nicole</surname><given-names/></name><name><surname>McGuire</surname><given-names/></name><name><surname> Michael</surname><given-names/></name><name><surname>Adams</surname><given-names/></name><name><surname> Craig</surname><given-names/></name><name><surname>Roberson</surname><given-names/></name><name><surname> Alan</surname><given-names/></name><name><surname>Jiang</surname><given-names/></name><name><surname> Hua</surname><given-names/></name><name><surname>Blute</surname><given-names/></name><name><surname> Nicole</surname><given-names/></name></contrib></contrib-group>
which is not quite right.

Does anyone have any ideas, or alternative ways to achieve this?
Question by:PurpleSlade

Author Comment

by:PurpleSlade
ID: 18746242
Well, I got it to do what I needed, since author names will always be comma-delimited.


    <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                <name>
                                    <surname>
                                        <xsl:value-of select="substring-before(.,',')"/>
                                    </surname>
                                    <given-names>
                                        <xsl:value-of select="substring-after(.,',')"/>
                                    </given-names>
                                </name>
                            </xsl:when>
                        </xsl:choose>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>

However, if any experts want to tell me how silly I was to try it this way and would like to propose another solution, I am quite open to it and would love to see an alternative. I am very new to XSLT, so I am keen to see better ways of doing things.

:-)

Accepted Solution

by:Geert Bormans (earned 500 total points)
ID: 18746674
There were two mistakes in your first stylesheet.
Here is a corrected version:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <name>
                            <xsl:choose>
                                <xsl:when test="contains(.,',')">
                                    <surname>
                                        <xsl:value-of select="tokenize(.,',')[1]"/>
                                    </surname>
                                    <given-names>
                                        <xsl:value-of select="tokenize(.,',')[2]"/>
                                    </given-names>
                                </xsl:when>
                            </xsl:choose>
                        </name>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>
   
</xsl:stylesheet>

1. The <name> literal element was in the wrong place. It sat inside the inner for-each, so you created one <name> per token instead of one per author.
2. tokenize(.,',') returns a sequence of two items, and the inner for-each then iterates over each of them in turn (two iterations).
Inside that loop the context item is a single token, so .[1] is just that token and .[2] is always the empty sequence.
That is why every name part ends up in <surname> and <given-names> is always empty.
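
To see point 2 in isolation, here is a minimal sketch. The <tokens> and <token> element names and the hard-coded string are made up for illustration, and it assumes an XSLT 2.0 processor; it can be run against any input document. It prints each token together with the value of .[2] inside the loop:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <!-- Hypothetical demo template: the input document's content is ignored -->
    <xsl:template match="/">
        <tokens>
            <xsl:for-each select="tokenize('Graziano, Nicole', ',')">
                <!-- Inside the loop, "." is one token (a single string),
                     not the whole two-item sequence -->
                <token position="{position()}" second="{.[2]}">
                    <xsl:value-of select="."/>
                </token>
            </xsl:for-each>
        </tokens>
    </xsl:template>
</xsl:stylesheet>
```

Both <token> elements should come out with an empty second attribute, because a single string has no second item to select.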

You could solve issue 2 the way I show in my template, that is, without the inner for-each.
This means calling tokenize() twice, which is more expensive than the two substring functions.
You could store the sequence in a variable instead, though:

                            <xsl:choose>
                                <xsl:when test="contains(.,',')">
                                    <xsl:variable name="sq-names" select="tokenize(.,',')"/>
                                    <surname>
                                        <xsl:value-of select="$sq-names[1]"/>
                                    </surname>
                                    <given-names>
                                        <xsl:value-of select="$sq-names[2]"/>
                                    </given-names>
                                </xsl:when>
                            </xsl:choose>

I think this is how I would solve it if a name could have more than two comma-separated parts.
If there are only ever two parts, I would use substring-before() and substring-after() as you do.
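
One further detail: splitting on ',' leaves the space after the comma in the second token (the " Nicole" in the output quoted in the question). A possible variation, offered only as a sketch, wraps each part in normalize-space() and drops the outer xsl:choose, since tokenize() returns the whole string as a single item when there is no '^'. The variable name parts is invented here, and the sketch assumes that a name without a comma should simply go into <surname> with an empty <given-names>, which differs from the templates above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="AU">
        <contrib contrib-type="author">
            <!-- One iteration per author; a string without '^' yields one item -->
            <xsl:for-each select="tokenize(., '\^')">
                <!-- Assumed name: parts holds the comma-separated pieces of one author -->
                <xsl:variable name="parts" select="tokenize(., ',')"/>
                <name>
                    <surname>
                        <xsl:value-of select="normalize-space($parts[1])"/>
                    </surname>
                    <given-names>
                        <!-- normalize-space() trims the space left after the comma
                             and returns '' if there is no second part -->
                        <xsl:value-of select="normalize-space($parts[2])"/>
                    </given-names>
                </name>
            </xsl:for-each>
        </contrib>
    </xsl:template>
</xsl:stylesheet>
```

Whether dropping the "no caret" branch is acceptable depends on how single-author records should be reported, so treat that part as a design choice rather than a fix.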

cheers

Geert

Author Comment

by:PurpleSlade
ID: 18748259
Gertone, thanks!

Expert Comment

by:Geert Bormans
ID: 18749965
welcome