Solved

tokenize() within tokenize() not working right ...

Posted on 2007-03-18
4
275 Views
Last Modified: 2013-11-18
Hi, I have xml like this (list of author names broken apart by "^"):
<AU>Graziano, Nicole^McGuire, Michael^Adams, Craig^Roberson, Alan^Jiang, Hua^Blute, Nicole</AU>

That I need to turn into this:
           <contrib contrib-type="author">
             <name>
               <surname>Graziano</surname>
               <given-names>Nicole</given-names>
             </name>
             <name>
               <surname>McGuire</surname>
               <given-names>Michael</given-names>
             </name>
             <name>
               <surname>Adams</surname>
               <given-names>Craig</given-names>
             </name>
             <name>
               <surname>Roberson</surname>
               <given-names>Alan</given-names>
             </name>
             <name>
               <surname>Jiang</surname>
               <given-names>Hua</given-names>
             </name>
             <name>
               <surname>Blute</surname>
               <given-names>Nicole</given-names>
             </name>
         </contrib>
        </contrib-group>

I have tried to use tokenize within tokenize - it successfully parses the first tokenize, but is having problems with the second:

 <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                <xsl:for-each select="tokenize(.,',')">
                                    <name>
                                        <surname>
                                            <xsl:value-of select=".[1]"/>
                                        </surname>
                                        <given-names>
                                            <xsl:value-of select=".[2]"/>
                                        </given-names>
                                    </name>
                                </xsl:for-each>
                            </xsl:when>
                        </xsl:choose>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>

The output with the above xslt I get is:
<contrib-group><contrib contrib-type="author"><name><surname>Graziano</surname><given-names/></name><name><surname> Nicole</surname><given-names/></name><name><surname>McGuire</surname><given-names/></name><name><surname> Michael</surname><given-names/></name><name><surname>Adams</surname><given-names/></name><name><surname> Craig</surname><given-names/></name><name><surname>Roberson</surname><given-names/></name><name><surname> Alan</surname><given-names/></name><name><surname>Jiang</surname><given-names/></name><name><surname> Hua</surname><given-names/></name><name><surname>Blute</surname><given-names/></name><name><surname> Nicole</surname><given-names/></name></contrib></contrib-group>
which is not quite right.

Anyone have any ideas or alternate ways to achieve this?
0
Comment
Question by:PurpleSlade
  • 2
  • 2
4 Comments
 
LVL 2

Author Comment

by:PurpleSlade
Comment Utility
Well, I got it to do what I needed it to do since authors will always have a , delimited name.


    <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                    <name>
                                        <surname>
                                            <xsl:value-of select="substring-before(.,',')"/>
                                        </surname>
                                        <given-names>
                                            <xsl:value-of select="substring-after(.,',')"/>
                                        </given-names>
                                    </name>
                            </xsl:when>
                        </xsl:choose>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>

However, if any experts want to tell me how silly I was to try it this way and would like to pose another solution, I am quite open to it and would love to see an alternative.  I am very new to xslt so I am quite interested to see better ways of doing things.

:-)
0
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 500 total points
Comment Utility
There were two mistakes in your first stylesheet
Here is a corrected version

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="AU">
        <contrib contrib-type="author">
            <xsl:choose>
                <xsl:when test="contains(., '^')">
                    <xsl:for-each select="tokenize(.,'\^')">
                        <name>
                            <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                        <surname>
                                            <xsl:value-of select="tokenize(.,',')[1]"/>
                                        </surname>
                                        <given-names>
                                            <xsl:value-of select="tokenize(.,',')[2]"/>
                                        </given-names>
                            </xsl:when>
                        </xsl:choose>
                        </name>
                    </xsl:for-each>
                </xsl:when>
                <xsl:otherwise>Doesn't contain a carat</xsl:otherwise>
            </xsl:choose>
        </contrib>
    </xsl:template>
   
</xsl:stylesheet>

1. The <name> literal tag was positioned at the wrong location, for that, you created too many <name> nodes
2. Using tokenize with the ',', you create a sequence of two items. Then you iterate over each one of them (iteration of two)
Because of that the current context is a single token from the tokenize and the .[2] will allways be null.
That is why every name is allways stuffed in <sur-name>

You could solve issue 2. the way I show in my template, so without a for-each.
This means calling the tokenize() twice, which means more expensive then the substrings
You could store the sequence in a variable though

                            <xsl:choose>
                            <xsl:when test="contains(.,',')">
                                <xsl:variable name="sq-names" select="tokenize(.,',')"/>
                                        <surname>
                                            <xsl:value-of select="$sq-names[1]"/>
                                        </surname>
                                        <given-names>
                                            <xsl:value-of select="$sq-names[2]"/>
                                        </given-names>
                            </xsl:when>
                        </xsl:choose>

I think this is how I would solve it, if there were potentially more than two names
If there were only two names, I would use substring-before and -after as you do

cheers

Geert
0
 
LVL 2

Author Comment

by:PurpleSlade
Comment Utility
Gertone, thanks!
0
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
welcome
0

Featured Post

Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Suggested Solutions

SASS allows you to treat your CSS code in a more OOP way. Let's have a look on how you can structure your code in order for it to be easily maintained and reused.
JavaScript has plenty of pieces of code people often just copy/paste from somewhere but never quite fully understand. Self-Executing functions are just one good example that I'll try to demystify here.
Viewers will learn one way to get user input in Java. Introduce the Scanner object: Declare the variable that stores the user input: An example prompting the user for input: Methods you need to invoke in order to properly get  user input:
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now