XSLT: how to add bookmarks based on h1 nodes in one file and link to them in another

Sample XML file (the solution to the described problem should of course work with any other XHTML file):

<body>
  <h1>titel 1</h1>
  <p>some body <b>content</b> here</p>
  <div>
    <h1>2nd title</h1>
    <p>lorem etc.</p>
  </div>
  untagged content
  <div><span><h1>last titel</h1>test test test test</span></div>
</body>

Using two separate XSLT files I want to generate two separate HTML files from the above XML.

The first file (index.htm) will contain a list with links to bookmarks in the second file (main.htm).
So the first XSLT needs to generate a list of links and the second one needs to insert bookmarks just before every h1 tag.

I already have the first XSLT file working:

<xsl:template match="//h1">
    <a class="index-header" href="info.asp#bookmark{position()}" target="main">
        <xsl:value-of select="node()"/>       
    </a>
</xsl:template>

But I am not sure how to get the second one, the one that will generate main.htm.
Everything I try produces a list of bookmarks after the body, which is not what I want: each bookmark should be inserted just before its related h1 tag.

Regards,
Benny

bvluggenAsked:

sparkplugCommented:
Hi,

Try the following XSLT which uses the identity template:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:template match="@*|node()">
            <xsl:copy>
                  <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
      </xsl:template>
      <xsl:template match="h1">
            <a name="#bookmark{generate-id(.)}"/>
            <xsl:copy>
                  <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
      </xsl:template>
</xsl:stylesheet>

You can't really use position() in this example, as it only returns the position of the current node within the node list being processed, which here is the list of children of its parent node. Both title 2 and title 3 will therefore be named bookmark1. The generate-id() function can be used to generate a unique id.
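
The behaviour of position() described here can be checked with a small diagnostic stylesheet. This is only a sketch: it assumes the sample input is well-formed (the second div closed), and it adds xsl:strip-space, which is not in the stylesheet above, so that whitespace-only text nodes do not count towards the position in any processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Assumption: ignore whitespace-only text nodes so they do not affect position() -->
  <xsl:strip-space elements="*"/>

  <!-- Same traversal as the identity template, but without copying anything -->
  <xsl:template match="@*|node()">
    <xsl:apply-templates select="@*|node()"/>
  </xsl:template>

  <!-- Report the context position of each h1 among the nodes selected from its parent -->
  <xsl:template match="h1">
    <xsl:value-of select="concat(., ': position() = ', position(), '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

In the sample document each h1 is the first child of its parent, so under these assumptions every heading should report the same position, which is why the value cannot serve as a unique bookmark name.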


>S'Plug<
bvluggenAuthor Commented:
Hi sparkplug,

Thanks for your comment.

If I use your XSLT, how would I need to change the first XSLT so that it 'knows' about the generated bookmarks?
If I replace the position function in the first XSLT with the generate-id function, won't it generate different ids? How do I get the same names for the bookmark and the link in both generated XHTML files?

Benny
sparkplugCommented:
Good point.

I had thought of that, but for some reason, assumed that if you put the generate-id(.) in the first XSLT as well it would generate the same numbers in both. It doesn't. Note however that the position() function doesn't produce unique bookmarks in either XSLT. The first one generates 1 for each bookmark.

Use the following function instead:

<a name="bookmark{count(preceding::h1)+1}"/>

This will count the number of preceding h1 tags plus one.
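
Put together, a stylesheet for main.htm might look like the sketch below. It simply combines the identity template from the earlier comment with the count(preceding::h1)+1 expression; nothing else is new.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- For each h1, emit an anchor first, then copy the h1 itself -->
  <xsl:template match="h1">
    <!-- No leading '#' in name: the '#' is only the fragment separator in an href -->
    <a name="bookmark{count(preceding::h1) + 1}"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The index stylesheet would then use the same expression in its link, for example href="main.htm#bookmark{count(preceding::h1)+1}", so that both files number the headings in document order. If the result is served as HTML rather than XHTML, method="html" may be the safer choice, since it writes the empty anchor with an explicit end tag.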

>S'Plug<

bvluggenAuthor Commented:
I just tested it and as expected the generated id's will differ in both files.

I was thinking that maybe we could use the content of the h1 tags to generate a unique, compact id.
I think it's safe to assume that the content of all h1 tags is unique on a page. What do you think?
I did some testing and it seems to work. The only problem is that the content of some h1 tags can be long and contain characters that would break the whole linking idea.

Do you know how to write an XSLT function that generates an id based on the text content of the h1 tag? For example, one that strips spaces, vowels and characters not in the alphabet, and maybe limits the length to 20 characters?
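
For what it's worth, XSLT 1.0 has no user-defined functions, but an id of this kind can be approximated with nested translate() calls and substring(). The sketch below is one possible reading of the request: the variable name keep and its consonant-only character list are assumptions, and the stylesheet only prints the ids as text so they can be inspected.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: the characters allowed in an id are the consonants only -->
  <xsl:variable name="keep"
                select="'bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ'"/>

  <xsl:template match="/">
    <xsl:for-each select="//h1">
      <!-- Inner translate(): delete the allowed characters, leaving only the unwanted ones -->
      <xsl:variable name="unwanted" select="translate(., $keep, '')"/>
      <!-- Outer translate(): delete the unwanted characters; substring() caps the length at 20 -->
      <xsl:value-of select="concat('bookmark',
                                   substring(translate(., $unwanted, ''), 1, 20),
                                   '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The same expression could be placed in the name and href attribute value templates of the two stylesheets. Note that stripping this aggressively weakens the uniqueness assumption: headings such as "titel 1" and "titel 2" both reduce to the same string, so the resulting ids can collide.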

Benny

bvluggenAuthor Commented:
Whoops; missed your comment of 9:21h. I will test it in a minute...
bvluggenAuthor Commented:
Hi S'Plug,

Yep, we got a winner; count(preceding::h1)+1 does the trick!

Thanks a lot!
Regards Benny
petiexCommented:
Hold on just a minute there, S'Plug. Good solution, but I must protest your libelous statement against the generate-id() function. ;-)

I'm pretty sure that generate-id() does generate the same id for a given node in a given document even if you are calling it from different templates or different stylesheets. I mean, otherwise, what good would the function be? Using the following xsl to generate the index.htm, you get the same id's (hence matching anchor tags) that were generated by your first xsl for the main.htm file:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
    <html><body>
    <xsl:apply-templates select="//h1"/>
    </body></html>
</xsl:template>

<xsl:template match="h1">
    <a class="index-header" href="main.htm#bookmark{generate-id()}" target="main">
        <xsl:value-of select="."/><br/>      
    </a>
</xsl:template>
</xsl:stylesheet>
sparkplugCommented:
Not with MSXML 4. On the example XML given, my XSLT transforms to:

<body>
      <a name="#bookmarkh151003368" />
      <h1>titel 1</h1>
      <p>some body <b>content</b> here</p>
      <div>
            <a name="#bookmarkh151004120" />
            <h1>2nd title</h1>
            <p>lorem etc.</p>
      </div>
  untagged content
  <div>
            <span>
                  <a name="#bookmarkh150401224" />
                  <h1>last titel</h1>test test test test</span>
      </div>
</body>

Your XSLT transforms to:

<html><body><a class="index-header" href="main.htm#bookmarkh147482984" target="main">titel 1<br></a><a class="index-header" href="main.htm#bookmarkh147483784" target="main">2nd title<br></a><a class="index-header" href="main.htm#bookmarkh147484472" target="main">last titel<br></a></body></html>

In fact the IDs are different every time either of the transformations is repeated.

Here's an excerpt from the W3C Recommendation http://www.w3.org/TR/xslt

----------------------------------------------------------------
Function: string generate-id(node-set?)

The generate-id function returns a string that uniquely identifies the node in the argument node-set that is first in document order. The unique identifier must consist of ASCII alphanumeric characters and must start with an alphabetic character. Thus, the string is syntactically an XML name. An implementation is free to generate an identifier in any convenient way provided that it always generates the same identifier for the same node and that different identifiers are always generated from different nodes. An implementation is under no obligation to generate the same identifiers each time a document is transformed. There is no guarantee that a generated unique identifier will be distinct from any unique IDs specified in the source document. If the argument node-set is empty, the empty string is returned. If the argument is omitted, it defaults to the context node.
----------------------------------------------------------------

Note the line "An implementation is under no obligation to generate the same identifiers each time a document is transformed. "
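
The guarantee that does hold ("the same identifier for the same node") applies within a single transformation, so generate-id() is still suitable when the links and the anchors are written by the same run. A minimal sketch, assuming a single output file (which is not what was asked for in this thread) and a source document whose root element is body:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Index: one link per h1, keyed by the id generated during this run -->
        <ul>
          <xsl:for-each select="//h1">
            <li><a href="#{generate-id()}"><xsl:value-of select="."/></a></li>
          </xsl:for-each>
        </ul>
        <!-- Content: copy the children of the source body element -->
        <xsl:apply-templates select="body/node()"/>
      </body>
    </html>
  </xsl:template>

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Anchor before each h1, using the same generated id as the index link -->
  <xsl:template match="h1">
    <a name="{generate-id()}"/>
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because both calls to generate-id() see the same node in the same run, the link and the anchor match. Across two separate runs, as with index.htm and main.htm, that is exactly what the specification does not promise.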

QED :^)

>S'Plug<
bvluggenAuthor Commented:
Hi Petiex,

First of all thank you for your comment.

Although your comment is directed at S'Plug, I felt I had to do one more test using the generate-id function. After all, if you were right then I would not have been telling the truth, right? ;-)  ...but again the test showed that the two htm files end up with (slightly) different ids.

For the first 3 links the following id's are generated: kIDALA0W,kIDANA0W, kIDARA0W
For the first 3 anchors following id's are generated: kIDALI0W,kIDANI0W,kIDARI0W

Maybe this is a parser thing? I am using MSXML 4's XSLT parser. But then again, if you think about it, doesn't it make sense that a function that's supposed to generate a unique id will do so every time it is called? What I was asking for isn't a unique id but a context-related static id that only changes when the content changes, and that's what S'Plug delivered.


Regards,
Benny
petiexCommented:
[rant]
Sometimes Microsoft makes me so mad, I could just shake my tiny fists in impotent rage. I'm sure they have lawyers reading through the standards to determine the minimum required functionality.
Lawyer: "Master, look! We're under no obligation!"
Gates: "Excellent. Everything is falling into place. Mwuhaha."
[/rant]

Xalan does generate the same id for every transformation, for what it's worth. Thanks for the heads-up about the evil empire. I'm glad my company is only doing server-side transformations. For now, anyway.