Solved

How to convert url within an XML node to a hyperlink during XSLT transformation?

Posted on 2009-05-11
Last Modified: 2013-11-18
Hello Everyone,

I need to convert a URL within an XML node into a hyperlink during XSLT transformation.

<xml>
  <mydata>
    this is my data.  my homepage is http://www.mypage.com.  please visit!
  </mydata>
</xml>

How would I convert the URL to a hyperlink?  My URLs always start with "http://".

I am stuck using MSXML 4.0 as my processor.

Thanks in advance for any help provided.
Question by:greatseats

Accepted Solution

by:
Geert Bormans earned 500 total points
ID: 24355320
This is a direct use-case for XSLT2, but I accept that you are bound to msxml4
(please check carefully that you are really bound to msxml4, since there is a .net version of saxon for XSLT2)
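
For readers who can use an XSLT 2.0 processor, a minimal sketch of that route follows. The regular expression and its punctuation rule are assumptions made for illustration, not part of the original question, and MSXML 4.0 cannot run this stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: needs an XSLT 2.0 processor such as Saxon. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="mydata">
    <!-- Match 'http://' plus non-space characters, backtracking so that the
         match never ends in sentence punctuation (an assumed rule). -->
    <xsl:analyze-string select="string(.)" regex="http://\S*[^\s.,;:!?]">
      <xsl:matching-substring>
        <a href="{.}"><xsl:value-of select="."/></a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text between URLs is copied through unchanged. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Because xsl:analyze-string visits every match, several URLs in one text block are handled without explicit recursion, and a sentence-ending period stays outside the link.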

Anyway, here is a first stab at it.
If you really need this to be bulletproof, it will take a lot of testing.
Starting the URL is easy; ending it is a lot harder.
So, how predictable is your data?
You already have a good example here: a '.' is allowed in a URL, but not as the last character.
That is tough here, because splitting on the space leaves the '.' as the last character.

Can you have multiple "http://" in one text block? If that is the case, we need recursion.

This can become a very complex issue, so it is important that we agree clearly on what is expected in the data and what is not.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="mydata">
    <xsl:choose>
        <xsl:when test="contains(., 'http://')">
            <xsl:value-of select="substring-before(., 'http://')"/>
            <a href="http://{substring-before(substring-after(., 'http://'), ' ')}">
                <xsl:value-of select="substring-before(substring-after(., 'http://'), ' ')"/>
            </a>
            <xsl:text> </xsl:text>
            <xsl:value-of select="substring-after(substring-after(., 'http://'), ' ')"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>
</xsl:stylesheet>

Author Comment

by:greatseats
ID: 24355921
Gertone,

Thanks - this is a great start.  We only have one instance of "http" per text block.  I will try to implement what you have given me and see how it goes.

Expert Comment

by:Geert Bormans
ID: 24356071
OK, so no recursion required,
then the only issue will be to find the end of the link

Author Closing Comment

by:greatseats
ID: 31580164
Gertone,

Your solution worked great.  I made some slight modifications to take into account instances where the URL is at the end of the string and when it is immediately followed by a period, as you mentioned.

Thanks!
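
The modified stylesheet was not posted. A minimal sketch of what such changes could look like in XSLT 1.0 follows; the variable names tail, raw, dot and url are hypothetical, and it assumes a single URL per element that ends at whitespace or at the end of the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="mydata">
    <xsl:choose>
      <xsl:when test="contains(., 'http://')">
        <!-- Turn tabs and line breaks into spaces and append one more space,
             so substring-before() always finds a delimiter, even when the
             URL is the last thing in the text. -->
        <xsl:variable name="tail"
          select="concat(translate(substring-after(., 'http://'), '&#9;&#10;&#13;', '   '), ' ')"/>
        <xsl:variable name="raw" select="substring-before($tail, ' ')"/>
        <!-- 1 when the captured URL ends in a period, otherwise 0. -->
        <xsl:variable name="dot" select="number(substring($raw, string-length($raw)) = '.')"/>
        <xsl:variable name="url" select="substring($raw, 1, string-length($raw) - $dot)"/>
        <xsl:value-of select="substring-before(., 'http://')"/>
        <a href="http://{$url}"><xsl:value-of select="$url"/></a>
        <!-- Everything after the URL, including a stripped period. -->
        <xsl:value-of select="substring-after(substring-after(., 'http://'), $url)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Appending a space to the text avoids a separate branch for the URL-at-end case, and the remaining text is taken from the untouched original so its whitespace is preserved.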

Expert Comment

by:Geert Bormans
ID: 24357979
welcome

Expert Comment

by:jones1618
ID: 24359574
I agree Gertone's solution is a good start. However, AFAIK it doesn't handle your original case where some text precedes the URL in the element.

So, you might want to try the following which handles empty elements, elements with no URLs, elements with text before or after the URL, and elements with multiple links.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="MakingLinks-XSL1.xsl"?>
<xml>
  <mydata descrip="-empty-"      ></mydata>
  <mydata descrip="no link"      >nothing to see here</mydata>
  <mydata descrip="prefix + link">This goes to Google http://google.com</mydata>
  <mydata descrip="link only"    >http://yahoo.com</mydata>
  <mydata descrip="link + suffix">http://msn.com Microsoft</mydata>
  <mydata descrip="two links"    >Tweet http://twitter.com in your face http://facebook.com</mydata>
</xml>
 

<?xml version='1.0'?>
<!-- version="1.0": only XSLT 1.0 features are used, as MSXML 4.0 requires -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>
  <xsl:template match="xml">
    <html>
    <head>
      <title>Turn URLs into Links</title>
      <style type="text/css">
        body { font-family: Verdana, Times, Serif; }
      </style>
    </head>
    <body>
      <table border="1">
        <xsl:for-each select="mydata">
          <TR>
            <TD align="center"><b><xsl:value-of select="@descrip"/></b></TD>
            <TD>
              <xsl:call-template name="MakeLinks" >
                <xsl:with-param name="text"><xsl:value-of select="text()"/></xsl:with-param>
              </xsl:call-template>
            </TD>
          </TR>
        </xsl:for-each>
      </table>
    </body>
    </html>
  </xsl:template>

  <xsl:template name="MakeLinks">
    <xsl:param name="text" />
    <xsl:variable name="prefix"       select="substring-before($text, 'http://')" />
    <xsl:variable name="urlremaining" select="substring-after($text, 'http://')" />
    <!-- ==== OUTPUT EVERYTHING UP TO "HTTP" (IF ANY) ============== -->
    <xsl:value-of select="$prefix" />
    <xsl:choose>
      <!-- ===== NO URL : OUTPUT TEXT ============================== -->
      <!-- contains() rather than not($urlremaining), so text that ends in a
           bare "http://" is not written twice -->
      <xsl:when test="not(contains($text, 'http://'))">
        <xsl:value-of select="$text" />
      </xsl:when>
      <!-- ===== SPACE AFTER URL: OUTPUT LINK AND PARSE REMAINING == -->
      <xsl:when test="contains($urlremaining,' ')">
        <xsl:variable name="url" select="substring-before($urlremaining,' ')" />
        <xsl:variable name="remaining" select="substring-after($urlremaining,$url)" />
        <a href="http://{$url}"><xsl:value-of select="$url" /></a>
        <xsl:if test="$remaining">
          <xsl:call-template name="MakeLinks">
            <xsl:with-param name="text" select="$remaining" />
          </xsl:call-template>
        </xsl:if>
      </xsl:when>
      <!-- ===== URL BUT NO SPACE: OUTPUT REST AS LINK ============= -->
      <xsl:otherwise>
        <a href="http://{$urlremaining}"><xsl:value-of select="$urlremaining" /></a>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>


Expert Comment

by:Geert Bormans
ID: 24359770
Hi jones1618, welcome to EE

> However, AFAIK it doesn't handle your original case where some text precedes the URL in the element.

Oh yes, it does.
            <xsl:value-of select="substring-before(., 'http://')"/>

The only thing your solution adds is the recursion for multiple URLs, which I asked about; it was not a requirement.
My stylesheet worked on the original example, apart from the fact that the end-of-sentence "." becomes part of the URL, which is an error. Your solution has the same problem on the original example, by the way, and that is a lot harder to solve generically than multiple URLs.

If you allow me, I have a small remark on your XSLT:
            <xsl:call-template name="MakeLinks" >
                <xsl:with-param name="text"><xsl:value-of select="text()"/></xsl:with-param>
            </xsl:call-template>
would be better as
            <xsl:call-template name="MakeLinks" >
                <xsl:with-param name="text"><xsl:value-of select="."/></xsl:with-param>
            </xsl:call-template>
If there are multiple text nodes (and you can't rely on there being only one, especially given the cute things msxml does with white-space), text() would only pass the first, potentially a white-space-only node. It is a minor detail, but hard to debug.
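
The difference can be seen with a small test stylesheet. This is only an illustrative sketch: the input element content shown in the comment is a made-up example, not data from this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input: <mydata>Visit <b>my</b> site http://example.com</mydata>
     The b element splits the content of mydata into two text nodes. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="mydata">
    <!-- In XSLT 1.0, value-of uses only the first node of a node-set,
         so this writes "Visit " and the URL is lost. -->
    <xsl:text>text(): [</xsl:text>
    <xsl:value-of select="text()"/>
    <xsl:text>]&#10;</xsl:text>
    <!-- The string value of the element concatenates all descendant text,
         so this writes "Visit my site http://example.com". -->
    <xsl:text>.: [</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```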

Have fun on EE, we all do

Geert

Expert Comment

by:jones1618
ID: 24366841
Greatseats,

Handling trailing periods shouldn't be too hard. After getting the URL, just add an xsl:choose clause that truncates the URL with substring() if its last character is '.' and appends a '.' after the link. (XPath 1.0 has no ends-with() function, so test substring($url, string-length($url)) = '.' instead.) Otherwise output the whole URL as now.

Gertone,

Thanks for your suggestion about using "." instead of "text()". Good catch.

Clearly your XSL skills are deeper than mine, but in this case, when I tested your quick solution it fell down for two test cases: 1) where mydata contains text preceding a URL and 2) where mydata contains nothing but a URL. In general, your XSL only works if there's a space after the URL.

Try the attached XSL, which wraps your original xsl:choose statement in an HTML table to show the results. Just apply it to the six XML test cases attached to my previous comment. You'll see that it fails for two test cases. My XSL handles those cases (along with multiple links, which wasn't strictly required, as you said).
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="xml">
    <html>
    <head>
      <title>Turn URLs into Links</title>
      <style type="text/css">
        body { font-family: Verdana, Times, Serif; }
      </style>
    </head>
    <body>
      <table border="1">
        <xsl:for-each select="mydata">
          <TR>
            <TD align="center"><b><xsl:value-of select="@descrip"/></b></TD>
            <TD>
              <!-- ==== Gertone's original code ============================ -->
              <xsl:choose>
                <xsl:when test="contains(., 'http://')">
                  <xsl:value-of select="substring-before(., 'http://')"/>
                  <a href="http://{substring-before(substring-after(., 'http://'), ' ')}">
                    <xsl:value-of select="substring-before(substring-after(., 'http://'), ' ')"/>
                  </a>
                  <xsl:text> </xsl:text>
                  <xsl:value-of select="substring-after(substring-after(., 'http://'), ' ')"/>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:value-of select="."/>
                </xsl:otherwise>
              </xsl:choose>
              <!-- ========================================================== -->
            </TD>
          </TR>
        </xsl:for-each>
      </table>
    </body>
    </html>
  </xsl:template>
</xsl:stylesheet>


Expert Comment

by:Geert Bormans
ID: 24368074
yep, I know it has flaws, think I indicated that with my proposal, I appreciate your suggestions though

you are right that stripping the . is easy, but there is also : ; , ! ? etc.
developing a full-blown true parser for urls is difficult and not a straightforward task in XSLT1
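
A minimal sketch of how the wider punctuation set could still be stripped in XSLT 1.0 is a small recursive template. The template name TrimTrailingPunctuation and the character list are hypothetical, and this is nowhere near a full URL parser.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical helper: returns $url without any trailing punctuation. -->
  <xsl:template name="TrimTrailingPunctuation">
    <xsl:param name="url"/>
    <xsl:variable name="last" select="substring($url, string-length($url))"/>
    <xsl:choose>
      <!-- The $url test stops the recursion on an empty string, because
           contains() is true when its second argument is empty. -->
      <xsl:when test="$url and contains('.,;:!?', $last)">
        <xsl:call-template name="TrimTrailingPunctuation">
          <xsl:with-param name="url" select="substring($url, 1, string-length($url) - 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$url"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

A caller would capture the result in an xsl:variable, use it for the link, and write the stripped characters after the link with substring($raw, string-length($trimmed) + 1), where $raw and $trimmed stand for the untrimmed and trimmed URL.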

The good thing about this discussion is that it shows future readers of this thread that what I posted as "here is a first stab at it" is NOT a definitive answer to the generic question.

thanks

Expert Comment

by:jones1618
ID: 24369786
Attached is my final take on the more general problem. Features:

1. URLs can appear anywhere in the element (first, last or middle)
2. Elements may contain multiple URLs
3. Trailing .,?! characters are not included in URLs. So Google.com! becomes <Google.com>!
4. Elements can include other HTML elements (in Internet Explorer, not Firefox, unfortunately). This relies on disable-output-escaping, which Firefox does not support.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="MakingLinks-XSL1.xsl"?>
<xml>
  <mydata descrip="-empty-"      ></mydata>
  <mydata descrip="no link"      >nothing to see here</mydata>
  <mydata descrip="prefix + link">All my search are belong to http://google.com</mydata>
  <mydata descrip="link only"    >http://yahoo.com</mydata>
  <mydata descrip="link + suffix">http://msn.com? That's a Microsoft thing.</mydata>
  <mydata descrip="link + comma"><![CDATA[The <b>best</b> site is http://newsmap.jp, isn't it?]]></mydata>
  <mydata descrip="two links"    ><![CDATA[Sweet tweet http://twitter.com. <br />In your <b>face</b> http://facebook.com!]]></mydata>
</xml>
 

<?xml version='1.0'?>
<!-- version="1.0": only XSLT 1.0 features are used, as MSXML 4.0 requires -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="xml">
    <html>
    <head>
      <title>Turn URLs into Links</title>
      <style type="text/css">
        body { font-family: Verdana, Times, Serif; }
      </style>
    </head>
    <body>
      <table border="1">
        <xsl:for-each select="mydata">
          <TR>
            <TD align="center"><b><xsl:value-of select="@descrip"/></b></TD>
            <TD>
              <xsl:call-template name="ReplaceLinks" >
                <xsl:with-param name="text"><xsl:value-of select="."/></xsl:with-param>
              </xsl:call-template>
            </TD>
          </TR>
        </xsl:for-each>
      </table>
    </body>
    </html>
  </xsl:template>

  <xsl:template name="ReplaceLinks">
    <xsl:param name="text" />
    <xsl:variable name="prefix"       select="substring-before($text, 'http://')" />
    <xsl:variable name="urlremaining" select="substring-after($text, 'http://')" />
    <!-- ==== OUTPUT EVERYTHING UP TO "HTTP" (IF ANY) ============== -->
    <xsl:value-of select="$prefix" disable-output-escaping="yes"/>
    <xsl:choose>
      <!-- ===== NO URL : OUTPUT TEXT ============================== -->
      <!-- contains() rather than not($urlremaining), so text that ends in a
           bare "http://" is not written twice -->
      <xsl:when test="not(contains($text, 'http://'))">
        <xsl:value-of select="$text" disable-output-escaping="yes"/>
      </xsl:when>
      <!-- ===== SPACE AFTER URL: OUTPUT LINK AND PARSE REMAINING == -->
      <xsl:when test="contains($urlremaining,' ')">
        <xsl:variable name="url" select="substring-before($urlremaining,' ')" />
        <xsl:variable name="remaining" select="substring-after($urlremaining,$url)" />
        <xsl:call-template name="WriteLink" >
          <xsl:with-param name="url" select="$url" />
        </xsl:call-template>
        <xsl:if test="$remaining">
          <xsl:call-template name="ReplaceLinks">
            <xsl:with-param name="text" select="$remaining" />
          </xsl:call-template>
        </xsl:if>
      </xsl:when>
      <!-- ===== URL BUT NO SPACE: OUTPUT REST AS LINK ============= -->
      <xsl:otherwise>
        <xsl:call-template name="WriteLink" >
          <xsl:with-param name="url" select="$urlremaining" />
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="WriteLink">
    <xsl:param name="url" />
    <xsl:variable name="lastc" select="substring($url,string-length($url),1)" />
    <xsl:choose>
      <!-- One trailing . , ? or ! is written after the link, not inside it -->
      <xsl:when test="$lastc='.' or $lastc=',' or $lastc='?' or $lastc='!'">
        <xsl:variable name="trimmed" select="substring($url,1,string-length($url)-1)" />
        <a target="_blank" href="http://{$trimmed}"><xsl:value-of select="$trimmed" /></a><xsl:value-of select="$lastc" />
      </xsl:when>
      <xsl:otherwise>
        <a target="_blank" href="http://{$url}"><xsl:value-of select="$url" /></a>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
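
Feature 4 depends on CDATA plus disable-output-escaping, which is why it does not work in Firefox. A possible portable alternative, sketched here under the assumption that the inline markup can be stored as real child elements rather than CDATA, is to copy those elements through and link URLs one text node at a time. LinkText is a hypothetical, simplified stand-in for ReplaceLinks with no punctuation trimming.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed input, with inline markup as real elements instead of CDATA:
       <mydata>In your <b>face</b> http://facebook.com now</mydata> -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>

  <!-- Copy inline elements such as b or br through, then recurse into them. -->
  <xsl:template match="mydata//*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Link URLs in each text node separately, so no escaping tricks are needed. -->
  <xsl:template match="mydata//text()">
    <xsl:call-template name="LinkText">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Simplified linker: URLs end at a space or at the end of the text node. -->
  <xsl:template name="LinkText">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'http://')">
        <xsl:variable name="after" select="substring-after($text, 'http://')"/>
        <xsl:variable name="url" select="substring-before(concat($after, ' '), ' ')"/>
        <xsl:value-of select="substring-before($text, 'http://')"/>
        <a href="http://{$url}"><xsl:value-of select="$url"/></a>
        <!-- Recurse on whatever follows the URL to catch further links. -->
        <xsl:call-template name="LinkText">
          <xsl:with-param name="text" select="substring($after, string-length($url) + 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that the source XML must then be well-formed markup, so a line break has to be written as a closed br element rather than loose HTML inside CDATA.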
