XSLT transform

Hello experts

Via VBA, I'm pulling XML data from an RSS feed wich looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!--RSS généré par le Générateur RSS Microsoft SharePoint Foundation sur 26/01/2018 15:29:41 -->
<?xml-stylesheet type="text/xsl" href="/site/capsin/_layouts/15/RssXslt.aspx?List=e6440993-274f-494e-ae90-ff269711c914" version="1.0"?>
<rss version="2.0">
    <title>CAPS'IN : TdbTeamSite</title>
    <description>Flux RSS de la liste : TdbTeamSite.</description>
    <lastBuildDate>Fri, 26 Jan 2018 14:29:40 GMT</lastBuildDate>
    <generator>Générateur RSS Microsoft SharePoint Foundation</generator>
      <title>CAPS'IN, Portail Paiement du Groupe CA : TdbTeamSite</title>
      <title>P18017 - Sécurisation des applications SBP</title>
      <description><![CDATA[<div><b>Contact fonctionnel:</b> Jean-Marc.PARRY@overtheworld.com</div>
<div><b>Entité Fonctionnelle:</b> Anonymous Services</div>
<div><b>Direction fonctionnelle:</b> PROJET SI</div>
<div><b>Visibilité dans les annuaires:</b> private</div>
<div><b>WebApplicationScript:</b> https://www.anonymousserver.com</div>
<div><b>EtoileScript:</b> site</div>
<div><b>URLScript:</b> P18017</div>
<div><b>MasterPageScript:</b> No Masterpage</div>
<div><b>VolumeScript:</b> 250Mo</div>
<div><b>ModeleScript:</b> Site d&#39;équipe PROJET</div>
<div><b>CompatibilityLevel:</b> SP2013</div>
<div><b>EnvironnementScript:</b> Production</div>
<div><b>FeaturesToActivate:</b> Création de rapports, Flux de travail, Flux de travail à trois états</div>
<div><b>AdminsFonctionnelsScript:</b> ANONYMOUS_ADMIN</div>
<div><b>AdminsFonctionnelsEmailScript:</b> Admin@overtheworld.com</div>
<div><b>EmailGestionnaireScript:</b> Admin@overtheworld.com</div>
<div><b>VolumeProd:</b> 250Mo</div>
<div><b>URLProd:</b> https://www.anonymousserver.com/site/P18017</div>
<div><b>HistoriqueScript:</b> <div class="ExternalClassDAA4F1E103DE4130961DE580BFE6DDCC">
[25/01/2018 17&#58;01] https&#58;//www.anonymousserver.com/site/P18017 créé.</div></div>
<div><b>CréationProd:</b> 25/01/2018 17:01:13</div>
<div><b>FluxDeSiteID:</b> à_renseigner</div>
<div><b>Visiteurs Uniques Mensuels:</b> 0</div>
<div><b>Espace disque:</b> 0,014792274475098</div>
      <author>SEAN-PRESTATAIRE Manarak</author>
      <pubDate>Thu, 25 Jan 2018 15:39:51 GMT</pubDate>
      <guid isPermaLink="true">https://www.anonymousserver.com/DispForm.aspx?ID=724</guid>
</rss version="2.0">

Open in new window

note: There are serveral <item>....</item> element wich I don't display here for obvious reasons.
I'm interrested in extracting some datas and write them into a CSV file.
Data of interrest are the URLScript, URLProd and FluxDeSiteID tags within the CDATA block (they'll need to be translated to Title, Site and ID columns respectively).

I'm thinking about using an XSLT transform to achieve this goal, but I'm unsure if it is doable with a XSL stylesheet.

I'll appreciate any help or insight.
LVL 15
Fabrice LambertConsultingAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Gertone (Geert Bormans)Information ArchitectCommented:
I would highly recommend using XSLT for this.
What XSLT is and the benefits of it, for that you can read the wikipedia entry about it, which is closely monitored by the XSLT community for correctness (I recommend the English over the French entry)

Why I would use XSLT over other technologies to achieve this?
The rules based approach of XSLT allows you to walk the XSLT tree (without any effort) and at each level you can decide what to keep and what to throw away, simply by being selective in the xsl:apply templates and just by writing the correct rules (xsl:template attribute match) for processing what you need to be processed

I will show an example later today
Gertone (Geert Bormans)Information ArchitectCommented:
this will work

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     <xsl:template match="rss">
                    <xsl:apply-templates select="channel"></xsl:apply-templates>
    <xsl:template match="channel">
        <table border="1">
            <xsl:apply-templates select="item"></xsl:apply-templates>
    <xsl:template match="item">
            <xsl:apply-templates select="description"/>
    <xsl:template match="description">
        <xsl:call-template name="description-field-to-cell">
            <xsl:with-param name="fieldname" select="'URLScript'"/>
        <xsl:call-template name="description-field-to-cell">
            <xsl:with-param name="fieldname" select="'URLProd'"/>
        <xsl:call-template name="description-field-to-cell">
            <xsl:with-param name="fieldname" select="'FluxDeSiteID'"/>
    <xsl:template name="description-field-to-cell">
        <xsl:param name="fieldname"/>
            <xsl:value-of select="normalize-space(substring-before(substring-after(., concat($fieldname, ':&lt;/b&gt;')), '&lt;/div'))"/>

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Gertone (Geert Bormans)Information ArchitectCommented:
The mechanism is as such
- an xsl:template is a piece of code that does something with a piece of the XML. The match attribute is the rule that says when the template gets triggered.
example:    <xsl:template match="item"> is triggered every time the process flow hits an item element
the elegance is in the fact that loops are not needed

- apply-templates pushes some XML nodes to the templates for evaluation
<xsl:apply-templates select="item"/> basically says
from all the childnodes this context (channel in case of the example) has, throw the "item" nodes to the rules evaluation
(thus triggering the rule for "item"

all the tags that are not in the xsl: namespace are basically element constructors in the output tree

Your question hits a flaw of the XSLT1 available in VBA, there are no regluar expressions,
so pulling the fields out of the description CDATA is a bit clumsy
but since the description is pretty predictable in structure, the substring works well enough

I have put the description processing in a named template (note the xsl:call template which basically is a function call)

I hope this helps
Why Diversity in Tech Matters

Kesha Williams, certified professional and software developer, explores the imbalance of diversity in the world of technology -- especially when it comes to hiring women. She showcases ways she's making a difference through the Colors of STEM program.

Fabrice LambertConsultingAuthor Commented:
Thanks for your input, this is a baginning to the right direction, except it output in HTML format and I'm looking for CSV format.
Still it was usefull.

After many trials and errors I came up with this wich match my needs and is easyer to update in case of update from outputed columns:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:csv="csv:csv" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>
    <!-- Delimiter -->
  <xsl:variable name="delimiter" select="','"/>
    <!-- New line -->
  <xsl:variable name="xsCrLf" select="'&#xD;&#xA;'"/>
    <!-- -->
  <xsl:variable name="div" select="'&lt;div&gt;'"/>
  <xsl:variable name="b" select="'&lt;b&gt;'"/>
  <xsl:variable name="divEnd" select="'&lt;/div&gt;'"/>
  <xsl:variable name="bEnd" select="'&lt;/b&gt;'"/>
    <!-- Columns list  -->
    <column name="URLScript">Title</column>
    <column name="URLProd">Site</column>
    <column name="FluxDeSiteID">ID</column>
    <!-- output row titles-->
  <xsl:template match="rss">
    <xsl:call-template name="rowTitles"/>
    <xsl:value-of select="$xsCrLf"/>
    <xsl:apply-templates select="channel"/>
  <xsl:template match="channel">
    <xsl:apply-templates select="item"/>
  <xsl:template match="item">
    <xsl:apply-templates select="description"></xsl:apply-templates>
    <xsl:value-of select="$xsCrLf"/>
  <xsl:template match="description">
    <xsl:variable name="data" select="."/>
    <xsl:for-each select="document('')/*/csv:columns/*">
      <xsl:call-template name="description-field-to-cell">
        <xsl:with-param name="fieldName" select="./@name"/>
        <xsl:with-param name="data" select ="$data"/>
      <xsl:if test="position() != last()">
        <xsl:value-of select="$delimiter"/>

  <xsl:template name="rowTitles">
    <xsl:for-each select="document('')/*/csv:columns/*">
        <xsl:value-of select="."/>
          <xsl:if test="position() != last()">
            <xsl:value-of select="$delimiter"/>
  <xsl:template name="description-field-to-cell">
    <xsl:param name="fieldName"/>
    <xsl:param name="data"/>
      <xsl:variable name="itemStart" select="concat($div, $b, $fieldName, ':', $bEnd)"/>
      <xsl:variable name="itemEnd" select="$divEnd"/>
      <xsl:value-of select="normalize-space(substring-before(substring-after($data, $itemStart), $itemEnd))"/>

Open in new window

Gertone (Geert Bormans)Information ArchitectCommented:
Well, i did the html on purpose. Html is an easier format to explain the concepts without worrying about quotes inside fields etc for csv. I had the intention of showing you the path, not the intention for delivering a production ready xslt :-)

Will check your xslt later today
Gertone (Geert Bormans)Information ArchitectCommented:
Looks good

       <xsl:value-of select="."/>
          <xsl:if test="position() != last()">
            <xsl:value-of select="$delimiter"/>
usually one does it like this
          <xsl:if test="not(position() = 1)">
            <xsl:value-of select="$delimiter"/>
        <xsl:value-of select="."/>

As a test that is a lot cheaper

Note that your csv will break on a ',' in your fields
Making a good CSV takes some more than adding ','
if there is a ',' in your field data you need " around the field, but then you need to replace each present '"' with '""'
Fabrice LambertConsultingAuthor Commented:
Thank you.
Gertone (Geert Bormans)Information ArchitectCommented:
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.