Solved

XSLT to strip file extension, perhaps with regex?

Posted on 2014-12-19
Last Modified: 2014-12-19
I use the transform below to remove the '.pdf' file extension from processed files. I'd like a transform that removes the file extension regardless of file type, e.g., .doc, .docx, .txt, .ppt, .zip, .jpeg, etc. Can this be done with a regex, perhaps? I don't know how to do this. Can someone show me how to adjust the code to handle any file extension?

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
   
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="replace( ., '.pdf', '' )"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Thanks.
Question by:GessWurker
 

Expert Comment

by:Geert Bormans
ID: 40509345
Note that translate() does not remove the string '.pdf'; it removes every '.', 'p', 'd' and 'f' character individually,

so the filename 'pdf.pdf' will be erased completely.

replace() with a regex is the best answer, but your stylesheet is XSLT 1.0 and those are XSLT 2.0 features.
What is your XSLT processor? Can you use XSLT 2.0?

<xsl:value-of select="substring-before(., '.')"/>

is the easiest cut, but it only works correctly if there is exactly one '.' in the file name:
it will return 'foo' if the filename is 'foo.bar.pdf'.

If you need a more robust solution in XSLT 1.0, you will need recursion,
but first tell me whether you can use XSLT 2.0.
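
For readers stuck on an XSLT 1.0 processor, the recursion mentioned above could look roughly like the sketch below. The template names strip-extension and before-last-dot are made up for this illustration; the identity template and the idoc:ItemName match are taken from the question. It keeps everything before the last '.', so 'foo.bar.pdf' becomes 'foo.bar', and a name with no '.' is left unchanged.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity template: copy everything unchanged by default -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Non-empty ItemName: copy the element, but strip the extension from its text -->
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:call-template name="strip-extension">
                <xsl:with-param name="name" select="string(.)"/>
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>

    <!-- Hypothetical helper: returns $name unchanged when it contains no '.' -->
    <xsl:template name="strip-extension">
        <xsl:param name="name"/>
        <xsl:choose>
            <xsl:when test="not(contains($name, '.'))">
                <xsl:value-of select="$name"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:call-template name="before-last-dot">
                    <xsl:with-param name="rest" select="$name"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Hypothetical helper: $rest must contain at least one '.'.
         Emits one dot-separated segment per call and recurses while
         another '.' remains, so the final segment is never emitted. -->
    <xsl:template name="before-last-dot">
        <xsl:param name="rest"/>
        <xsl:variable name="tail" select="substring-after($rest, '.')"/>
        <xsl:value-of select="substring-before($rest, '.')"/>
        <xsl:if test="contains($tail, '.')">
            <xsl:text>.</xsl:text>
            <xsl:call-template name="before-last-dot">
                <xsl:with-param name="rest" select="$tail"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

Each call handles one segment of the name, so the recursion depth equals the number of dots, which should be small for ordinary file names.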

Author Comment

by:GessWurker
ID: 40509399
Yep. Using Saxon XSLT 2.0 processor

Accepted Solution

by:Geert Bormans (earned 500 total points)
ID: 40509573
<xsl:value-of select="replace( ., '\.[^\.]+$', '')"/>

Make sure you set the version attribute on xsl:stylesheet to 2.0.
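
Put together with the stylesheet from the question, the accepted answer would look something like the sketch below. Only the version attribute and the replace() call differ from the original; this assumes an XSLT 2.0 processor such as the Saxon one mentioned above.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity template: copy everything unchanged by default -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Non-empty ItemName: remove the last '.' and everything after it.
         \.      a literal dot
         [^\.]+  one or more characters that are not dots
         $       anchored at the end of the string -->
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="replace(., '\.[^\.]+$', '')"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With this pattern 'foo.bar.pdf' becomes 'foo.bar' and a name without any '.' is left as it is. One edge case to be aware of: a name that consists only of a leading dot plus text, such as '.htaccess', matches in full and would come out empty.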

Author Comment

by:GessWurker
ID: 40509601
Thanks, Geert. Your suggestion is sufficiently robust for my application. Now I'll post a separate (but related) question.

Author Comment

by:GessWurker
ID: 40509603
Ah... I missed the regex part. Even better! And now I'll post another related question...