Solved

XSLT to strip file extension, perhaps with regex?

Posted on 2014-12-19
Last Modified: 2014-12-19
I use the transform below to remove the '.pdf' file extension from processed files. I'd like to come up with a transform that removes the file extension regardless of file type, e.g., .doc, .docx, .txt, .ppt, .zip, .jpeg, etc. Can this be done via regex, perhaps? I don't know how to do this. Can someone show me how to adjust the code to deal with any file extension?

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
   
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="replace( ., '.pdf', '' )"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Thanks.
Question by:GessWurker
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 40509345
Note that translate() does not remove the string ".pdf"; it removes every '.', 'p', 'd' and 'f' character,

so the filename 'pdf.pdf' will be erased completely.

replace() with a regex is the best answer, but your stylesheet is XSLT 1.0 and those are XSLT 2.0 features.
What is your XSLT processor? Can you use XSLT 2.0?

<xsl:value-of select="substring-before(., '.')"/>

is the easiest cut, but it only works correctly if there is exactly one '.' in the file name:
it will return 'foo' if the filename is 'foo.bar.pdf'.

If you need a more solid solution in XSLT 1.0, you will need recursion,
but first tell me if you can use XSLT 2.0.
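
For anyone stuck on XSLT 1.0, the recursion mentioned above might look roughly like the sketch below. The template name strip-extension and its name parameter are made up for illustration; they are not part of the original stylesheet. The idea is to emit each '.'-delimited segment and recurse until only the last segment (the extension) is left, which is then dropped.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <!-- Identity template: copy everything unchanged by default -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Hypothetical helper: outputs $name without its last '.'-delimited part -->
    <xsl:template name="strip-extension">
        <xsl:param name="name"/>
        <xsl:choose>
            <!-- More than one dot left: emit the leading segment plus its dot, then recurse on the rest -->
            <xsl:when test="contains(substring-after($name, '.'), '.')">
                <xsl:value-of select="concat(substring-before($name, '.'), '.')"/>
                <xsl:call-template name="strip-extension">
                    <xsl:with-param name="name" select="substring-after($name, '.')"/>
                </xsl:call-template>
            </xsl:when>
            <!-- Exactly one dot left: keep only what precedes it -->
            <xsl:when test="contains($name, '.')">
                <xsl:value-of select="substring-before($name, '.')"/>
            </xsl:when>
            <!-- No dot at all: nothing to strip -->
            <xsl:otherwise>
                <xsl:value-of select="$name"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <!-- Apply the helper to non-empty ItemName elements -->
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:call-template name="strip-extension">
                <xsl:with-param name="name" select="."/>
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With this approach 'foo.bar.pdf' becomes 'foo.bar' rather than 'foo'. Edge cases such as a trailing dot ('foo.') or a leading dot ('.htaccess') may not behave the way you want, so check them against your data.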
 

Author Comment

by:GessWurker
ID: 40509399
Yep. Using Saxon XSLT 2.0 processor
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 500 total points
ID: 40509573
<xsl:value-of select="replace( ., '\.[^\.]+$', '')"/>

Make sure you set the version attribute on xsl:stylesheet to 2.0.
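
Put together with the stylesheet from the question, the result might look like the sketch below. It is assembled from the pieces in this thread and assumes an XSLT 2.0 processor such as Saxon. The pattern '\.[^\.]+$' matches a literal dot followed by one or more non-dot characters at the end of the string, so only the last extension is removed. Note that the dot has to be escaped: in a regex, an unescaped '.' (as in '.pdf') matches any character, not just a literal dot.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:idoc="http://ns.inmagic.com/Presto/1.0/ContentConnector/DocumentParameters">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity template: copy everything unchanged by default -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- For non-empty ItemName elements, drop the final '.ext' suffix -->
    <xsl:template match="idoc:ItemName[normalize-space(.)]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="replace(., '\.[^\.]+$', '')"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

A name without any dot is left unchanged, since the pattern simply does not match.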
 

Author Comment

by:GessWurker
ID: 40509601
Thanks, Geert. Your suggestion is sufficiently robust for my application. Now I'll post a separate (but related) question.
 

Author Comment

by:GessWurker
ID: 40509603
Ah... I missed the regex part. Even better! And now I'll post another related question...